
Preventing runaway OpenAI / Anthropic bills — spend caps + rate limits + monitoring

In one documented case, attackers used a victim's key to run Claude Opus for 4.5 days and left a roughly $50K bill. Spend caps, rate limits, and monitoring are existential.

A leaked AI key with no spend cap means a $50K+ bill in days. Caps, rate limits, and monitoring work together as defense-in-depth.

What it is

AI provider keys (OpenAI, Anthropic) carry per-call costs that compound fast. In LLMjacking attacks, a stolen key is scripted or resold for inference, and the meter runs until a spend cap trips — or, if there is no cap, until someone notices the bill.
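To make the compounding concrete, here is a back-of-envelope burn-rate sketch. The price, request rate, and token count are illustrative assumptions, not current list prices:

// Back-of-envelope burn rate for a leaked key (every number here is an assumption)
const pricePerMillionOutputTokens = 15; // USD — assumed frontier-model output pricing
const requestsPerSecond = 10;           // sustainable with a trivial script
const outputTokensPerRequest = 1_000;

const tokensPerHour = requestsPerSecond * outputTokensPerRequest * 3_600;
const usdPerHour = (tokensPerHour / 1_000_000) * pricePerMillionOutputTokens;
const usdPerDay = usdPerHour * 24;

console.log({ usdPerHour, usdPerDay }); // ≈ $540/hour, ≈ $13K/day — ~$50K in under 4 days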

Vulnerable example

// No rate limit, no spend cap — any caller can burn tokens as fast as they can send requests
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

export async function POST(req: Request) {
  const { prompt } = await req.json();
  const r = await openai.chat.completions.create({ model: "gpt-5", messages: [{ role: "user", content: prompt }] });
  return Response.json(r);
}

Fixed example

import OpenAI from "openai";
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

const openai = new OpenAI();
const redis = Redis.fromEnv();
// 10 requests per minute per caller, sliding window
const limiter = new Ratelimit({ redis, limiter: Ratelimit.slidingWindow(10, "1 m") });

export async function POST(req: Request) {
  const ip = req.headers.get("x-forwarded-for") ?? "anon";
  const { success } = await limiter.limit(ip);
  if (!success) return new Response("rate-limit", { status: 429 });
  // Plus per-user spend tracking + provider-side cap as backstop
  const { prompt } = await req.json();
  const r = await openai.chat.completions.create({ model: "gpt-5-nano" /* default to cheaper model */, messages: [{ role: "user", content: prompt }] });
  return Response.json(r);
}
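The comment above glosses over per-user spend tracking. Here is a minimal sketch of what that could look like, assuming an Upstash Redis counter keyed per user per day; the key scheme, the $5 budget, and the cost-estimation step are illustrative assumptions, not a prescribed design:

import { Redis } from "@upstash/redis";

const redis = Redis.fromEnv();
const DAILY_BUDGET_USD = 5; // illustrative per-user cap

// Check before the call: is this user still under budget today?
export async function underBudget(userId: string): Promise<boolean> {
  const key = `spend:${userId}:${new Date().toISOString().slice(0, 10)}`;
  const spent = Number((await redis.get(key)) ?? 0);
  return spent < DAILY_BUDGET_USD;
}

// Record after the call: estimate the cost of the request and add it to today's counter.
export async function recordSpend(userId: string, estimatedUsd: number) {
  const key = `spend:${userId}:${new Date().toISOString().slice(0, 10)}`;
  await redis.incrbyfloat(key, estimatedUsd);
  await redis.expire(key, 60 * 60 * 48); // keep counters for ~2 days
}

In the route handler, call underBudget before the completion call and recordSpend after it, estimating cost from the token counts in the response's usage field.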

How Securie catches it

Securie finding · medium
apps/web/app/api/route.ts:22

Preventing runaway OpenAI / Anthropic bills

The cost-firewall crate enforces per-tenant and per-feature spend caps; the rate-limit specialist flags missing rate limits on paid-API routes; the secret_scanner catches leaked keys.

Suggested fix — ready as a PR (the fixed example above)
Catch this in my repo → Securie scans every PR · ships the fix as a one-click merge · free during early access

Checklist

  • Per-IP + per-user rate limits
  • Vendor-side spend cap (OpenAI Limits page)
  • Cheapest-model default
  • Spend monitoring alerts (see the sketch after this list)
  • Securie cost-firewall enabled
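
For the monitoring item, a minimal sketch of a scheduled spend alert: it assumes you maintain your own daily spend counter (for example, an aggregate alongside the per-user counters above) and a Slack incoming webhook in SLACK_WEBHOOK_URL — the key name and the $100 threshold are illustrative assumptions:

import { Redis } from "@upstash/redis";

const redis = Redis.fromEnv();
const DAILY_ALERT_USD = 100; // illustrative threshold

// Run on a schedule (e.g. a cron job); posts to Slack once today's tracked spend crosses the threshold.
export async function alertIfOverThreshold() {
  const key = `spend:total:${new Date().toISOString().slice(0, 10)}`; // assumed aggregate counter
  const spent = Number((await redis.get(key)) ?? 0);
  if (spent < DAILY_ALERT_USD) return;
  await fetch(process.env.SLACK_WEBHOOK_URL!, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ text: `AI spend today: $${spent.toFixed(2)} (alert threshold $${DAILY_ALERT_USD})` }),
  });
}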

FAQ

What's a reasonable per-user limit?

Depends on the use case. Start low (e.g., 10 requests/min per user) and raise on demand.
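The fixed example above keys the limiter on IP; for authenticated traffic, "per user" usually means keying on the user ID instead. A minimal sketch, assuming Upstash and a getUserId helper you would supply (hypothetical):

import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

const redis = Redis.fromEnv();
// 10 requests per minute per user — the starting point suggested above
const perUser = new Ratelimit({
  redis,
  limiter: Ratelimit.slidingWindow(10, "1 m"),
  prefix: "rl:user", // keep user keys separate from any per-IP limiter
});

// getUserId is a hypothetical helper that resolves the authenticated user from the request.
export async function allowRequest(req: Request, getUserId: (req: Request) => Promise<string>): Promise<boolean> {
  const { success } = await perUser.limit(await getUserId(req));
  return success;
}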

Related guides