Updated 2026-04-30 · 6 min read

Preventing runaway OpenAI / Anthropic bills — spend caps + rate limits + monitoring

In one documented case, a victim's Claude Opus usage ran for 4.5 days and cost $50K. Spend caps, rate limits, and monitoring are existential.

A leaked AI key with no spend cap can mean a $50K+ bill in days. Use caps, rate limits, and monitoring together as defense in depth.

What it is

AI provider keys (OpenAI, Anthropic) are billed per call, and those costs compound quickly. LLMjacking attackers who get hold of a key extract value by running inference at the key owner's expense until a cap trips.
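To make the compounding concrete, the sketch below turns the figures from the documented case ($50K over 4.5 days) into an average burn rate. The function name `burnRate` is illustrative, and the calculation assumes spend was spread evenly over the period, which real incidents rarely are.

```typescript
// Average burn rate for a total spend over a period.
// Assumption: spend is spread evenly across the period (a simplification).
function burnRate(totalUsd: number, days: number): { perDay: number; perHour: number } {
  const perDay = totalUsd / days;
  const perHour = perDay / 24;
  return { perDay, perHour };
}

const { perDay, perHour } = burnRate(50_000, 4.5);
// Roughly $11,111 per day, or about $463 per hour.
console.log(perDay.toFixed(0), perHour.toFixed(0));
```

At that pace, a weekly billing review would arrive too late; the alerting window needs to be measured in hours.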

Vulnerable example

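import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

// No rate limit, no spend cap: any caller can trigger unlimited paid calls
export async function POST(req: Request) {
  const { prompt } = await req.json();
  const r = await openai.chat.completions.create({ model: "gpt-5", messages: [{ role: "user", content: prompt }] });
  return Response.json(r);
}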

Fixed example

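import OpenAI from "openai";
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
// Redis.fromEnv() reads UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN
const limiter = new Ratelimit({ redis: Redis.fromEnv(), limiter: Ratelimit.slidingWindow(10, "1 m") });

export async function POST(req: Request) {
  // x-forwarded-for may hold a comma-separated chain; use the first entry
  const ip = req.headers.get("x-forwarded-for")?.split(",")[0].trim() ?? "anon";
  const { success } = await limiter.limit(ip);
  if (!success) return new Response("rate-limit", { status: 429 });
  // Plus per-user spend tracking + provider-side cap as backstop
  const { prompt } = await req.json();
  const r = await openai.chat.completions.create({ model: "gpt-5-nano" /* default to cheaper model */, messages: [{ role: "user", content: prompt }] });
  return Response.json(r);
}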
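The fixed example mentions per-user spend tracking only in a comment. One way it might look is sketched below, assuming a hypothetical `SpendTracker` class with a daily cap per user. Everything here is illustrative: the class, its method names, and the in-memory `Map` are assumptions, and a real deployment would keep the counters in a shared store so that every server instance sees the same totals.

```typescript
// Hypothetical in-memory per-user spend tracker with a daily cap.
// Assumption: costs are estimated by the caller (e.g. from token counts).
class SpendTracker {
  private spent = new Map<string, number>();
  private dailyCapUsd: number;

  constructor(dailyCapUsd: number) {
    this.dailyCapUsd = dailyCapUsd;
  }

  // Key usage by user and UTC day so each day starts from zero.
  private key(userId: string, now: Date): string {
    return `${userId}:${now.toISOString().slice(0, 10)}`;
  }

  // True if the user can still afford a call with the given estimated cost.
  canSpend(userId: string, estimatedUsd: number, now: Date = new Date()): boolean {
    const used = this.spent.get(this.key(userId, now)) ?? 0;
    return used + estimatedUsd <= this.dailyCapUsd;
  }

  // Record the actual cost once the provider call has returned.
  record(userId: string, actualUsd: number, now: Date = new Date()): void {
    const k = this.key(userId, now);
    this.spent.set(k, (this.spent.get(k) ?? 0) + actualUsd);
  }
}
```

In a route handler, `canSpend` would be checked before the provider call and `record` called afterwards. Old day keys are never pruned in this sketch, so a long-running process would need some form of expiry.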

How Securie catches it

Securie finding · medium

apps/web/app/api/route.ts:22

Preventing runaway OpenAI / Anthropic bills

The cost-firewall crate enforces per-tenant and per-feature spend caps, the rate-limit specialist catches missing rate limits on paid-API routes, and secret_scanner catches leaked keys.

Suggested fix — ready as a PR

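import OpenAI from "openai";
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment
// Redis.fromEnv() reads UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN
const limiter = new Ratelimit({ redis: Redis.fromEnv(), limiter: Ratelimit.slidingWindow(10, "1 m") });

export async function POST(req: Request) {
  // x-forwarded-for may hold a comma-separated chain; use the first entry
  const ip = req.headers.get("x-forwarded-for")?.split(",")[0].trim() ?? "anon";
  const { success } = await limiter.limit(ip);
  if (!success) return new Response("rate-limit", { status: 429 });
  // Plus per-user spend tracking + provider-side cap as backstop
  const { prompt } = await req.json();
  const r = await openai.chat.completions.create({ model: "gpt-5-nano" /* default to cheaper model */, messages: [{ role: "user", content: prompt }] });
  return Response.json(r);
}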


Checklist

Per-IP + per-user rate limits
Vendor-side spend cap (OpenAI Limits page)
Cheapest-model default
Spend monitoring alerts
Securie cost-firewall enabled
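For the spend monitoring item, a minimal sketch of threshold-based alerting follows. It assumes something already polls current spend periodically; the function name `crossedThresholds` and the 50% / 80% / 100% defaults are illustrative choices, not values taken from any provider.

```typescript
// Hypothetical alert helper: returns the budget fractions newly crossed
// when spend moves from `previousUsd` to `currentUsd`.
function crossedThresholds(
  previousUsd: number,
  currentUsd: number,
  budgetUsd: number,
  fractions: number[] = [0.5, 0.8, 1.0],
): number[] {
  return fractions.filter(
    (f) => previousUsd < f * budgetUsd && currentUsd >= f * budgetUsd,
  );
}

// Example: a $100 budget where spend jumps from $45 to $85 between polls.
for (const f of crossedThresholds(45, 85, 100)) {
  console.log(`ALERT: spend passed ${Math.round(f * 100)}% of budget`);
}
```

Comparing against the previous reading means each threshold fires once rather than on every poll after it is crossed.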

FAQ

What's a reasonable per-user limit?

It depends on the use case. Start low (10 requests per minute per user) and raise the limit on demand.
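"Start low and raise on demand" can be sketched as a limiter with a low default and per-user overrides. The class below is a hypothetical in-memory sliding-window log, written for illustration; it is not the algorithm or API of @upstash/ratelimit, and the names `SlidingWindowLimiter`, `setLimit`, and `allow` are assumptions.

```typescript
// Hypothetical in-memory sliding-window limiter with per-user overrides.
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();
  private overrides = new Map<string, number>();
  private defaultLimit: number;
  private windowMs: number;

  constructor(defaultLimit: number, windowMs: number) {
    this.defaultLimit = defaultLimit;
    this.windowMs = windowMs;
  }

  // Raise (or lower) the limit for a single user on demand.
  setLimit(userId: string, limit: number): void {
    this.overrides.set(userId, limit);
  }

  // Returns true and records the request if the user is under their limit.
  allow(userId: string, nowMs: number = Date.now()): boolean {
    const limit = this.overrides.get(userId) ?? this.defaultLimit;
    // Keep only timestamps that are still inside the window.
    const recent = (this.hits.get(userId) ?? []).filter((t) => nowMs - t < this.windowMs);
    const ok = recent.length < limit;
    if (ok) recent.push(nowMs);
    this.hits.set(userId, recent);
    return ok;
  }
}

// Default of 10 requests per minute; one trusted user raised to 60.
const limiter = new SlidingWindowLimiter(10, 60_000);
limiter.setLimit("trusted-user", 60);
```

Rejected requests are not recorded here, so a client hammering the endpoint does not extend its own lockout; whether that is desirable depends on the use case.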

