Rate-limiting paid-API routes — Upstash, Cloudflare, edge-native
Every route that calls OpenAI, Stripe, Anthropic, or any other paid vendor needs both per-IP and per-user rate limits. Edge-native limiting is usually the best fit for vibe-coded apps.
Paid-API routes are the highest-cost-per-attack surface: each abusive request costs real money. Rate-limit at the edge so the attack is stopped before it reaches inference.
What it is
Rate limiting bounds requests per identity per time window. Edge-native (Cloudflare / Vercel Edge / Upstash) puts the limit before the expensive inference call.
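To make "requests per identity per time window" concrete, here is a minimal sketch of a sliding-window log kept in process memory. The class name `SlidingWindowLimiter` and its `allow` method are hypothetical, invented for illustration. It is not how Upstash or Cloudflare implement their limiters, and because the state lives in one process it would not hold up across serverless instances.

```typescript
// Illustrative in-memory sliding-window log (hypothetical names, not a library API).
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed, false if the identity is over its limit.
  allow(identity: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Keep only the timestamps that still fall inside the window.
    const recent = (this.hits.get(identity) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.hits.set(identity, recent);
      return false; // rejected requests are not counted against the window
    }
    recent.push(now);
    this.hits.set(identity, recent);
    return true;
  }
}
```

Each identity (an IP, a user ID) gets its own window, so one noisy client cannot exhaust another client's allowance.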
Vulnerable example
// /api/chat/route.ts — no rate limit
export async function POST(req: Request) { /* expensive openai call */ }
Fixed example
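import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

// Created once at module scope so every request shares the same limiter.
const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(), // reads UPSTASH_REDIS_REST_URL and UPSTASH_REDIS_REST_TOKEN
  limiter: Ratelimit.slidingWindow(20, "1 m"), // 20/min per IP
});

export async function POST(req: Request) {
  // x-forwarded-for may hold a comma-separated proxy chain; the first entry is the client.
  const ip = req.headers.get("x-forwarded-for")?.split(",")[0]?.trim() || "anon";
  const { success } = await ratelimit.limit(ip);
  if (!success) return new Response("Too many requests", { status: 429 });
  // continue to expensive call
}
How Securie catches it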
apps/web/app/api/route.ts:22: Rate-limiting paid-API routes
The static rules and the AuthAuthz specialist flag paid-API routes that have no rate limit at PR time.
Checklist
- Per-IP rate limit at edge
- Per-user rate limit (more restrictive)
- Per-tenant spend cap (cost-firewall)
- Vendor-side cap as backstop
- Monitoring alerts
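The first three checklist items can be layered in a single guard, sketched below under some assumptions. `Limiter`, `TenantBudget`, `Verdict`, and `checkRequest` are hypothetical names made up for this example. The `Limiter` interface is written so that an `@upstash/ratelimit` `Ratelimit` instance could be passed in, and the in-memory budget store stands in for whatever Redis or database counter a real app would use.

```typescript
// Hypothetical shape: anything exposing limit(id) => Promise<{ success }>.
interface Limiter {
  limit(identifier: string): Promise<{ success: boolean }>;
}

// Hypothetical per-tenant spend tracker; in-memory only for illustration.
class TenantBudget {
  private spentCents = new Map<string, number>();
  private capCents: number;

  constructor(capCents: number) {
    this.capCents = capCents;
  }

  // True while the tenant's recorded spend is still under the cap.
  hasBudget(tenant: string): boolean {
    return (this.spentCents.get(tenant) ?? 0) < this.capCents;
  }

  // Call after each paid request with its actual cost.
  record(tenant: string, cents: number): void {
    this.spentCents.set(tenant, (this.spentCents.get(tenant) ?? 0) + cents);
  }
}

type Verdict = "ok" | "ip_limited" | "user_limited" | "budget_exhausted";

async function checkRequest(
  ctx: { ip: string; userId: string | null; tenant: string },
  perIp: Limiter,
  perUser: Limiter,
  budget: TenantBudget,
): Promise<Verdict> {
  // Cheapest, broadest check first: the per-IP limit.
  if (!(await perIp.limit(`ip:${ctx.ip}`)).success) return "ip_limited";
  // Signed-in callers also pass through the per-user limit.
  if (ctx.userId !== null && !(await perUser.limit(`user:${ctx.userId}`)).success) {
    return "user_limited";
  }
  // Finally the spend cap, which holds even when request rates look normal.
  if (!budget.hasBudget(ctx.tenant)) return "budget_exhausted";
  return "ok";
}
```

A route handler would return 429 for the two rate-limit verdicts and refuse the paid call on `budget_exhausted`. The vendor-side cap and monitoring alerts sit outside application code and are not shown.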
FAQ
Cloudflare Workers vs Upstash?
Both work. Cloudflare for full edge; Upstash for portability across runtimes.
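For comparison with the Upstash version, here is a rough sketch of the same check on Cloudflare Workers. It assumes a rate limiting binding that exposes `limit({ key })` and resolves to `{ success }`. The binding name `RATE_LIMITER`, the local `RateLimitBinding` and `Env` types, and the `handleChat` function are hypothetical and written out here so the example is self-contained. The actual limit and period would be set in the Worker's configuration, which is not shown.

```typescript
// Assumed shape of a Cloudflare Workers rate limiting binding.
// RATE_LIMITER is a hypothetical binding name declared in the Worker's config.
interface RateLimitBinding {
  limit(options: { key: string }): Promise<{ success: boolean }>;
}

interface Env {
  RATE_LIMITER: RateLimitBinding;
}

async function handleChat(request: Request, env: Env): Promise<Response> {
  // Cloudflare sets CF-Connecting-IP to the client address.
  const ip = request.headers.get("cf-connecting-ip") ?? "anon";
  const { success } = await env.RATE_LIMITER.limit({ key: ip });
  if (!success) return new Response("Too many requests", { status: 429 });
  // continue to expensive call (placeholder response for this sketch)
  return new Response("ok");
}

// Module Worker entry point; a deployed Worker would `export default worker`.
const worker = { fetch: handleChat };
```

The handler has the same shape as the Upstash route: derive an identity, ask the limiter, return 429 on failure. The practical difference is where the counter lives.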
Related guides
Unlimited API endpoints are how $150K OpenAI bills happen. Here is how to add proper rate limiting to a Next.js app using Vercel Edge Middleware, Upstash, or your existing Redis.
Documented Claude Opus victim case ran 4.5 days at $50K. Spend caps + rate limits + monitoring are existential.
Row-Level-Security bypass is the most common data leak in vibe-coded apps. Here is exactly how it happens, how attackers find it, and how to fix it in Next.js + Supabase with one policy update.
BOLA is the top item on the OWASP API Security Top 10 for a reason — every AI coding assistant introduces it by default. Learn what it looks like in Next.js, how to exploit it, and how to fix it.