
How to handle your first traffic spike (without your AI-built app falling over)


You shipped. Your launch tweet went viral. You hit Hacker News front page. A YouTuber featured your demo. Whatever the reason: you're getting 50,000 visitors in an hour and your app is breaking under the load.

This is the playbook. What fails first, what to fix in the moment, what to set up before the next spike.

TL;DR — the failure cascade

In order, this is what breaks under traffic:

1. Database connections — your Postgres connection pool exhausts
2. LLM API rate limits — OpenAI / Anthropic refuse to serve you
3. Bandwidth or function-invocation quotas — Vercel throws 429s
4. DNS or CDN at the edge — usually NOT — Vercel / Cloudflare absorb this
5. Your auth provider — Clerk / Supabase Auth rate-limit you

The fix order during the spike:

1. Stop new signups if signup is overwhelmed (temporary "we're at capacity" page)
2. Cache aggressively — set Cache-Control headers on anything that can be cached
3. Disable expensive features that are not load-bearing for the value prop
4. Talk to your users — a tweet saying "we're handling 50K visitors / hour, may be slow / error, sorry, fixing" is dramatically better than silence
5. Scale up after the spike — the time pressure is over by then; do the real work calmly

What fails first — your database

The most common failure under spike traffic is database connection exhaustion.

Postgres has a hard cap on concurrent connections (Supabase Pro: 60 by default; raise to 200 in the dashboard). Each Vercel function invocation can open a new connection. A spike of 1,000 concurrent function invocations tries to open 1,000 connections; the database refuses everything past the 60th; users see 500 errors.

The fix is connection pooling. Supabase ships with PgBouncer; configure your connection string to point at the pooler endpoint, not direct.

```bash
# WRONG — direct connection, exhausts at 60
DATABASE_URL=postgresql://...:5432/postgres

# RIGHT — pooler endpoint, supports 1000s of concurrent connections
DATABASE_URL=postgresql://...:6543/postgres
```

If you're already on the pooler and STILL exhausting: lower per-function connection use (one connection per function invocation, not one per query), batch reads where possible, cache aggressively at the edge.
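
Reusing a single client per function instance is the main lever. A minimal sketch with the `pg` driver, assuming a Node runtime on Vercel; the `getUser` query and `users` table are placeholders:

```ts
import { Pool } from "pg";

// Module scope survives across warm invocations, so the pool is created
// once per function instance instead of once per request.
const pool = new Pool({
  connectionString: process.env.DATABASE_URL, // pooler endpoint, port 6543
  max: 1, // one connection per instance; PgBouncer multiplexes behind it
});

export async function getUser(id: string) {
  // Every query on this instance shares the same connection.
  const { rows } = await pool.query(
    "SELECT id, email FROM users WHERE id = $1",
    [id],
  );
  return rows[0];
}
```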

What fails second — LLM API rate limits

If your app calls OpenAI / Anthropic / Google for any user-facing feature, the spike will hit your provider's rate limit.

  • OpenAI Tier 1 (default for new accounts): 500 requests / minute on most models
  • Anthropic Tier 1: 50 requests / minute on Sonnet
  • Google Gemini: varies, typically 60 requests / minute

A spike of 50K visitors over an hour, where each user makes 3-5 LLM requests, works out to roughly 2,500-4,000 requests per minute: 5-8x OpenAI's Tier 1 cap and 50x Anthropic's.

The fixes:

  • Cache aggressively. Same prompt → same response = serve from cache. Most chatbot SaaS gets a 30-60% cache hit rate at modest implementation effort. KV-store cache (Vercel KV, Cloudflare KV, Redis) keyed on the prompt hash; see the sketch after this list.
  • Queue + degrade. When you hit rate limits, queue the request + show a "we're at capacity, your response will be ready in N seconds" UX. Better than 500 errors.
  • Move to higher rate-limit tiers BEFORE the spike. OpenAI's tier increases require both spend history AND time; if you're new and shipping, you cannot get to Tier 5 the day of your viral moment.
  • Switch to cheaper / faster providers for non-critical paths. DeepSeek / GLM are cheaper AND have higher rate limits than OpenAI in many cases. The cost-firewall + multi-provider routing pattern saves you here.
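
Here is what the prompt-hash cache can look like. A hedged sketch, assuming a Redis-compatible store via ioredis and the official OpenAI SDK; the model name and TTL are illustrative:

```ts
import { createHash } from "node:crypto";
import Redis from "ioredis";
import OpenAI from "openai";

const kv = new Redis(process.env.REDIS_URL!);
const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

export async function cachedCompletion(prompt: string): Promise<string> {
  // Identical prompts hash to the same key, so repeat traffic never
  // touches the LLM provider.
  const key = "llm:" + createHash("sha256").update(prompt).digest("hex");

  const hit = await kv.get(key);
  if (hit !== null) return hit;

  const res = await openai.chat.completions.create({
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: prompt }],
  });
  const text = res.choices[0].message.content ?? "";

  await kv.set(key, text, "EX", 3600); // cache for an hour; tune to taste
  return text;
}
```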

What fails third — your platform's quotas

Vercel Hobby caps function invocations at 100K/month. A real spike can blow through that in 4 hours.

Other quota landmines:

  • Vercel image optimization: paid feature, easy to forget the cap
  • Vercel bandwidth: 100GB/month on Hobby, easy to blow past
  • Cloudflare Workers: 100K requests/day on free tier
  • Netlify functions: 125K/month on Starter

The fix is upgrading to the paid tier BEFORE the spike. Vercel Pro is $20/mo and removes most caps. If you launched on Hobby + got featured: upgrade RIGHT NOW, in the moment, before more traffic hits.

What fails fourth — auth providers

Clerk's free tier is 10K MAU; their rate limits hold up well for spikes within that range. Past 10K MAU without the paid tier, weird things start happening (sign-ins fail, sessions don't refresh).

Supabase Auth: less likely to fail under spike, but the database underneath can fail (auth tables are part of the same Postgres).

The fix is upgrading auth tier before the spike, same as your other infrastructure.

What fails LAST — DNS and CDN

The edge network (Vercel's, Cloudflare's, Netlify's) handles spike traffic without breaking a sweat. Static content, cached pages, edge functions can typically serve 100K+ concurrent users without configuration.

The implication: cache aggressively. Anything you can cache at the edge does not contribute to the database / function / LLM-API load.

Mid-spike fixes (do these RIGHT NOW)

If you're in the middle of a spike:

1. Upgrade Vercel / Netlify / Cloudflare to paid tier. Removes 80% of quota issues. 5 minutes.

2. Bump your Postgres connection limit. Supabase dashboard → Settings → Database → max connections. Raise from 60 to 200 (Pro) or 400 (Team). 2 minutes.

3. Switch DATABASE_URL to the pooler endpoint if not already. 5 minutes (env var change + redeploy).

4. Add Cache-Control headers on your most-trafficked routes:

```ts
return Response.json(data, {
  headers: { "Cache-Control": "public, s-maxage=60, stale-while-revalidate=300" },
});
```

This caches at Vercel's edge for 60 seconds; serves stale up to 5 minutes while revalidating. Most spike traffic re-requests the same content; this single change can absorb 90% of the load on cached endpoints.

5. Disable expensive features that aren't critical. If your homepage runs an LLM call to "personalize" the headline, kill that for the duration of the spike. The LLM rate limit is gone; the user sees the static homepage; everyone wins.
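
A kill switch does not need to be fancy. A minimal sketch, assuming an env-var flag you flip in your host's dashboard and redeploy (all names here are hypothetical):

```ts
// DISABLE_EXPENSIVE=1 sheds every non-critical LLM call for the duration
// of the spike.
const spikeMode = process.env.DISABLE_EXPENSIVE === "1";

async function personalizeHeadline(userId: string): Promise<string> {
  // ...the expensive LLM call lives here in the real app
  return `Welcome back, ${userId}`;
}

export async function getHeadline(userId: string): Promise<string> {
  if (spikeMode) return "Ship faster."; // static fallback, zero LLM calls
  return personalizeHeadline(userId);
}
```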

6. Tweet about it. "We're handling 50K visitors / hour, things might be slow, we're scaling up, sorry for any errors, will update in 30 min" — this turns "the app is broken" into "the app is popular." The narrative impact is huge.

Pre-spike preparation (do these before next time)

After the spike calms down:

1. Run a load test. k6, autocannon, or Vercel's Speed Insights. Simulate 1,000 concurrent users hitting your homepage + signup flow + main app feature. Watch what breaks first.
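
A k6 script is a few lines of JavaScript (recent k6 versions also run TypeScript directly). A minimal sketch; the URL is a placeholder, run it with `k6 run load.ts`:

```ts
import http from "k6/http";
import { sleep } from "k6";

// 1,000 virtual users hammering the homepage for two minutes.
export const options = { vus: 1000, duration: "2m" };

export default function () {
  http.get("https://your-app.example.com/");
  sleep(1); // each VU pauses a second between requests
}
```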

2. Set up monitoring with alerts. Sentry for errors, Vercel / Cloudflare Analytics for traffic, Supabase dashboard for DB connection count. Alerts at 80% of any quota.

3. Document your "spike playbook." A 1-page runbook in your README — "if traffic spikes, do these 5 things in order." Future-you (or your first hire) will thank you.

4. Pre-warm caches. If certain routes are slow on cold-start, hit them on a cron schedule to keep them warm.
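
On Vercel this can be a cron-triggered route. A sketch, assuming a hypothetical /api/warm handler registered in vercel.json's "crons" array; the slow routes are placeholders:

```ts
// Hit by the cron every few minutes so these routes never go cold.
const SLOW_ROUTES = ["/api/feed", "/api/search"];

export async function GET() {
  await Promise.all(
    SLOW_ROUTES.map((r) => fetch(`https://your-app.example.com${r}`)),
  );
  return Response.json({ warmed: SLOW_ROUTES });
}
```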

5. Set up the [Securie](/signup) cost-firewall. A spike that includes ANY LLM-driven endpoint can produce a $4,000 OpenAI bill in 4 hours. The cost-firewall throttles when per-tenant spend crosses your tier's soft cap; you find out about the spike from your dashboard, not from the provider's invoice.
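
If you want the bare-bones version of the pattern while you set that up (a generic sketch, not Securie's implementation): keep a per-tenant spend counter in KV and refuse LLM calls past a soft cap. The cap value and key scheme are illustrative:

```ts
import Redis from "ioredis";

const kv = new Redis(process.env.REDIS_URL!);
const SOFT_CAP_USD = 50; // hypothetical per-tenant daily cap

export async function chargeOrThrow(tenantId: string, estimatedUsd: number) {
  // One counter per tenant per UTC day, e.g. "spend:acme:2025-01-01".
  const day = new Date().toISOString().slice(0, 10);
  const key = `spend:${tenantId}:${day}`;

  const total = parseFloat(await kv.incrbyfloat(key, estimatedUsd));
  await kv.expire(key, 86400); // counter resets after a day

  if (total > SOFT_CAP_USD) {
    throw new Error("tenant over daily LLM spend cap; request throttled");
  }
}
```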

6. Add a status page. Better Stack, Atlassian Statuspage, or a Notion page that you can update during incidents. Even a simple "all systems operational" page that you flip to "investigating" during incidents builds enormous trust.

What spike traffic teaches you about your infrastructure

The spike is a free stress test. Whatever breaks under spike traffic was going to break eventually; the spike just forced the timing.

Use the post-mortem to fix the actual issues, not just the symptoms. "Database exhausted" is the symptom; "we don't have connection pooling" is the issue. Fix the issue and the next spike doesn't break anything.

Most solo founders panic during the first spike, quick-fix the symptoms, and never come back to fix the issues. The next spike (which always comes) breaks the same things again.

A note on attacks during spikes

Spike traffic is correlated with attack traffic. Visibility = popularity = "let me try to break this app."

What attackers try during a spike:

  • BOLA / IDOR — enumerating user data while you're distracted
  • Auth brute-force — credential stuffing while your monitoring is overwhelmed
  • Webhook spoofing — sending fake Stripe / Clerk events while you're not watching
  • API abuse — finding rate-limit gaps to scrape data

The defenses are the same as anytime: rate limiting, sandbox-verified auth checks, signed webhooks. Securie catches these structurally on every PR; if your defenses are in place before the spike, the attacks during the spike fail.
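
Webhook signing is the cheapest of those to verify correctly. A sketch with Stripe's official SDK; the route shape assumes a web-standard Request handler:

```ts
import Stripe from "stripe";

const stripe = new Stripe(process.env.STRIPE_SECRET_KEY!);

export async function POST(req: Request) {
  const sig = req.headers.get("stripe-signature")!;
  const body = await req.text(); // raw body; parsing it first breaks verification

  try {
    // constructEvent throws if the signature doesn't match, so spoofed
    // events never reach your handlers.
    const event = stripe.webhooks.constructEvent(
      body,
      sig,
      process.env.STRIPE_WEBHOOK_SECRET!,
    );
    // ...handle event.type here
    return new Response("ok", { status: 200 });
  } catch {
    return new Response("invalid signature", { status: 400 });
  }
}
```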
