What is Sandbox replay?

Updated May 1, 2026

Sandbox replay means reproducing a flagged finding as a working exploit in an isolated sandbox before the finding ships. It is the structural commitment behind 'prove, don't flag': if the exploit cannot be reproduced, the finding is dropped.

Full explanation

Sandbox replay is the verification step that separates "pattern-match and hope it is exploitable" from "prove it is exploitable". Securie's sandbox-v0 crate uses Firecracker microVMs seeded with fixture state that mirrors the customer app's vulnerable code path. For each finding above the severity threshold, the sandbox executes the matching exploit: a cross-user request for BOLA, a cross-user INSERT for RLS, a live API call for secret validation, or an attacker payload for SSRF, SQLi, and command injection. The verdict is `Proven`, `Unproven`, or `Flake`. Only `Proven` findings reach the customer's PR comment. The reproduced exploit and the verification trace are persisted as a `ProofArtifact` and signed into the attestation bundle for auditor verification.
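The verdict gate described above can be sketched in a few lines of Rust. This is a minimal illustration, not the sandbox-v0 API: only the three verdict names come from this article, while `ReplayedFinding`, its fields, and `reportable` are hypothetical names chosen for the example.

```rust
// Hypothetical types for illustration; the real sandbox-v0 API is not shown here.

/// Outcome of replaying one candidate finding in the sandbox.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Verdict {
    Proven,
    Unproven,
    Flake,
}

/// A candidate finding paired with its replay verdict (field names are assumptions).
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct ReplayedFinding {
    pub id: String,
    pub verdict: Verdict,
}

/// Keep only the findings whose exploit was actually reproduced.
/// `Unproven` and `Flake` findings are filtered out and never reach the PR comment.
pub fn reportable(findings: Vec<ReplayedFinding>) -> Vec<ReplayedFinding> {
    findings
        .into_iter()
        .filter(|f| f.verdict == Verdict::Proven)
        .collect()
}
```

Given one finding per verdict, `reportable` returns only the `Proven` one. The sketch simply drops `Flake` results; how they are treated in practice (for example, retried) is not specified here.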

Example

A Server Action looks like it has BOLA: it takes user_id from FormData and queries the database. The sandbox seeds two users, signs in as user A, and calls the Server Action with user B's ID. If the response contains user B's data, the verdict is `Proven` and the finding ships with the exploit trace and the fix as a Suggested Change. If user A's auth context is enforced upstream and the request returns an error, the verdict is `Unproven` and the finding is silently dropped, so no false positive reaches the customer.
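The replay logic in this example can be modeled with an in-memory stand-in for the database and the Server Action. This is a simplified sketch under stated assumptions: the real sandbox runs the customer's code inside a microVM, whereas here `Db`, `seed_two_users`, `vulnerable_action`, `guarded_action`, and `replay_bola` are all hypothetical names invented for illustration.

```rust
use std::collections::HashMap;

/// Verdict names follow the article; this sketch never emits `Flake`.
#[derive(Debug, PartialEq, Eq)]
pub enum Verdict {
    Proven,
    Unproven,
    Flake,
}

/// Fixture state (hypothetical): user id -> that user's private record.
pub type Db = HashMap<u32, String>;

/// Seed two users so a cross-user request is possible.
pub fn seed_two_users() -> Db {
    let mut db = HashMap::new();
    db.insert(1, "alice-private-record".to_string());
    db.insert(2, "bob-private-record".to_string());
    db
}

/// Stand-in for a vulnerable action: trusts the user_id from the form
/// and ignores who is signed in.
pub fn vulnerable_action(db: &Db, _session_user: u32, form_user_id: u32) -> Result<String, String> {
    db.get(&form_user_id).cloned().ok_or_else(|| "not found".to_string())
}

/// Stand-in for an action whose auth context is enforced before the query.
pub fn guarded_action(db: &Db, session_user: u32, form_user_id: u32) -> Result<String, String> {
    if session_user != form_user_id {
        return Err("forbidden".to_string());
    }
    db.get(&form_user_id).cloned().ok_or_else(|| "not found".to_string())
}

/// Sign in as user A, request user B's data, and check whether it leaks.
pub fn replay_bola<F>(action: F) -> Verdict
where
    F: Fn(&Db, u32, u32) -> Result<String, String>,
{
    let db = seed_two_users();
    let (user_a, user_b) = (1, 2);
    let victim_record = &db[&user_b];
    match action(&db, user_a, user_b) {
        // User B's record came back to user A: the exploit is reproduced.
        Ok(body) if body.contains(victim_record.as_str()) => Verdict::Proven,
        // An error or a response without B's data: nothing to report.
        _ => Verdict::Unproven,
    }
}
```

Calling `replay_bola(vulnerable_action)` yields `Verdict::Proven`, while `replay_bola(guarded_action)` yields `Verdict::Unproven`, matching the two outcomes described above.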

FAQ

Doesn't sandbox replay miss bugs that need full app context?

Yes, some. Sandbox seeds approximate the vulnerable code path; pathological cases involving multi-step state or specific feature-flag conditions may not reproduce. The trade-off is structural: every finding that does reach the customer is verified to be exploitable. False positives are eliminated by construction; the residual false-negative rate is bounded by sandbox-seed fidelity.

Is sandbox replay the same as DAST?

Related but different. DAST scans a running application, typically a staging or production deployment, with attacker payloads. Sandbox replay runs each candidate finding against a copy of the application's vulnerable code path, using the specific exploit that Securie's specialist hypothesized. Sandbox replay is finding-specific, deterministic, and faster than full DAST; DAST covers a broader surface but with more noise.