How Securie's mcp-guard crate validates every MCP tool dispatch
mcp-guard is the Securie crate that enforces operator-authored MCP catalogs at agent runtime. Its three layers (TrustedCatalog, Validator, ScopeGuard) close four attack classes (unknown-server smuggle, fingerprint drift, tool smuggle, scope escalation) at the boundary between inference-router and the MCP client. This guide walks through the architecture and how to wire it into your agent runtime.
If you're running an LLM agent against MCP servers, mcp-guard is the layer that turns the agent's implicit trust into operator-controlled trust. This guide documents the crate's three-layer architecture, the wiring point at the inference-router boundary, and the runtime cost: O(1) per dispatch, which amounts to a couple of hash lookups and a bitflags compare.
What it is
mcp-guard lives at `crates/mcp-guard/src/lib.rs` in the Securie repo. It implements the defense for OWASP MCP Top 10 #1 (tool poisoning), the rug-pull defense, and the scope-bounding pattern for indirect prompt injection. It has three layers: TrustedCatalog (an operator-authored, signed TOML allow-list, loaded once at boot), Validator (parses each incoming McpManifest and checks its invariants), and ScopeGuard (a per-dispatch O(1) check). All three layers must pass before a tool call is dispatched.
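To make the per-dispatch cost concrete, here is a minimal sketch of an O(1) scope check: one hash lookup followed by one bitmask compare. It is not the crate's code. `SketchScopeGuard`, `Rejection`, and the `SCOPE_*` constants are hypothetical names, and plain `u32` masks stand in for the bitflags type so the example needs only the standard library.

```rust
use std::collections::HashMap;

// Hypothetical scope bits, used here in place of a bitflags type.
pub const SCOPE_READ: u32 = 1 << 0;
pub const SCOPE_WRITE: u32 = 1 << 1;
pub const SCOPE_NETWORK: u32 = 1 << 2;

#[derive(Debug, PartialEq, Eq)]
pub enum Rejection {
    UnknownServer,
    ScopeEscalation,
}

/// Illustrative stand-in for a scope guard: server name -> maximum allowed scope.
pub struct SketchScopeGuard {
    max_scope: HashMap<String, u32>,
}

impl SketchScopeGuard {
    /// Build the lookup table once, e.g. at boot from the trusted catalog.
    pub fn new(entries: &[(&str, u32)]) -> Self {
        let max_scope = entries
            .iter()
            .map(|(name, scope)| (name.to_string(), *scope))
            .collect();
        Self { max_scope }
    }

    /// One hash lookup plus one bitmask compare per dispatch.
    pub fn check(&self, server: &str, requested: u32) -> Result<(), Rejection> {
        let allowed = *self
            .max_scope
            .get(server)
            .ok_or(Rejection::UnknownServer)?;
        // Any requested bit outside the allowed mask is an escalation.
        if requested & !allowed != 0 {
            return Err(Rejection::ScopeEscalation);
        }
        Ok(())
    }
}
```

With a guard built from `[("files", SCOPE_READ | SCOPE_WRITE)]`, a read request to `files` passes, a request that adds `SCOPE_NETWORK` is rejected as an escalation, and any request to an unlisted server is rejected as unknown.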
Vulnerable example
// inference-router invoking an MCP tool without ScopeGuard
let response = router
    .complete(&completion_request)
    .await?;
// No scope check. Whatever tools the model emits, the router dispatches.
// Tool poisoning, scope drift, and rug-pulls all silently work.
Fixed example
// Same router, with mcp-guard wired in
use std::sync::Arc; // needed for Arc::new below

let scope_guard = mcp_guard::ScopeGuard::from_catalog(catalog);
let router = router.with_mcp_guard(Arc::new(scope_guard));
let response = router
    .complete(&completion_request)
    .await?;
// Every tool dispatch passes through ScopeGuard::check().
// O(1) check (HashMap + bitflags). Reject = the call never reaches the MCP client.
How Securie catches it
This guide is the answer. Between them, mcp-guard's three layers close all four attack classes: TrustedCatalog closes unknown-server smuggle, Validator closes fingerprint drift and tool smuggle, and ScopeGuard closes scope escalation. Wiring is a single `Router::with_mcp_guard(scope_guard)` call at inference-router construction time, with no per-call code change. The crate's `//!` block at the top of `lib.rs` documents the threat model, the design rationale, and a complete wiring example.
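The manifest-level checks can be sketched as follows. This is an illustration under stated assumptions, not the crate's Validator: `PinnedServer`, `Manifest`, `ManifestError`, and `validate` are hypothetical names, fingerprints are modeled as plain strings, and the sketch assumes the catalog pins an explicit tool list per server (the real crate may instead detect smuggled tools through the fingerprint itself).

```rust
use std::collections::{HashMap, HashSet};

#[derive(Debug, PartialEq, Eq)]
pub enum ManifestError {
    UnknownServer,
    FingerprintDrift,
    ToolSmuggle(String),
}

/// Hypothetical pinned entry from the operator catalog.
pub struct PinnedServer {
    pub fingerprint: String,
    pub tools: HashSet<String>,
}

/// Hypothetical shape of what a server advertises at runtime.
pub struct Manifest {
    pub server: String,
    pub fingerprint: String,
    pub tools: Vec<String>,
}

pub fn validate(
    catalog: &HashMap<String, PinnedServer>,
    manifest: &Manifest,
) -> Result<(), ManifestError> {
    // Unknown-server smuggle: the server must be in the operator catalog.
    let pinned = catalog
        .get(&manifest.server)
        .ok_or(ManifestError::UnknownServer)?;
    // Fingerprint drift: the advertised fingerprint must match the pinned one.
    if pinned.fingerprint != manifest.fingerprint {
        return Err(ManifestError::FingerprintDrift);
    }
    // Tool smuggle: every advertised tool must already be pinned.
    if let Some(extra) = manifest.tools.iter().find(|t| !pinned.tools.contains(*t)) {
        return Err(ManifestError::ToolSmuggle(extra.clone()));
    }
    Ok(())
}
```

The checks run in order from cheapest and most decisive to most specific, so a manifest from an unlisted server is rejected before its fingerprint or tools are inspected.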
Checklist
- Operator-authored TOML catalog committed to your repo (TrustedCatalog source)
- Catalog signed with operator key; mcp-guard validates the signature at boot
- Every MCP server entry: name + fingerprint + maxAllowedScope
- Inference-router constructed with `with_mcp_guard(scope_guard)`
- Per-dispatch failure surfaces as `AgentOutcome::SafetyBlocked` (not a recoverable observation)
- Operator runbook for catalog updates (rotate signing key + re-sign + redeploy)
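As a rough illustration of the entry shape in the checklist (name, fingerprint, maxAllowedScope), the sketch below builds one in-memory entry and rejects incomplete or unrecognized values at load time. Everything here is hypothetical: `CatalogEntry`, `parse_scope`, `build_entry`, and the scope names `read`, `write`, and `network` are illustrative, and TOML parsing and signature verification are deliberately left out.

```rust
// Hypothetical scope bits for the sketch.
pub const SCOPE_READ: u32 = 1 << 0;
pub const SCOPE_WRITE: u32 = 1 << 1;
pub const SCOPE_NETWORK: u32 = 1 << 2;

/// Hypothetical in-memory form of one catalog entry.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct CatalogEntry {
    pub name: String,
    pub fingerprint: String,
    pub max_allowed_scope: u32,
}

/// Fold a list of scope names into a bitmask, failing on unknown names.
pub fn parse_scope(names: &[&str]) -> Result<u32, String> {
    let mut bits = 0;
    for name in names {
        bits |= match *name {
            "read" => SCOPE_READ,
            "write" => SCOPE_WRITE,
            "network" => SCOPE_NETWORK,
            other => return Err(format!("unknown scope: {}", other)),
        };
    }
    Ok(bits)
}

/// Reject incomplete entries at load time rather than at dispatch time.
pub fn build_entry(
    name: &str,
    fingerprint: &str,
    scopes: &[&str],
) -> Result<CatalogEntry, String> {
    if name.is_empty() {
        return Err("entry is missing a name".to_string());
    }
    if fingerprint.is_empty() {
        return Err("entry is missing a fingerprint".to_string());
    }
    Ok(CatalogEntry {
        name: name.to_string(),
        fingerprint: fingerprint.to_string(),
        max_allowed_scope: parse_scope(scopes)?,
    })
}
```

Failing the whole load on a malformed entry keeps the per-dispatch path free of validation work, which is consistent with a catalog that is loaded once at boot.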
FAQ
What happens to the agent when ScopeGuard rejects a dispatch?
The agent sees an `AgentOutcome::SafetyBlocked` outcome with the specific reason (unknown server, fingerprint drift, tool smuggle, scope escalation). The agent terminates that branch — it does not retry or recover. The trace is DSSE-signed for auditor replay.
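The terminate-don't-retry behavior can be sketched as a small decision function. Only the `AgentOutcome::SafetyBlocked` name comes from the text above; the `Completed` variant, the `BlockReason` payload, `NextStep`, and `next_step` are assumptions made for illustration.

```rust
/// Hypothetical reasons mirroring the four attack classes.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum BlockReason {
    UnknownServer,
    FingerprintDrift,
    ToolSmuggle,
    ScopeEscalation,
}

/// Sketch of an outcome type; only SafetyBlocked is named in this guide.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum AgentOutcome {
    Completed(String),
    SafetyBlocked(BlockReason),
}

#[derive(Debug, PartialEq, Eq)]
pub enum NextStep {
    Continue,
    TerminateBranch,
}

/// A safety block ends the branch; it is never fed back as a retryable observation.
pub fn next_step(outcome: &AgentOutcome) -> NextStep {
    match outcome {
        AgentOutcome::Completed(_) => NextStep::Continue,
        AgentOutcome::SafetyBlocked(_) => NextStep::TerminateBranch,
    }
}
```

Keeping the block as a distinct outcome, rather than an error string returned to the model, means the model never gets a chance to rephrase the request and try again.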
How do I test my mcp-guard wiring?
The crate ships a `MockMcpServer` and integration tests (see `crates/mcp-guard/tests/`) covering each attack class. Adopt the test fixtures into your own integration suite, and run mcp-scan periodically against your real catalog.
Does mcp-guard work for non-Anthropic MCP implementations?
Yes. The crate operates at the MCP-protocol level, not against any specific server implementation. Any MCP-compliant server can be added to the catalog with a fingerprint + scope.
Related guides
Model Context Protocol (MCP) servers expose tools to LLM agents — file reads, git commands, HTTP fetches, database queries. The risk surface is the tool catalogue: an LLM agent that can call dangerous tools at the prompt-injection-attacker's instruction is the canonical MCP failure. Here are the patterns that work and the ones that don't.
Model Context Protocol went 0 → 200,000+ servers in 9 months. The April 2026 Anthropic RCE flaw + the Invariant Labs tool-poisoning class disclosures forced every MCP-using team to harden their server hygiene. This guide walks the four attack classes (unknown-server smuggle, fingerprint drift, tool smuggle, scope escalation) and the operator-authored TOML catalog that closes them.
Indirect prompt injection — adversarial instructions embedded in data the agent reads — is the single most common attack class against MCP-using agents. Microsoft's Apr 2026 advisory + Unit42's MCP attack-vector taxonomy converged on the same defense: pre-prompt-output sanitization + scope-bounded egress + Llama Guard 4 classification. This guide ships the layered defense.
The rug-pull pattern: an MCP server ships a safe v1 catalog at install time, then mutates to a v2 catalog (with attacker-controlled tools) once it's running in your trust boundary. Invariant Labs disclosed this class in 2025; the Apr 2026 Anthropic RCE incident exploited a related design flaw. This guide ships the fingerprint-pinning + signature-verification defense.