How Securie validates every MCP tool dispatch
Securie enforces operator-authored MCP catalogs at agent runtime. Three layers — a signed trusted-server catalog, a manifest validator, and a per-dispatch scope check — close the four attack classes (unknown-server smuggle, fingerprint drift, tool smuggle, scope escalation) before any MCP tool runs. This guide walks the architecture and the trust model.
If you're running an LLM agent against MCP servers, Securie's MCP trust-enforcement layer turns the agent's implicit trust into operator-controlled trust. This guide documents the three-layer architecture, where enforcement sits in the request path, and why it adds negligible cost per tool dispatch.
What it is
Securie's MCP trust-enforcement layer implements three defenses: the OWASP MCP Top 10 #1 (tool poisoning) defense, the rug-pull defense, and the scope-bounding pattern for indirect prompt injection. It has three layers: a signed trusted-server catalog (operator-authored, loaded once at boot), a validator that parses every incoming tool manifest and checks it against the catalog, and a per-dispatch scope check. All three layers must pass for a tool call to dispatch.
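The three-layer check described above can be sketched roughly as follows. This is a minimal illustration, not Securie's actual implementation: the names `CatalogEntry`, `Dispatch`, `Scope`, `Rejection`, `check_dispatch`, and `sample_catalog` are all hypothetical, the two-level scope ladder is invented for the example, and treating "tool smuggle" as "a tool not declared in the pinned manifest" is an assumption.

```rust
use std::collections::{HashMap, HashSet};

/// Why a dispatch was refused; one variant per attack class named in this guide.
#[derive(Debug, PartialEq)]
enum Rejection {
    UnknownServer,
    FingerprintDrift,
    ToolSmuggle,
    ScopeEscalation,
}

/// Hypothetical scope ladder; the ordering comes from declaration order.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Scope {
    ReadOnly,
    ReadWrite,
}

/// One trusted-server entry (assumed shape: fingerprint, declared tools, scope ceiling).
struct CatalogEntry {
    fingerprint: String,
    tools: HashSet<String>,
    max_allowed_scope: Scope,
}

/// What the agent is about to dispatch.
struct Dispatch<'a> {
    server: &'a str,
    manifest_fingerprint: &'a str,
    tool: &'a str,
    requested_scope: Scope,
}

/// Runs the three layers in order; the first failing layer rejects the call.
fn check_dispatch(
    catalog: &HashMap<String, CatalogEntry>,
    d: &Dispatch,
) -> Result<(), Rejection> {
    // Layer 1: the server must appear in the trusted catalog.
    let entry = catalog.get(d.server).ok_or(Rejection::UnknownServer)?;
    // Layer 2: the manifest must match the pinned fingerprint...
    if entry.fingerprint != d.manifest_fingerprint {
        return Err(Rejection::FingerprintDrift);
    }
    // ...and the tool must be one the pinned manifest declares.
    if !entry.tools.contains(d.tool) {
        return Err(Rejection::ToolSmuggle);
    }
    // Layer 3: the requested scope must not exceed the catalog ceiling.
    if d.requested_scope > entry.max_allowed_scope {
        return Err(Rejection::ScopeEscalation);
    }
    Ok(())
}

/// A one-server catalog used only for illustration.
fn sample_catalog() -> HashMap<String, CatalogEntry> {
    let mut catalog = HashMap::new();
    catalog.insert(
        "files".to_string(),
        CatalogEntry {
            fingerprint: "fp-v1".to_string(),
            tools: HashSet::from(["read_file".to_string()]),
            max_allowed_scope: Scope::ReadOnly,
        },
    );
    catalog
}
```

In this sketch, a dispatch to the `files` server with fingerprint `fp-v1`, tool `read_file`, and `Scope::ReadOnly` passes; changing any one of those four fields trips the corresponding rejection.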
Vulnerable example
// An LLM agent invoking an MCP tool with no scope enforcement
let response = router
    .complete(&completion_request)
    .await?;
// No scope check. Whatever tools the model emits, the agent dispatches.
// Tool poisoning, scope drift, rug-pulls all silently work.
Fixed example
# The same agent, with MCP trust enforcement enabled.
# Every tool dispatch is checked, in order:
# 1. is the server in the operator-signed trusted catalog?
# 2. does its manifest match the pinned fingerprint?
# 3. is the requested tool within the agent's granted scope?
# Any check fails -> the call never reaches the MCP client.
How Securie catches it
The three layers together close all four attack classes: the trusted-server catalog closes unknown-server smuggle, the manifest validator closes fingerprint drift and tool smuggle, and the per-dispatch scope check closes scope escalation. Enforcement is wired in once, at agent-router construction, so no per-call code changes are needed.
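One way to picture "wired in once at construction" is a wrapper that owns both the MCP client and the policy, so every call passes through the check without the caller doing anything extra. The sketch below is an assumption about the general pattern, not Securie's API: `ToolClient`, `DispatchPolicy`, `EnforcingClient`, `CountingClient`, and `AllowList` are hypothetical names, and the allow-list policy stands in for the full three-layer check.

```rust
/// Hypothetical stand-in for the real MCP client.
trait ToolClient {
    fn call(&mut self, server: &str, tool: &str) -> String;
}

/// Hypothetical policy interface: Ok(()) to allow, Err(reason) to block.
trait DispatchPolicy {
    fn allow(&self, server: &str, tool: &str) -> Result<(), String>;
}

/// Wraps a client so the policy runs before every call.
struct EnforcingClient<C, P> {
    inner: C,
    policy: P,
}

impl<C: ToolClient, P: DispatchPolicy> EnforcingClient<C, P> {
    /// Enforcement is attached here, once, at construction time.
    fn new(inner: C, policy: P) -> Self {
        Self { inner, policy }
    }

    /// A blocked call returns early and never reaches the inner client.
    fn call(&mut self, server: &str, tool: &str) -> Result<String, String> {
        self.policy.allow(server, tool)?;
        Ok(self.inner.call(server, tool))
    }
}

/// Test double that counts how many calls actually got through.
struct CountingClient {
    calls: usize,
}

impl ToolClient for CountingClient {
    fn call(&mut self, server: &str, tool: &str) -> String {
        self.calls += 1;
        format!("{}/{}: ok", server, tool)
    }
}

/// Simplified policy: a fixed list of permitted (server, tool) pairs.
struct AllowList(Vec<(&'static str, &'static str)>);

impl DispatchPolicy for AllowList {
    fn allow(&self, server: &str, tool: &str) -> Result<(), String> {
        if self.0.iter().any(|(s, t)| *s == server && *t == tool) {
            Ok(())
        } else {
            Err(format!("blocked: {}/{}", server, tool))
        }
    }
}
```

Because callers only ever hold the `EnforcingClient`, there is no code path that reaches the inner client without the policy check, which is the property the paragraph above describes.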
Checklist
- Operator-authored TOML catalog committed to your repo
- Catalog signed with an operator key; the signature is validated at boot
- Every MCP server entry: name + fingerprint + maxAllowedScope
- The agent router constructed with MCP trust enforcement enabled
- A rejected dispatch halts that branch; it is not treated as a recoverable observation
- Operator runbook for catalog updates (rotate the signing key, re-sign, redeploy)
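To make the fingerprint entry in the checklist concrete, here is a rough sketch of pinning a manifest and detecting drift. Everything in it is illustrative: `Tool`, `manifest_fingerprint`, and `has_drifted` are hypothetical names, the canonical form (sorted name and description pairs) is an assumption, and FNV-1a is used only because it fits in a few lines of standard-library Rust. It is not cryptographic; a real deployment would use a collision-resistant digest such as SHA-256.

```rust
/// FNV-1a 64-bit. NOT cryptographic: a stand-in for a real digest (e.g. SHA-256).
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf29ce484222325; // FNV offset basis
    for b in bytes {
        hash ^= u64::from(*b);
        hash = hash.wrapping_mul(0x100000001b3); // FNV prime
    }
    hash
}

/// Hypothetical minimal view of one tool in a server's manifest.
struct Tool {
    name: String,
    description: String,
}

/// Convenience constructor for the examples.
fn tool(name: &str, description: &str) -> Tool {
    Tool {
        name: name.to_string(),
        description: description.to_string(),
    }
}

/// Fingerprints a manifest over a canonical form so that tool order does not matter.
fn manifest_fingerprint(tools: &[Tool]) -> String {
    let mut lines: Vec<String> = tools
        .iter()
        .map(|t| format!("{}\u{0}{}", t.name, t.description))
        .collect();
    lines.sort(); // canonical ordering
    let canonical = lines.join("\n");
    format!("{:016x}", fnv1a64(canonical.as_bytes()))
}

/// True when the live manifest no longer matches the fingerprint pinned in the catalog.
fn has_drifted(pinned: &str, tools: &[Tool]) -> bool {
    manifest_fingerprint(tools) != pinned
}
```

Hashing the descriptions as well as the names matters in this sketch: a server that keeps the same tool names but rewrites a description (the tool-poisoning vector) would otherwise go unnoticed.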
FAQ
What happens to the agent when a tool dispatch is rejected?
The agent sees a safety-blocked outcome with the specific reason (unknown server, fingerprint drift, tool smuggle, scope escalation). The agent terminates that branch — it does not retry or recover. The trace is DSSE-signed for auditor replay.
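The "terminate, don't retry" behavior can be sketched as a branch loop that returns as soon as it sees a blocked outcome. The types below (`BlockReason`, `StepOutcome`, `BranchResult`, `run_branch`) are hypothetical and exist only to show the control flow; trace signing is out of scope for this sketch.

```rust
/// The four rejection reasons named in this guide.
#[derive(Debug, PartialEq)]
enum BlockReason {
    UnknownServer,
    FingerprintDrift,
    ToolSmuggle,
    ScopeEscalation,
}

/// Result of one tool dispatch within a branch (hypothetical shape).
#[derive(Debug, PartialEq)]
enum StepOutcome {
    Observation(String),
    SafetyBlocked(BlockReason),
}

/// How a branch ended.
#[derive(Debug, PartialEq)]
enum BranchResult {
    Completed(Vec<String>),
    Halted { at_step: usize, reason: BlockReason },
}

/// Consumes step outcomes in order; the first block ends the branch for good.
fn run_branch(steps: Vec<StepOutcome>) -> BranchResult {
    let mut observations = Vec::new();
    for (i, outcome) in steps.into_iter().enumerate() {
        match outcome {
            StepOutcome::Observation(text) => observations.push(text),
            // No retry and no fallback: the block is terminal, and the
            // reason is carried out so it can be recorded in the trace.
            StepOutcome::SafetyBlocked(reason) => {
                return BranchResult::Halted { at_step: i, reason };
            }
        }
    }
    BranchResult::Completed(observations)
}
```

Modelling the block as its own variant rather than as an error string inside `Observation` keeps the model from seeing it as just another tool result it could try to work around.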
How do I test MCP trust enforcement?
Securie ships a mock MCP server and integration tests covering each attack class. Adopt the test fixtures into your own integration suite, and run mcp-scan periodically against your real catalog.
Does it work for non-Anthropic MCP implementations?
Yes. Enforcement operates at the MCP-protocol level, not against any specific server implementation. Any MCP-compliant server can be added to the catalog with a fingerprint + scope.
Related guides
Model Context Protocol (MCP) servers expose tools to LLM agents — file reads, git commands, HTTP fetches, database queries. The risk surface is the tool catalogue: an LLM agent that can call dangerous tools at the prompt-injection-attacker's instruction is the canonical MCP failure. Here are the patterns that work and the ones that don't.
Model Context Protocol went 0 → 200,000+ servers in 9 months. The April 2026 Anthropic RCE flaw + the Invariant Labs tool-poisoning class disclosures forced every MCP-using team to harden their server hygiene. This guide walks the four attack classes (unknown-server smuggle, fingerprint drift, tool smuggle, scope escalation) and the operator-authored TOML catalog that closes them.
Indirect prompt injection (adversarial instructions embedded in data the agent reads) is the single most common attack class against MCP-using agents. Microsoft's April 2026 advisory and Unit42's MCP attack-vector taxonomy converged on the same defense: pre-prompt-output sanitization, scope-bounded egress, and Llama Guard 4 classification. This guide ships the layered defense.
The rug-pull pattern: an MCP server ships a safe v1 catalog at install time, then mutates to a v2 catalog (with attacker-controlled tools) once it's running inside your trust boundary. Invariant Labs disclosed this class in 2025; the April 2026 Anthropic RCE incident exploited a related design flaw. This guide ships the fingerprint-pinning and signature-verification defense.