A $200 million contract between Anthropic and the U.S. Department of Defense just collapsed. Not over technology or capability, but over what words should appear in a document. Anthropic wanted explicit prohibitions on mass domestic surveillance and autonomous weapons; the Pentagon demanded the right to use the technology for “any lawful use.” When no compromise materialized, the administration designated Anthropic a supply-chain risk and Anthropic filed suit. Hours later, OpenAI signed its own deal, citing existing law as sufficient protection.
All three parties entered this dispute with the same underlying assumption: that contract language is what governs AI behavior at runtime. But a contract is a legal instrument, enforced after the fact through discovery and litigation. It has no presence inside a compute environment, no ability to observe what a model is doing or intervene when it shouldn't be. And when a $200 million dispute plays out in public, as this one just did, no party can point to cryptographic proof of what actually happened.
Contract language was never designed to enforce behavior inside a compute environment, and no amount of negotiation will change that. Runtime enforcement is an engineering problem, not a legal one.
Where Governance Ends and Execution Begins
Every approach in this dispute is a variant of contractual governance: Anthropic's explicit prohibitions, OpenAI's statutory references, even Anthropic's now-revised Responsible Scaling Policy, which the company modified the same week the Pentagon dispute became public, replacing a hard commitment to pause training with a nonbinding framework of goals. All three rely on the same enforcement model: declarations of intent, enforced through legal mechanisms, after a violation has already occurred.
Once an AI model is deployed in a runtime environment, no technical mechanism proves governance was maintained. Log files are mutable. They can be altered, deleted, or fabricated without cryptographic detection. Meanwhile, the contract that supposedly governs the model's behavior sits in a filing cabinet with no connection whatsoever to the compute environment where the model is actually executing.
In classified environments, the deploying organization controls the audit infrastructure entirely. The model provider can't independently verify what happened. In enterprise environments, the same gap exists: the audit trail proving an AI agent complied with its governance policy is stored in the same mutable infrastructure the agent operates within. The Pentagon dispute made this gap visible. But every organization deploying AI agents has the same exposure.
The entire debate has focused on what the contract should say. Nobody is asking the harder question: how does either party prove, with cryptographic certainty, that an AI system was or was not used for a specific purpose? Contractual governance can't answer that. The real demand here isn't “trust me.” It's “verify me.” That kind of verification is an engineering capability, and it has to be built into the system architecture.
The Fail-Closed Mandate
Governance that can't be independently verified is indistinguishable from a promise. Closing that gap requires hard architectural constraints baked into the system itself, requirements the system either satisfies or fails to satisfy, with no room for interpretation.
The agent holds no keys.
Governance constraints must be encoded into cryptographically signed objects that a mandatory runtime boundary parses and enforces. Not advises. Enforces. Modification of the governance object invalidates its signature. The governed agent can't alter its own constraints because it holds no cryptographic keys; only the enforcement boundary has signing authority. If the governance object fails verification for any reason (invalid signature, expired effective period, initial integrity mismatch), execution is blocked. The default state is denial. Fail-closed semantics. This is the architectural principle that separates technical governance from policy governance: the agent can't self-authorize.
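The fail-closed logic above can be sketched in a few lines. This is an illustrative toy, not the AGA implementation: HMAC-SHA256 stands in for the Ed25519 signatures described in the text so the example runs with only the standard library, and the field names (`not_before`, `not_after`, `subject`) are hypothetical.

```python
import hashlib
import hmac
import json
import time

# Key held only by the enforcement boundary. The governed agent never sees it,
# so it cannot re-sign a modified governance object.
PORTAL_KEY = b"portal-only-signing-key"


def seal(governance: dict) -> dict:
    """Sign the canonicalized governance object. Only the boundary can do this."""
    payload = json.dumps(governance, sort_keys=True, separators=(",", ":")).encode()
    sig = hmac.new(PORTAL_KEY, payload, hashlib.sha256).hexdigest()
    return {"object": governance, "sig": sig}


def authorize(sealed: dict, now: float) -> bool:
    """Fail-closed check: any parse, signature, or expiry failure blocks execution."""
    try:
        payload = json.dumps(sealed["object"], sort_keys=True, separators=(",", ":")).encode()
        expected = hmac.new(PORTAL_KEY, payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, sealed["sig"]):
            return False  # signature invalid: deny
        if not (sealed["object"]["not_before"] <= now <= sealed["object"]["not_after"]):
            return False  # outside effective period: deny
        return True
    except (KeyError, TypeError):
        return False  # malformed object: the default state is denial


artifact = seal({"subject": "agent-7", "not_before": 0, "not_after": 2e10})
assert authorize(artifact, time.time())

artifact["object"]["not_after"] = 9e99  # agent tries to extend its own mandate
assert not authorize(artifact, time.time())  # tampering invalidates the signature
```

Note the shape of `authorize`: there is no code path that returns `True` by default. Every failure mode, including ones the author didn't anticipate, lands in denial.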
Every measurement produces proof.
Governance compliance must be measured continuously at a cadence defined in the sealed governance object, not checked through periodic audits. Each measurement produces a signed, tamper-evident record appended to an append-only structure. Modification of any single record invalidates every record that follows, producing a tamper-evident enforcement history that neither the deployer nor the model provider can unilaterally alter. NIST calls this the Measure function. In practice, most implementations just produce logs. This architecture produces cryptographic receipts: signed, chained, independently verifiable. In NIST SP 800-207 terms, the runtime boundary operates as a Policy Enforcement Point.
Verification works without a network.
A third party must be able to verify governance compliance without network access to the deploying organization or the model provider. Self-contained verification packages (signed artifacts, signed measurement receipts, Merkle inclusion proofs, checkpoint references) must work in air-gapped environments. For classified and disconnected deployments, this isn't a secondary feature. It's the primary requirement. EO 14110 mandates safety testing for AI with critical infrastructure impact. If the evidence of compliance requires a network callback to the system being audited, it isn't evidence. It's a trust relationship.
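Merkle inclusion proofs are what make this offline property work: a verifier holding only a single receipt, a short audit path, and a trusted root hash can confirm that the receipt belongs to the sealed history, with no network access to anyone. A self-contained sketch (the tree construction here is a common textbook variant, not necessarily the exact scheme AGA uses):

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(leaves):
    level = [h(x) for x in leaves]
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate last node on odd-sized levels
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


def inclusion_proof(leaves, index):
    """Collect the sibling at each level: the audit path for one leaf."""
    level = [h(x) for x in leaves]
    path = []
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])
        sibling = index ^ 1
        path.append((level[sibling], index % 2 == 0))  # (sibling, sibling-is-right?)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return path


def verify_inclusion(leaf, path, root):
    """Runs entirely offline: needs only the leaf, the path, and the root."""
    node = h(leaf)
    for sibling, sib_on_right in path:
        node = h(node + sibling) if sib_on_right else h(sibling + node)
    return node == root


receipts = [b"receipt-0", b"receipt-1", b"receipt-2", b"receipt-3"]
root = merkle_root(receipts)
proof = inclusion_proof(receipts, 2)
assert verify_inclusion(b"receipt-2", proof, root)
assert not verify_inclusion(b"receipt-X", proof, root)
```

The proof is logarithmic in the history size: a million receipts need an audit path of about twenty hashes, small enough to carry into an air-gapped facility on paper.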
Current Standards Are Built for Paper Audits
The current standards cover pieces of this problem, but none close the loop from sealed reference to continuous measurement to cryptographic proof. The NIST AI RMF prescribes Measure and Manage functions but no enforcement mechanism. SP 800-53 addresses integrity monitoring, SP 800-207 defines Policy Enforcement Points, SP 800-218 covers secure development, and none extend to continuous runtime measurement of AI agents against sealed, immutable references. EO 14110 mandates safety testing but not continuous post-deployment governance verification. The result is a set of frameworks built for periodic compliance audits applied to systems that make decisions in milliseconds.
This gap runs all the way down to the protocols. The Model Context Protocol, now under the Agentic AI Foundation at the Linux Foundation with over 10,000 active servers, standardizes how AI agents connect to tools and data. It doesn't standardize governance enforcement on those connections. An April 2025 security analysis identified prompt injection, tool permission exploitation, and lookalike tool attacks as outstanding MCP vulnerabilities. There's no built-in mechanism for proving that an agent operated within its authorized tool-access boundaries. MCP connects agents to capabilities. What they actually do with those capabilities, whether they stayed within their authorized boundaries, whether they accessed tools they were never supposed to touch: none of that is recorded in any verifiable way.
NIST has begun to acknowledge this gap directly. The NCCoE published a concept paper on AI Agent Identity and Authorization, and CAISI published an RFI on AI agent security (Docket NIST-2025-0035, 91 FR 698). Both solicit technical controls, not policy recommendations. We submitted detailed technical responses to both: to the NCCoE describing runtime governance integration with MCP server deployments and SPIFFE/SPIRE identity frameworks, and to CAISI describing sealed governance with continuous cryptographic enforcement receipts. Both submissions are public record.
Attested Governance Artifacts
We built AGA because the same problem kept surfacing: autonomous systems that needed to prove they were governed, not just assert it, and no tooling existed to do so. Attested Governance Artifacts (AGA) are sealed, cryptographically signed policy objects that bind agent identity to authorized behavior and enforce that binding continuously at runtime. The sealed artifact functions as an active compliance program: it doesn't record what happened. It governs what's permitted to happen. The architecture is the subject of USPTO Application No. 19/433,835, filed December 28, 2025 (20 claims, 3 independent). A working reference implementation is at github.com/attestedintelligence/AGA, with 112+ automated tests and an independent evidence verifier built with zero AGA imports.
Sealed enforcement.
The Policy Artifact is a cryptographically signed JSON object encoding a Subject Identifier (computed from hashes of the agent's normalized bytes and canonicalized metadata) along with enforcement parameters (measurement cadence, time-to-live, enforcement triggers) and a sealed hash value representing the agent's attested known-good state. Ed25519 binds all fields; modification invalidates the signature. The Portal, the mandatory runtime boundary, parses and enforces the Policy Artifact before the agent executes. Two-process mandatory separation: the Portal holds all cryptographic keys. The agent holds none. The agent can't self-authorize, self-attest, or modify its own governance parameters. If the artifact can't be parsed, if the signature is invalid, if the effective period has expired, or if a measurement doesn't match the sealed reference, execution is blocked.
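The Subject Identifier construction described above can be sketched as a composition of hashes. This is an illustrative approximation: the patent text specifies RFC 8785 canonicalization and the BLAKE2b/SHA-256 primitives listed later, whereas this toy uses `json.dumps(sort_keys=True)` as the canonicalization stand-in.

```python
import hashlib
import json


def subject_identifier(agent_bytes: bytes, metadata: dict) -> str:
    """Bind agent identity to both its normalized bytes and its metadata."""
    code_hash = hashlib.sha256(agent_bytes).digest()
    meta_canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":")).encode()
    meta_hash = hashlib.sha256(meta_canonical).digest()
    # Composite identifier: any change to code OR metadata changes the subject.
    return hashlib.sha256(code_hash + meta_hash).hexdigest()


sid = subject_identifier(b"agent binary bytes", {"name": "agent-7", "version": "1.2"})
# A one-character metadata change yields a different subject:
assert sid != subject_identifier(b"agent binary bytes", {"name": "agent-7", "version": "1.3"})
# And so does any change to the agent's bytes:
assert sid != subject_identifier(b"patched binary bytes", {"name": "agent-7", "version": "1.2"})
```

The design consequence is that a Policy Artifact signed for one agent build is unusable for any other: a recompiled or patched agent presents a different Subject Identifier and fails the fail-closed check.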
Continuous measurement with receipts.
The Portal computes a cryptographic hash of the agent's runtime state at the cadence the sealed Policy Artifact specifies and compares it to the sealed reference. Match: continue monitoring. Mismatch: execute a predetermined enforcement action, one of TERMINATE, SAFE_STATE, ALERT_ONLY, or QUARANTINE.
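The measure-compare-act loop is deliberately simple; all the judgment lives in the sealed artifact, not in the runtime. A minimal sketch, with the action names taken from the text and everything else (state capture, return values) illustrative:

```python
import hashlib

# Action vocabulary from the sealed Policy Artifact.
ACTIONS = {"TERMINATE", "SAFE_STATE", "ALERT_ONLY", "QUARANTINE"}


def measure(runtime_state: bytes) -> str:
    """Hash the agent's runtime state (state capture itself is out of scope here)."""
    return hashlib.sha256(runtime_state).hexdigest()


def enforce_tick(runtime_state: bytes, sealed_reference: str, on_mismatch: str) -> str:
    """One measurement cycle: no discretion, no interpretation at runtime."""
    assert on_mismatch in ACTIONS
    if measure(runtime_state) == sealed_reference:
        return "CONTINUE"   # match: keep monitoring
    return on_mismatch      # mismatch: the predetermined action fires


reference = measure(b"known-good state")
assert enforce_tick(b"known-good state", reference, "QUARANTINE") == "CONTINUE"
assert enforce_tick(b"tampered state", reference, "QUARANTINE") == "QUARANTINE"
```

Because the mismatch response is fixed inside the signed artifact, a compromised agent can't negotiate its way to a softer action at the moment of detection.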
Quarantine inverts the typical incident response model. The conventional approach is to kill the process, triage the damage, and try to reconstruct what happened from whatever logs survived. That approach throws away the single most valuable thing you have during a compromise: the attacker's ongoing behavior. Instead, the Portal transitions the agent to a sandboxed phantom execution environment. All connections to protected resources (physical actuators, network endpoints, data stores) are severed, but the agent doesn't know this. It continues operating, processing inputs, generating outputs, believing it's functioning normally while every output is captured rather than delivered and every input, including attacker commands, continues flowing in. The compromised agent is running a mission it will never complete, and every step of that mission is being signed and appended to the Continuity Chain as Enforcement Receipts. In defense and critical infrastructure contexts, this is the difference between an incident report that says “we detected and terminated” and one that says “we captured the complete attack sequence, cryptographically signed, while containing all damage in real time.”
Each enforcement action generates a signed Enforcement Receipt appended to the Continuity Chain, an append-only event sequence linked by structural metadata hashes. The privacy-preserving design excludes payload data from the leaf hash computation. Third parties can verify the complete integrity of the enforcement chain, confirming that every measurement occurred on schedule and every action executed as specified, without accessing sensitive operational content. You can prove governance happened without revealing what the agent was doing.
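The privacy-preserving split can be made concrete: only structural metadata enters the leaf hash, so a verifier can confirm that every measurement happened on schedule without ever seeing the payload. A sketch with hypothetical field names:

```python
import hashlib
import json

# Fields that enter the leaf hash. The payload is deliberately excluded.
STRUCTURAL_FIELDS = ("seq", "timestamp", "action", "prev_hash")


def leaf_hash(receipt: dict) -> str:
    """Hash only structural metadata; operational content never enters the chain."""
    structural = {k: receipt[k] for k in STRUCTURAL_FIELDS}
    return hashlib.sha256(json.dumps(structural, sort_keys=True).encode()).hexdigest()


full = {"seq": 1, "timestamp": 1700000000, "action": "MATCH", "prev_hash": "0" * 64,
        "payload": "sensitive operational content"}
redacted = dict(full, payload="[REDACTED]")  # what the third-party verifier receives

# Verification is identical with or without the payload:
assert leaf_hash(full) == leaf_hash(redacted)

# But the structural record is still tamper-evident:
reordered = dict(full, seq=2)
assert leaf_hash(reordered) != leaf_hash(full)
```

The trade-off is explicit: the chain proves that governance ran (every tick, every action), not what the agent was computing, which is exactly the property classified deployments need.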
Offline-verifiable evidence.
Evidence Bundles are portable verification packages containing the Policy Artifact, signed Enforcement Receipts, Merkle inclusion proofs, and checkpoint references. All verification except optional checkpoint anchor validation works fully offline. The reference implementation includes an independent verifier built with zero AGA imports, deliberately separated from the AGA codebase to demonstrate that verification requires no trust in the system itself. The verifier needs only the bundle and a public key.
What AGA does not require:
No trusted execution environment. No specialized hardware. No zero-knowledge proof circuits. That's a deliberate design choice, not a limitation. TEE-dependent architectures inherit the supply chain risks and side-channel vulnerabilities of their hardware. Intel SGX attestation has been broken repeatedly. AMD SEV has its own history of key extraction attacks. Any architecture that requires specific silicon creates a procurement dependency that air-gapped and forward-deployed environments can't always satisfy. AGA relies exclusively on standard cryptographic primitives: Ed25519, SHA-256, BLAKE2b-256, HKDF-SHA256, RFC 8785 JSON Canonicalization Scheme, and Merkle trees. These run on anything with a CPU.
AGA is complementary to model-level safety measures: constitutional AI, RLHF, alignment training. It operates at the system level, governing what the agent is permitted to do with the access it has been granted, regardless of what its reasoning engine decides to attempt. It doesn't prevent prompt injection. It contains the consequences. Prevention is a model-level problem. Containment and forensic capture are system-level problems, and the system level is where enforcement has to live.
Governance Enforcement for MCP
The Portal architecture maps directly onto MCP as a governance enforcement layer: intercepting tool invocations, verifying each call against the sealed Policy Artifact's enforcement parameters, and generating signed Enforcement Receipts for every interaction. We have described this integration in public federal submissions and have built a working MCP server demonstrating it.
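The interception pattern can be sketched as a gate in front of every tool invocation. This is a hypothetical shape, not the MCP API or the AGA implementation: HMAC again stands in for Ed25519, and the policy field `allowed_tools` is an assumption for illustration.

```python
import hashlib
import hmac
import json

PORTAL_KEY = b"portal-signing-key"  # held by the enforcement boundary only


def gate_tool_call(policy: dict, call: dict) -> dict:
    """Check one tool invocation against the sealed policy and sign the outcome.

    Every call produces a signed receipt, whether it was allowed or denied, so
    the enforcement history covers attempts as well as successes.
    """
    allowed = call["tool"] in policy["allowed_tools"]
    receipt = {"tool": call["tool"], "decision": "ALLOW" if allowed else "DENY"}
    body = json.dumps(receipt, sort_keys=True).encode()
    receipt["sig"] = hmac.new(PORTAL_KEY, body, hashlib.sha256).hexdigest()
    return receipt


policy = {"allowed_tools": ["search_docs", "read_ticket"]}
assert gate_tool_call(policy, {"tool": "search_docs"})["decision"] == "ALLOW"
assert gate_tool_call(policy, {"tool": "delete_database"})["decision"] == "DENY"
```

The key point is the denied call: a conventional log might never record a tool the agent was blocked from reaching, but a signed DENY receipt is exactly the evidence needed to prove the agent never touched tools outside its boundary.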
This is what last week's dispute was missing, and what every organization deploying MCP-connected agents will need: technical proof that an AI agent operated within its authorized boundaries, verifiable by any party, without trusting either the deployer or the model provider.
The Shift from Policy to Proof
Reports that AI models were being used to process intelligence and targeting data during active military operations make the stakes concrete. The Anthropic-Pentagon dispute won't be the last of its kind. When autonomous systems operate at machine speed in consequential environments, contracts can't constrain them in real time. Any governance model built on that assumption will eventually produce the same result.
Forget which side had the better contract language. The real problem is that the industry is relying on contract language at all when cryptographic enforcement is technically feasible. That is the gap worth closing. Whether one believes the model provider or the deploying organization should set governance boundaries, both sides would benefit from a system that cryptographically proves those boundaries were respected.
AI has crossed a threshold. It no longer just generates answers. It takes actions, in environments where those actions have real consequences. The governance infrastructure hasn't kept pace. If your AI governance still lives in a PDF, what you have isn't governance. It's documentation of intent, disconnected from the system it claims to constrain. The shift from generative AI to agentic AI demands a corresponding shift from policy to proof. The cryptographic primitives to make that shift are already standardized, open, and running in production. The only question left is whether governance catches up to capability before the next dispute makes the gap impossible to ignore.
