9606
9606.ai
2026 EDITION
GOVERNED INFERENCE · FORMAL ARCHITECTURE FOR ACCOUNTABLE AI

Governed Inference: A Formal Architecture for Accountable AI in Regulated Domains

Twelve Parts, four volumes, one claim: accountable AI in regulated domains is achieved by composing the model with a deterministic substrate on both sides of inference, not by improving the model. This is the architecture, the proof, and what it requires of every party who touches it.

This page is an architecture and assurance argument. It is not a proof of legal compliance, a certification of any specific deployment, or a claim that any model's output is true. Where a claim depends on counsel validation, carrier confirmation, or customer-specific fact clearance, the full paper marks it explicitly.

SECTION 01

The Thesis

5 claim types

The bridging-object thesis: the Regulatory Interpretation (legal register) and the Governance Specification (technical register) are the same artifact, and the Custodial Architecture is the accountable human process that authors, owns, and certifies it. Constrained decoding compiles a grammar into a runtime token mask; constrained encoding renders a Governance Specification into a governed prompt. Same pattern, both sides of inference.

Two Non-Negotiables
BOUNDARY CONDITION

The Seam

The boundary between deterministic structure and probabilistic content is drawn explicitly everywhere it appears; each side of the seam is scoped to exactly what it can guarantee.

OWNERSHIP

The IP Allocation

9606 owns the encoding surface and platform; the custodial function owns certification and custody; the customer owns the Regulatory Interpretation and audit record. Stated to the byte, not left to drift.

EPISTEMIC DISCIPLINE

The Claim Hierarchy

Every claim in the underlying paper is tagged as mathematical, empirical, engineering, legal, or regulatory-posture. A claim that doesn't announce its category is the paper's own failure, not the reader's to guess.

SECTION 02

Why a Model Cannot Govern Itself

4 structural limits

A transformer is a parameterized, continuous, almost-everywhere-differentiable function from token sequences to a probability distribution — extraordinarily capable, and structurally incapable of self-certification. Four independent results establish this; none depends on the others, and none is solved by scale.

REPRESENTATIONAL LIMIT

Rank Collapse

Pure self-attention drives token representations toward a rank-one matrix at a doubly-exponential rate in depth. The residual stream and the feed-forward (FFN) layers hold that collapse off, so reliability is a contingent internal equilibrium, not a guaranteed property.

CAPABILITY VS RELIABILITY

The Reasoning Ceiling

Capability is not reliability. Reasoning is brittle under semantics-preserving reformulation in ways the model does not signal; this boundary does not move under scaling the way the capability boundary does.

EPISTEMIC LIMIT

The Causal Rung Problem

Training data sits at Pearl's Rung 1 (association). A model trained on association cannot, by that training alone, certify Rung-2/3 (intervention/counterfactual) claims — exactly the claims a deterministic governance layer needs to make.
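For readers less familiar with the ladder, the three rungs are conventionally distinguished by the query each one answers. The notation below is Pearl's standard notation, given here as a reminder rather than as a formula taken from the underlying paper.

```latex
\begin{align*}
\text{Rung 1 (association):}    \quad & P(y \mid x) \\
\text{Rung 2 (intervention):}   \quad & P(y \mid \mathrm{do}(x)) \\
\text{Rung 3 (counterfactual):} \quad & P(y_{x} \mid x', y')
\end{align*}
```

Observational training data supports estimates of the first quantity; the second and third require assumptions or structure that the data alone does not carry.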

RUNTIME LIMIT

Non-Determinism by Default

Autoregressive serving introduces batch-dependent, floating-point-order non-determinism. A bare model does not control for it, and an audit cannot replay a run unless that run is pinned explicitly.
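The floating-point-order effect can be seen in miniature in plain Python. This is an illustrative sketch, not code from the paper; in a real serving stack the same effect arises inside batched reductions rather than scalar sums.

```python
# Floating-point addition is not associative, so the order in which a
# reduction is evaluated can change the result in the last bit.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# True: the two evaluation orders produce different floats.
order_matters = left_to_right != right_to_left
```

When two candidate tokens have nearly equal scores, a last-bit difference of this kind can be enough to change which token is selected, which is why replay needs the evaluation conditions pinned.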

The model will sometimes be wrong, and has no internal signal for when. The architecture's claim has never depended on that being false.

SECTION 03

The Architecture

4 layers

Execution authority lives outside the model, in a formally verifiable Control Graph built on finite state machines and Petri nets. The model is bracketed: a deterministic encode stage assembles the governed prompt before inference; a deterministic decode stage verifies and constrains the output after it. C = M_post ∘ L_θ ∘ A_pre — and the composability theorem proves the bracketed system inherits the substrate's safety and authorization guarantees for every model placed inside it.
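The bracket C = M_post ∘ L_θ ∘ A_pre can be sketched as ordinary function composition. Everything named below (`compose`, `a_pre`, `m_post`, the two stand-in models, and the allowed label set) is hypothetical and chosen for illustration; it shows the shape of the argument, not the platform's implementation or the composability proof.

```python
from typing import Callable

ALLOWED = {"refund", "other"}  # hypothetical set of permitted outputs


def compose(a_pre: Callable[[dict], str],
            model: Callable[[str], str],
            m_post: Callable[[str], str]) -> Callable[[dict], str]:
    """Return C = M_post ∘ L_θ ∘ A_pre as a single callable."""
    def governed(trigger: dict) -> str:
        return m_post(model(a_pre(trigger)))
    return governed


def a_pre(trigger: dict) -> str:
    # Deterministic encode stage: same trigger, same prompt.
    return f"Classify as one of: other, refund.\nTicket: {trigger['text']}"


def m_post(raw: str) -> str:
    # Deterministic decode stage: accept only permitted outputs.
    label = raw.strip().lower()
    if label not in ALLOWED:
        raise ValueError(f"output rejected: {raw!r}")
    return label


def well_behaved_model(prompt: str) -> str:   # stand-in for L_θ
    return " Refund\n"


def misbehaving_model(prompt: str) -> str:    # a different L_θ
    return "Sure! I have issued the refund."
```

Whichever stand-in is placed inside `compose`, the composed system either returns a member of `ALLOWED` or raises; that invariant comes from `m_post`, not from the model.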

LAYER 01

Executable Control Graph

All permitted actions, sequencing, and compliance rules encoded as topology (FSM/Petri net), not as prompts. A policy encoded as a transition cannot be violated by the model; it can only fail to be reached.
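A minimal finite-state sketch of "policy as topology" follows. The states, actions, and the `ControlGraph` class are hypothetical examples; a production control graph would carry far more structure (and Petri-net concurrency) than this.

```python
# Hypothetical workflow: the only actions that exist are the edges listed here.
TRANSITIONS = {
    ("drafted", "submit_for_review"): "in_review",
    ("in_review", "approve"): "approved",
    ("in_review", "reject"): "drafted",
    ("approved", "send"): "sent",
}


class ControlGraph:
    def __init__(self, start: str = "drafted"):
        self.state = start

    def permitted(self) -> list:
        """Actions available from the current state."""
        return sorted(a for (s, a) in TRANSITIONS if s == self.state)

    def step(self, action: str) -> str:
        """Take a transition, or refuse if the edge does not exist."""
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise PermissionError(
                f"{action!r} is not a transition from {self.state!r}")
        self.state = TRANSITIONS[key]
        return self.state
```

A model that proposes `send` while the graph is in `drafted` cannot cause a send: there is no such edge, so the request is refused and the state is unchanged.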

LAYER 02

Model-Agnostic Inference Fabric

Generative models are interchangeable workers behind a stable control surface. Switching models is a configuration change, not a rewrite — the control graph's guarantees are proven independent of which L_θ sits inside the bracket.

LAYER 03

Constrained Encoding (A_pre)

A versioned, five-component Governance Specification (Domain Definition, Structural Schema, Relationship Map, Output Grammar, Guardrail Set) is rendered deterministically into the governed prompt before the model ever runs.
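One way to picture deterministic rendering is a fixed section order plus canonical serialization, so that equal specifications always yield byte-identical prompts. The field names follow the five components named above, but the `GovernanceSpec` class, the `render` layout, and the use of SHA-256 are assumptions made for this sketch, not the platform's actual format.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class GovernanceSpec:
    version: str
    domain_definition: str
    structural_schema: dict
    relationship_map: dict
    output_grammar: str
    guardrail_set: tuple


def canonical(value) -> str:
    # Sorted keys and fixed separators remove dict-ordering variation.
    return json.dumps(value, sort_keys=True, separators=(",", ":"))


def render(spec: GovernanceSpec, trigger: dict) -> str:
    """Render the specification and trigger in a fixed section order."""
    sections = [
        ("SPEC VERSION", spec.version),
        ("DOMAIN DEFINITION", spec.domain_definition),
        ("STRUCTURAL SCHEMA", canonical(spec.structural_schema)),
        ("RELATIONSHIP MAP", canonical(spec.relationship_map)),
        ("OUTPUT GRAMMAR", spec.output_grammar),
        ("GUARDRAIL SET", "\n".join(spec.guardrail_set)),
        ("TRIGGER", canonical(trigger)),
    ]
    return "\n\n".join(f"[{title}]\n{body}" for title, body in sections)


def prompt_digest(prompt: str) -> str:
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()
```

Because nothing in `render` depends on time, randomness, or insertion order, the digest of the governed prompt can serve as a stable identifier for "this specification version applied to this trigger".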

LAYER 04

Constrained Decoding & Verification (M_post)

The Output Grammar declared at encode time is enforced as a hard token mask where the endpoint permits and as a checked guarantee where it does not; layered content verification then runs on the committed output.
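The hard-mask case can be sketched with a toy vocabulary. The vocabulary, the grammar, and the logit values below are invented for illustration; real constrained decoders apply the same idea over a tokenizer's full vocabulary and a compiled grammar.

```python
import math

VOCAB = ["approve", "deny", "maybe", "<eos>"]

# Hypothetical Output Grammar: exactly one of approve/deny, then <eos>.
GRAMMAR = {
    "start": {"approve": "done", "deny": "done"},
    "done": {"<eos>": "end"},
}


def mask_logits(logits, state):
    """Set every token the grammar forbids in this state to -inf."""
    allowed = GRAMMAR[state]
    return [x if tok in allowed else -math.inf
            for tok, x in zip(VOCAB, logits)]


def greedy_decode(logits_per_step):
    """Pick the highest-scoring permitted token at each step."""
    state, out = "start", []
    for logits in logits_per_step:
        masked = mask_logits(logits, state)
        best = max(range(len(VOCAB)), key=masked.__getitem__)
        out.append(VOCAB[best])
        state = GRAMMAR[state][VOCAB[best]]
        if state == "end":
            break
    return out


# The raw model prefers "maybe" at step one; the mask makes it unreachable.
tokens = greedy_decode([[1.0, 2.0, 9.0, 0.5], [5.0, 5.0, 5.0, 0.1]])
```

Here `tokens` is `["deny", "<eos>"]`: the mask guarantees the output is well-formed, while whether "deny" is the right answer remains a content question for the verification layers.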

Finite state machines · Petri nets · Deterministic by construction · Model-agnostic · Cryptographic audit trail

SECTION 04

The Correctness Envelope

6 layers · 3 tiers

You cannot prevent a language model from being wrong. The architecture does not claim to. What it claims, and proves, is that every way a governed output can be wrong falls into exactly one of six layers, and every layer is handled by one of three treatments — ordered by strength, applied at the strongest tier available to that layer.

Tier ordering — strongest treatment available is applied per layer
Tier | Treatment | What it means | Example layers
1 | Eliminated by construction | Occurrence probability zero, not low | Safety, authorization, reproducibility
2 | Bounded and detected | Cannot be eliminated; probability is bounded and occurrence is detectable | Model-content residual (the one layer the model itself owns)
3 | Attributed | Cannot be fully eliminated or detected, but maps to a named, accountable party with an evidence trail | Legal-adequacy, certification, use-scope errors
NOTE

The lattice is revisable. An incident-review mechanism reclassifies failure modes the current enumeration hasn't yet anticipated — totality is a claim about the architecture's six-layer pipeline structure, not an assertion that every failure mode is already catalogued.

SECTION 05

The Custodial Architecture

4-party federation

Accountability requires more than an architecture — it requires an institution. The Custodial Architecture is the federated structure that authors, owns, and certifies the Governance Specification, allocating responsibility to a named party at every layer before anything goes wrong, not after.

PLATFORM

9606

Owns the encoding surface, the encoding methodology, and the platform. Guarantees faithful execution: that a correct specification is rendered correctly.

LEGAL ADEQUACY

Specialist counsel

Owns legal-adequacy: that the Governance Specification correctly states what governing law or policy requires. A professional judgment, not a technical one.

CERTIFICATION

Custodial / certification function

Owns the certification framework and the custody record — the versioned, dated proof that a given specification was reviewed and by whom.

DEPLOYMENT

Customer

Owns the Regulatory Interpretation and the audit record for their own deployment; bears use-scope responsibility for how the certified architecture is actually deployed.

A wrong governed prompt is always attributable to exactly one side of a drawn line: a correct specification rendered unfaithfully, or a specification that was itself wrong. The seam doesn't eliminate error — it makes every error locatable.
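Locating an error on one side of the seam can be sketched as a re-render-and-compare check. The function names and the use of SHA-256 are assumptions for illustration; the sketch presumes the renderer is deterministic and that the certified specification and the recorded prompt are both retained.

```python
import hashlib


def digest(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def locate_fault(certified_spec: str, recorded_prompt: str, render) -> str:
    """Given a governed prompt already judged wrong, name the side that failed.

    `render` is the (hypothetical) deterministic renderer under test.
    """
    expected = render(certified_spec)
    if digest(expected) != digest(recorded_prompt):
        # The recorded prompt is not what the certified spec renders to.
        return "rendering"
    # Rendering was faithful, so the specification itself was wrong.
    return "specification"
```

A "rendering" result points at the faithful-execution guarantee; a "specification" result points at the parties who authored and certified the specification.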
SECTION 06

Economics

25–60% cost reduction

The cost advantage and the governance are the same architectural decision, viewed twice. A system that assembles a bounded governed prompt before inference spends fewer tokens than a system asking the model to manage itself — and the saving is structural, not promotional.

STATUS QUO

The Thinking Tax

Self-managing agent stacks bill every planning, routing, and reflection step as consumed tokens, on infrastructure whose providers face a real, if non-absolute, tension between consumption revenue and consumption-reducing tooling.

GOVERNED PROMPT

The Architecture of Enough

A governed prompt gives the model exactly one bounded task with no self-management overhead. The tokens spent are the tokens the task requires, not the tokens the model's own self-management would add.

EXECUTION

Routing & Caching

Each task routes to the least expensive model that satisfies the governed requirement; a stable governance prefix makes long governed prompts cache-efficient at scale. Both savings are the same ordering decision, viewed from two angles.
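Both mechanisms reduce to small, deterministic choices. The model names, prices, and capability labels below are made up for this sketch; the point is only the selection rule and the ordering of prompt content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    cost_per_1k_tokens: float      # hypothetical prices
    capabilities: frozenset


MODELS = [
    Model("small", 0.10, frozenset({"classification"})),
    Model("medium", 0.50, frozenset({"classification", "extraction"})),
    Model("large", 3.00,
          frozenset({"classification", "extraction", "drafting"})),
]


def route(required: str) -> Model:
    """Cheapest model whose capabilities satisfy the governed requirement."""
    eligible = [m for m in MODELS if required in m.capabilities]
    if not eligible:
        raise LookupError(f"no model satisfies {required!r}")
    return min(eligible, key=lambda m: m.cost_per_1k_tokens)


def assemble(governance_prefix: str, task_context: str) -> str:
    # Stable content first, variable content last, so a prefix cache can
    # reuse the governance portion across requests.
    return governance_prefix + "\n\n" + task_context
```

For example, `route("classification")` selects `small` while `route("drafting")` selects `large`, and every prompt built by `assemble` for one specification version begins with the same bytes.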

3–20 ms · Control overhead per request
25–60% · Total inference cost reduction
60–80% · Reduction on high-volume classification

SECTION 07

Regulatory Alignment

architecture-to-requirement mapping

The architecture is not advanced as a compliance conclusion — that determination belongs to counsel and the regulator against specific facts. What it supplies, by construction, is the set of properties regulated environments are increasingly built to demand.

Architecture supplies the property; counsel determines compliance.
Regulatory requirement | What the architecture supplies
Reproducibility | The same trigger and configuration version yield the same governed prompt, every time
Auditability | Every inference event is bound to a versioned governance state and a provenance-tagged context package
Pre-execution control | Compliance is enforced before the model acts, not checked after
Data residency | The governed prompt is assembled under a customer-selected data-boundary mode (Strong / Moderate / Weak), enforced as a constraint on assembly, not a policy

Regulatory frameworks are stated at a level of generality that does not depend on specific provisions; verify current authority with qualified counsel before any compliance representation.
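The auditability row in the table above can be illustrated with a hash-chained event log. The record fields and the chaining scheme are assumptions chosen for this sketch, not the platform's actual audit format; it shows only how binding each event to a governance version makes later tampering detectable.

```python
import hashlib
import json

GENESIS = "0" * 64


def _hash(body: dict) -> str:
    return hashlib.sha256(
        json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def record_event(chain: list, spec_version: str,
                 prompt: str, output: str) -> list:
    """Append an entry bound to the spec version and the previous entry."""
    body = {
        "prev_hash": chain[-1]["entry_hash"] if chain else GENESIS,
        "spec_version": spec_version,
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "output_sha256": hashlib.sha256(output.encode("utf-8")).hexdigest(),
    }
    chain.append({**body, "entry_hash": _hash(body)})
    return chain


def verify(chain: list) -> bool:
    """Recompute every link; any edited field breaks the chain."""
    prev = GENESIS
    for entry in chain:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if entry["prev_hash"] != prev or _hash(body) != entry["entry_hash"]:
            return False
        prev = entry["entry_hash"]
    return True
```

Changing the recorded `spec_version` of any past entry causes `verify` to return `False`, which is the property an examiner would rely on when replaying a record.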

SECTION 08

Reference Deployments

2 domains

An architecture is validated by contact with a real regulated domain, not by internal coherence. Two reference engagements supply that contact, presented at the conservative register their actual stage warrants.

PUBLIC-INTEREST DOMAIN

GetWater — Water Rights Verification

Applies the architecture to verifying water deliveries under Utah water-rights law, a domain that is simultaneously legal, physical, and auditable, with no human in the loop to absorb model uncertainty. Tests source-corpus legal adequacy and the legal-to-deterministic conversion under public-interest stakes.

REGULATED COMMUNICATIONS

Registered Investment Adviser Use Case

Encodes governing compliance obligations (Investment Advisers Act, SEC marketing-rule requirements, firm policy) for a regulated-communications workflow, producing a customer-specific audit package reviewable by counsel, a board, or an examiner.

Deployment status and customer relationships are stated at the stage represented in current internal materials; specific facts are described at a level of generality that does not depend on publication clearance.

SECTION 09

Strategic Position & Backing

unoccupied middle layer

The AI infrastructure market has three layers — model providers, application platforms, and the largely unoccupied middle: reasoning and orchestration infrastructure between the two. 9606 occupies that layer, defined by pre-inference specification, a deterministic execution boundary, constrained output, verification, audit, and accountable ownership.

LEADERSHIP
David Robinson
Chief Executive Officer (CEO)

Burke Powers
Chief Technology Officer (CTO)

Ryan Shepherd
Chief Operating Officer (COO)