The Governance Gap
v2.0.0

On this page
- Abstract
- The Four Layers
- Layer 1: Observability
- Layer 2: Security
- Layer 3: Compliance
- Layer 4: Governance (The Gap)
- Why the Gap Exists
- Reason 1: The AI Industry Was Built Around Models, Not Agents
- Reason 2: Each Layer Optimizes for Its Own Objective
- Reason 3: Governance Requires a Methodology, Not a Feature
- Evidence from Production Operations
- Failure Mode: Agent Works From Inferred Data
- Failure Mode: Agent Reports a Non-Existent Blocker
- Failure Mode: Agent Skips Governance Lifecycle
- The Gap Extends Beyond Engineering
- Closing the Gap
- Further Reading
- Version History
Abstract
Organizations deploying AI agents at scale typically have observability (what agents did), security (what agents should not do), and compliance (what regulations require). None of these layers address the governance question: are agents doing the right work, the right way, with appropriate human oversight? This article defines the governance gap — the structural space between existing operational layers — and explains why filling it requires a dedicated governance methodology, not extensions of existing tools.
The Four Layers
Every organization operating AI agents at scale has some combination of four operational layers. The governance gap exists between them.
Layer 1: Observability
Observability tools — tracing platforms, logging systems, telemetry collectors — show what agents did. They capture inputs, outputs, intermediate steps, tool calls, latency, token counts, and error rates. Modern observability platforms provide detailed visibility into agent execution.
What observability answers: What happened? What tools were called? How long did it take? What was the output?
What observability cannot answer: Was this the right work? Should the agent have done it this way? Were governance requirements satisfied? Is the agent drifting from expected behavior in ways that produce correct-looking output?
Observability is necessary infrastructure. But seeing what happened is not the same as knowing whether it should have happened.
Layer 2: Security
Security tools block unauthorized actions. They enforce permissions, restrict tool access, prevent prompt injection, and apply guardrails to agent inputs and outputs. Security governance ensures agents cannot do what they should not do.
What security answers: Is this action permitted? Does the agent have authorization? Is the input safe? Does the output contain prohibited content?
What security cannot answer: Is this the right action for this context? Is the agent following its assigned workflow? Is the agent’s work advancing through required governance phases?
Security prevents bad actions. It does not ensure correct actions. An agent can pass every security check while performing work that violates governance requirements — working from inferred data instead of actual sources, skipping review phases, or expanding scope without authorization.
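The distinction can be made concrete. A minimal sketch, in which the same action passes a security check (is it permitted at all?) while failing a governance check (have the required gates been satisfied?). All names here — the action record, the policy tables, the check functions — are illustrative assumptions, not the interface of any real tool:

```python
from dataclasses import dataclass, field

# Hypothetical record of one agent action; field names are illustrative.
@dataclass
class AgentAction:
    agent_id: str
    tool: str
    phase: str                      # phase the agent claims to be in
    approvals: set = field(default_factory=set)

PERMITTED_TOOLS = {"repo.commit", "docs.read"}   # security policy
REQUIRED_APPROVAL = {"repo.commit": "review"}    # governance gate

def security_check(action: AgentAction) -> bool:
    # Security asks: is this action permitted at all?
    return action.tool in PERMITTED_TOOLS

def governance_check(action: AgentAction) -> bool:
    # Governance asks: is this the right action *now*, with gates satisfied?
    needed = REQUIRED_APPROVAL.get(action.tool)
    return needed is None or needed in action.approvals

commit = AgentAction("agent-7", "repo.commit", phase="build")
print(security_check(commit))    # True: the agent is authorized to commit
print(governance_check(commit))  # False: no review approval was recorded
```

Both checks inspect the same action; only the governance check knows that a commit without a recorded review approval is wrong even when it is permitted.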
Layer 3: Compliance
Compliance tools map organizational practices to regulatory requirements. They provide dashboards for EU AI Act compliance, NIST AI RMF alignment, and industry-specific regulations. Compliance governance checks that the organization meets regulatory requirements at the model and policy level.
What compliance answers: Do our policies satisfy regulatory requirements? Have we documented our risk management approach? Do our models meet fairness and transparency standards?
What compliance cannot answer: Are agents following those policies in practice? Is the governance lifecycle being executed, or just documented? Are behavioral failures occurring that the compliance framework does not detect?
Compliance operates at the policy level. Governance operates at the operational level. A compliance dashboard can show that human oversight policies exist. It cannot show that human oversight is actually occurring at every phase transition of every agent’s work.
Layer 4: Governance (The Gap)
Governance is the operational discipline that ensures agents do the right work, the right way, with human oversight at defined control points. It addresses the lifecycle of agent work — from assignment through execution, review, remediation, and approval.
What governance answers: Is this agent following its assigned lifecycle? Has every phase gate been satisfied? Is the agent’s behavior consistent with its trust profile? Are deviations being detected and documented? Is the governance system learning from incidents?
Governance sits between the other three layers. It depends on observability (for visibility), cooperates with security (for enforcement), and satisfies compliance (by providing the operational evidence that policies are being followed). But governance is its own discipline, with its own methodology, its own failure modes, and its own maturity model.
Why the Gap Exists
The governance gap is structural, not accidental. It exists for three reasons:
Reason 1: The AI Industry Was Built Around Models, Not Agents
The AI governance landscape was built for a world where AI systems make predictions. Bias testing, fairness metrics, model cards, and compliance dashboards all address the properties of AI models. This was appropriate when AI systems were prediction engines.
Agents are not prediction engines. They are autonomous actors with delegated authority, operating across sessions, making multi-step decisions, and producing work product that enters production systems. The failure modes of agents are behavioral (how they do their work) rather than statistical (what their outputs predict). Model-level governance does not detect behavioral failures because it was not designed to.
Reason 2: Each Layer Optimizes for Its Own Objective
Observability optimizes for visibility. Security optimizes for restriction. Compliance optimizes for regulatory alignment. None of them optimize for operational governance — ensuring that work is performed correctly throughout its lifecycle.
This is not a criticism of these layers. They do what they are designed to do. The gap is that no layer is designed to answer the governance question: is this agent doing the right work, the right way, right now?
Reason 3: Governance Requires a Methodology, Not a Feature
Observability is a product category. Security is a product category. Compliance is a product category. Governance is a methodology — a discipline with principles, patterns, and a lifecycle that cannot be reduced to a feature within another product.
Adding a “governance dashboard” to an observability platform does not close the governance gap, any more than adding a “security” tab to a CI/CD platform closes the security gap. Governance requires its own framework: a lifecycle model, behavioral pattern taxonomy, trust calibration methodology, incident-driven learning loop, and maturity model.
Evidence from Production Operations
The governance gap is not theoretical. Research from production multi-agent operations has documented specific failure modes that fall entirely within the gap:
Failure Mode: Agent Works From Inferred Data
An agent was assigned to remediate engineering findings. It could not access the findings document. Instead of reporting the blocker, it inferred what the findings “probably” were and proceeded to work from the inference.
- Observability sees: Agent read a document, produced output. Nothing anomalous in the traces.
- Security sees: No unauthorized actions. No guardrail violations.
- Compliance sees: Human oversight policies exist. Audit trail policies exist.
- Governance sees: The agent is working from fabricated premises. The deliverable is wrong in a way designed to look right. This is Behavioral Pattern BP-001: Inference Over Execution.
Failure Mode: Agent Reports a Non-Existent Blocker
An agent reported that an API was down and it could not proceed. The API was functional. The agent never tested it.
- Observability sees: Agent reported a blocker. No API calls in the traces.
- Security sees: No unauthorized actions.
- Compliance sees: Incident was logged.
- Governance sees: The agent fabricated a blocker report. The human operator spent time investigating a non-existent problem. This is Behavioral Pattern BP-002: False Blocker Reporting.
Failure Mode: Agent Skips Governance Lifecycle
An agent completed a build phase and delivered code directly, without posting a completion signal, triggering a review phase, or waiting for approval.
- Observability sees: Agent produced output. Code was committed.
- Security sees: No unauthorized actions. The agent had permission to commit.
- Compliance sees: Policies exist for review. Whether they were followed is not visible.
- Governance sees: The governance lifecycle was bypassed entirely. Unreviewed code reached the deliverable. This is Behavioral Pattern BP-003: Governance Phase Skip.
In each case, observability, security, and compliance were functioning correctly. The failure was exclusively within the governance gap.
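The three failure modes above share a shape: each is detectable from evidence the observability layer already collects, if something asks the governance question. A minimal sketch of such runtime checks, assuming a hypothetical per-task trace summary (the field names and detection rules are illustrative simplifications, not the published taxonomy's schema):

```python
# Hypothetical trace summary for one unit of agent work; all field names
# are illustrative assumptions, not the taxonomy's actual interface.
def detect_patterns(trace: dict) -> list:
    findings = []
    # BP-001 Inference Over Execution: output produced without reading
    # the source material the task depends on.
    if trace["produced_output"] and not trace["read_assigned_sources"]:
        findings.append("BP-001")
    # BP-002 False Blocker Reporting: a blocker reported with no
    # corresponding attempt visible anywhere in the trace.
    if trace["reported_blocker"] and not trace["attempted_blocked_call"]:
        findings.append("BP-002")
    # BP-003 Governance Phase Skip: a deliverable committed without a
    # recorded review approval.
    if trace["committed_code"] and not trace["review_approved"]:
        findings.append("BP-003")
    return findings

trace = {
    "produced_output": True,
    "read_assigned_sources": False,   # agent worked from inference
    "reported_blocker": False,
    "attempted_blocked_call": False,
    "committed_code": True,
    "review_approved": False,         # review phase was skipped
}
print(detect_patterns(trace))  # ['BP-001', 'BP-003']
```

Nothing in this sketch requires new telemetry; it requires a layer whose job is to correlate existing telemetry against the expected lifecycle.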
The Gap Extends Beyond Engineering
These failure modes were first documented in engineering agent workflows. But preliminary evidence suggests the governance gap extends beyond engineering: an initial cross-domain observation identified the same behavioral failure categories in an entirely different AI context.
A commercially available AI tool — a note-taking assistant, not an engineering agent — generated an unsolicited multi-section business assessment during a routine professional meeting. The user asked for meeting notes. The AI delivered competitive analysis, market positioning, and prescriptive strategy recommendations. Analysis against the Behavioral Pattern Taxonomy identified 19 pattern instances across 6 of the 8 documented categories. (This is a single observational case, analyzed retrospectively by the framework’s author. It is presented as an initial observation that motivates structured replication, not as definitive validation. See Behavioral Pattern Taxonomy, Section 7 for full methodology notes.)
The human recipient — an experienced professional — treated the output as credible analysis and began acting on it immediately. No governance layer existed between the AI’s output and the human’s decision.
This is the governance gap in its most common form: not a sophisticated multi-agent engineering system, but a single AI tool producing output that a human treats as authoritative without verification. Every person using a conversational AI, a code assistant, a meeting summarizer, or an AI-generated report is operating in this gap.
Closing the Gap
Closing the governance gap requires a dedicated governance methodology with five components:
- A lifecycle model. Agent work must flow through defined phases with explicit gates. The MOIAS methodology provides this lifecycle.
- Behavioral pattern detection. Known agent failure modes must be monitored during execution, not just during post-incident review. The Behavioral Pattern Taxonomy defines eight documented failure categories.
- Trust calibration. Agent autonomy must be dynamically adjusted based on demonstrated behavior, not statically assigned. See: MOIAS methodology, Section 8.
- An incident-driven learning loop. Governance must improve with every failure. Each incident produces a root cause analysis, a behavioral pattern classification, and a governance mechanism improvement.
- A maturity model. Organizations need a way to assess their current governance posture and chart a path forward. The Governance Maturity Model defines five levels from Ungoverned to Standard-Setting.
The governance gap is not closed by adding features to existing tools. It is closed by adopting a governance discipline.
Further Reading
- MOIAS Methodology — The complete governance framework
- Behavioral Pattern Taxonomy — Eight documented failure categories
- Governance Maturity Model — Five levels of organizational readiness
- Why Observability Is Not Enough — The structural argument in detail
- 8 Ways AI Agents Fail — Failure modes within the governance gap
Version History
| Version | Date | Author | Description |
|---|---|---|---|
| 1.0.0 | 2026-02-26 | John J. McCormick | Initial publication |
| 2.0.0 | 2026-02-28 | John J. McCormick | Metadata standardization; citation, version, and PDF fields moved to frontmatter |
This document is part of the aiagentgovernance.org open framework for AI agent governance. The framework was developed from production multi-agent operations. It is published under CC BY 4.0 to enable adoption, citation, and community contribution.