Governance at execution time vs. catalog time

The catalog-execution gap

A data catalog says this dataset contains PII. A column is tagged as "confidential." An ownership record identifies the data steward responsible for access decisions. A classification policy requires audit logging for every query against sensitive tables. These are governance policies, and in most enterprises they exist only in the catalog.

The query engine that actually executes SQL against the data has no knowledge of these policies. It receives a query, checks database-level permissions, and returns results. Whether the query came from a junior analyst, a production pipeline, or an AI agent makes no difference to the query engine. The governance policies in the catalog and the access controls at the execution layer are disconnected systems maintained by different teams.

This disconnection has always been a problem. AI agents make it a crisis. An agent can discover in the catalog that a dataset is classified as PII, then query it through a separate, ungoverned SQL connection. The catalog told the agent the data is sensitive. Nothing prevented the agent from accessing it.

The gap

Catalog: knows the policies

• PII classification tags
• Ownership + stewards
• Access requirements
• Audit requirements

Query engine: executes queries

• DB-level permissions only
• No catalog awareness
• No classification check
• No audit correlation

Between the two: no shared enforcement.

The catalog knows the policy. The query engine does not check the catalog before executing. Agents take the efficient path, which means straight through the gap.

How agents exploit governance gaps

AI agents are optimization engines. Given a goal and a set of available tools, they will find the most efficient path to results. If the most efficient path bypasses governance controls, the agent will take it. Nothing about that is malicious. Give an agent tools with inconsistent access controls and this is the predictable result.

A typical multi-tool agent deployment includes an MCP server for the data catalog, a separate MCP server for the query engine, and possibly a third for object storage. The catalog MCP server enforces its own access controls. The query engine MCP server enforces different access controls. The agent can read metadata about datasets it cannot query, and it can query datasets it cannot see metadata for. The governance model has gaps at every seam between tools.

Session-level inconsistency is another gap. An agent may check permissions at the start of a session but not on subsequent queries. It may verify that the user has access to a catalog entity without verifying that the same user has access to the underlying data through the query engine. Each of these inconsistencies opens a window in which governance policies are not enforced.
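One way to close the session-level window is to re-check authorization on every tool call rather than once at session start. The following is a minimal sketch of that idea, not a description of any particular platform's implementation; the Session class, its grants store, and call_tool are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    user: str
    # Hypothetical permission store: tool name -> set of users allowed to call it.
    grants: dict[str, set[str]] = field(default_factory=dict)

    def is_allowed(self, tool: str) -> bool:
        return self.user in self.grants.get(tool, set())


def call_tool(session: Session, tool: str, run):
    """Re-check authorization on every call instead of trusting a session-start check."""
    if not session.is_allowed(tool):
        raise PermissionError(f"{session.user} may not call {tool}")
    return run()


# Usage: access that is revoked mid-session is denied on the very next call.
session = Session("analyst-1", {"trino_query": {"analyst-1"}})
first = call_tool(session, "trino_query", lambda: "rows")

session.grants["trino_query"].discard("analyst-1")  # revoked while the session is open
try:
    call_tool(session, "trino_query", lambda: "rows")
    second = "allowed"
except PermissionError:
    second = "denied"
```

Because the check runs inside call_tool, there is no interval during which a stale session-start decision is still trusted.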


The fail-closed model

A fail-closed security model denies access when the system cannot determine whether access should be allowed. Missing credentials, expired tokens, unrecognized roles, and configuration errors all result in denial, never bypass. This is the opposite of the fail-open models common in development tools, where missing configuration defaults to full access.

Fail-closed is the only viable security posture for AI agents accessing enterprise data. An agent that receives full access by default when authentication fails is an agent that will eventually access data it should not. The probability approaches certainty as the number of agents, sessions, and configuration changes increases.

The fail-closed model extends to authorization. No persona assigned means zero tool access. The default state for a new user is no access. Access must be explicitly granted through persona configuration that maps identity provider roles to tool allow patterns. This closed-by-default posture ensures that misconfiguration results in denied access, not unauthorized access.
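The fail-closed behavior described above can be sketched as a single authorization function in which every uncertain path ends in denial. This is an illustrative sketch under stated assumptions: the PERSONA_ALLOW mapping, the shape of the token claims, and the function names are hypothetical, and shell-style matching via Python's fnmatch module stands in for whatever pattern syntax a real platform uses.

```python
from fnmatch import fnmatchcase

# Hypothetical persona configuration: identity-provider role -> tool allow patterns.
PERSONA_ALLOW = {
    "analyst": ["trino_query", "trino_describe_table", "datahub_search"],
}


def resolve_persona(claims):
    """Return the first persona matching a role in the token claims, else None."""
    if not isinstance(claims, dict):
        return None
    for role in claims.get("roles", []):
        if role in PERSONA_ALLOW:
            return role
    return None


def is_authorized(claims, tool):
    """Fail closed: missing input or an unexpected error yields a denial."""
    try:
        persona = resolve_persona(claims)
        if persona is None:
            return False  # no persona assigned means zero tool access
        return any(fnmatchcase(tool, p) for p in PERSONA_ALLOW[persona])
    except Exception:
        return False  # configuration errors deny; they never bypass
```

The only path that returns True is the one where a persona was resolved and the tool matched an allow pattern. Missing credentials, unrecognized roles, and malformed claims all fall through to False.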

The only viable posture

Fail-open (typical dev default)

• Missing credentials default to full access
• New users start with broad permissions
• Config errors degrade to permissive behavior
• Probability of breach approaches certainty at scale

Fail-closed (Plexara default)

• Missing credentials produce denial
• New users start with zero tool access
• No persona assigned means no capability
• Misconfiguration results in denied access, never unauthorized access

Fail-open defaults are catastrophic with agents at scale. Misconfiguration must result in denial, never bypass.

Persona-based tool filtering

Persona configuration maps roles from an identity provider to sets of allowed and denied tools. An analyst persona might allow all query and catalog read tools but deny write operations and administrative tools. A steward persona might allow catalog write tools but deny direct data query tools. An admin persona might allow everything.

Tool filtering serves two purposes simultaneously. The first is security: agents operating under a persona cannot invoke tools outside their allow pattern. The second is efficiency: agents only see tools they are authorized to use. An analyst agent receives a tool list that includes query and catalog tools. It does not see admin tools, write tools, or tools for services the analyst role cannot access. Fewer visible tools means fewer tokens consumed by tool descriptions in the agent context window.

The wildcard pattern syntax supports precise control. A pattern like "trino_*" allows all Trino tools. Pairing a "datahub_*" allow pattern with a "datahub_create" deny rule blocks catalog creation while allowing all other catalog operations. Because deny rules take precedence over allow rules, broad access can be granted and specific operations excluded.
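The allow/deny evaluation can be sketched in a few lines. This is a hedged illustration, not the platform's actual matcher: it assumes the wildcard syntax behaves like shell-style globbing (Python's fnmatch), and the PERSONAS structure plus the tool names trino_admin and s3_list are hypothetical.

```python
from fnmatch import fnmatchcase

# Hypothetical persona definition for illustration.
PERSONAS = {
    "analyst": {
        "allow": ["trino_*", "datahub_*"],
        "deny": ["*_write", "*_admin", "datahub_create"],
    },
}


def tool_allowed(persona, tool):
    rules = PERSONAS.get(persona)
    if rules is None:
        return False  # unknown persona: no tool access
    if any(fnmatchcase(tool, pattern) for pattern in rules["deny"]):
        return False  # deny takes precedence over any allow match
    return any(fnmatchcase(tool, pattern) for pattern in rules["allow"])


def visible_tools(persona, all_tools):
    """Filter the tool list so the agent only ever sees what it may call."""
    return [tool for tool in all_tools if tool_allowed(persona, tool)]


all_tools = [
    "trino_query", "trino_describe_table", "trino_admin",
    "datahub_search", "datahub_create", "s3_list",
]
analyst_tools = visible_tools("analyst", all_tools)
# analyst_tools == ["trino_query", "trino_describe_table", "datahub_search"]
```

The same function serves both purposes named above: it gates invocation, and it trims the tool list placed in the agent's context.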

Persona-based tool filtering

• Analyst. Allow: trino_query, trino_describe_table, datahub_search. Deny (precedence): *_write, *_admin.
• Steward. Allow: datahub_*, trino_describe_table. Deny (precedence): trino_query.
• Admin. Allow: *. Deny (precedence): (none).

Pattern-based allow/deny rules. Deny takes precedence. Agents only see tools their persona authorizes, which also reduces token overhead on every request.

Audit logging with full provenance

Every tool call is logged with the user identity, the resolved persona, the connection used, the tool name, the parameters provided, the duration, and the outcome (success or failure). This audit trail captures not just what happened but who did it, under what authority, and through which data path.
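A wrapper around tool dispatch is one way to guarantee that every call produces a record with these fields, whether it succeeds or fails. The sketch below is illustrative only: AuditRecord, audited_call, and the in-memory AUDIT_LOG list are hypothetical names, and the list stands in for a durable store.

```python
import time
from dataclasses import asdict, dataclass


@dataclass
class AuditRecord:
    user: str
    persona: str
    connection: str
    tool: str
    parameters: dict
    duration_ms: int
    outcome: str


AUDIT_LOG = []  # stand-in for a durable audit store


def audited_call(user, persona, connection, tool, parameters, handler):
    """Run a tool handler and always append an audit record, even on failure."""
    start = time.monotonic()
    outcome = "failure"
    try:
        result = handler(**parameters)
        outcome = "success"
        return result
    finally:
        duration_ms = int((time.monotonic() - start) * 1000)
        record = AuditRecord(user, persona, connection, tool,
                             parameters, duration_ms, outcome)
        AUDIT_LOG.append(asdict(record))
```

Writing the record in a finally block means a handler that raises still leaves a "failure" entry, so failed attempts are as visible as successful ones.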

The audit log is stored in PostgreSQL with configurable retention. In regulated industries, retention requirements may extend to years. The log is queryable, enabling compliance teams to answer questions like "which users accessed this dataset in the last 90 days" or "how many queries did this agent execute against PII-classified tables" without parsing application logs.
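A compliance question such as "which users accessed this dataset in the last 90 days" then becomes an ordinary SQL query. The sketch below uses Python's built-in sqlite3 module as a stand-in for PostgreSQL so that it runs anywhere. The audit_log table, its column names, and in particular the dataset column are assumptions made for illustration, not the platform's actual schema.

```python
import sqlite3
from datetime import datetime, timedelta, timezone


def iso(dt):
    # Fixed-width UTC timestamps compare correctly as plain strings.
    return dt.isoformat(timespec="seconds")


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE audit_log ("
    "ts TEXT, user_id TEXT, persona TEXT, tool TEXT, dataset TEXT, outcome TEXT)"
)

now = datetime.now(timezone.utc)
rows = [
    (iso(now - timedelta(days=5)), "alice", "analyst", "trino_query", "sales.customers", "success"),
    (iso(now - timedelta(days=200)), "bob", "analyst", "trino_query", "sales.customers", "success"),
    (iso(now - timedelta(days=10)), "carol", "steward", "datahub_search", "sales.orders", "success"),
]
conn.executemany("INSERT INTO audit_log VALUES (?, ?, ?, ?, ?, ?)", rows)


def users_who_accessed(conn, dataset, days=90):
    """Distinct users with an audit entry for the dataset inside the window."""
    cutoff = iso(datetime.now(timezone.utc) - timedelta(days=days))
    cur = conn.execute(
        "SELECT DISTINCT user_id FROM audit_log "
        "WHERE dataset = ? AND ts >= ? ORDER BY user_id",
        (dataset, cutoff),
    )
    return [row[0] for row in cur.fetchall()]


recent_users = users_who_accessed(conn, "sales.customers")
# recent_users == ["alice"]; bob's access is older than 90 days
```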

Audit logging at the platform level captures tool calls that would be invisible at the database level. A database log shows that a query was executed by a service account. The platform audit log shows which human user initiated the session, which persona they were operating under, which agent client they used, and which tool call triggered the query. This provenance chain is essential for regulatory compliance and incident investigation.

Audit trail with provenance

plexara.audit / 2026-04-21T14:03:22Z

user: "[email protected]"
persona: "analyst"
connection: "acme-warehouse-prod"
tool: "trino_query"
duration_ms: "218"
outcome: "success"

Stored in PostgreSQL, queryable, retention configurable by policy.

Platform-level audit captures the provenance chain a database log cannot: which human, which persona, which agent client, which tool call.

When governance and execution are unified

When the governance layer and the execution layer are the same platform, enforcement is inherent rather than aspirational. A PII classification tag in the catalog directly affects which personas can query the tagged dataset. An ownership change in the catalog immediately updates who can approve access. A deprecation warning in the catalog surfaces in every query response for that dataset.
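The first of these effects, a classification tag gating query access, can be sketched as a check that the execution path performs against catalog metadata before running anything. Everything named here is hypothetical: CATALOG_TAGS, TAG_POLICY, the dataset names, and the rule that only an admin persona may query PII are assumptions chosen to keep the example small.

```python
# Hypothetical catalog metadata: dataset -> classification tags.
CATALOG_TAGS = {
    "sales.customers": {"PII"},
    "sales.orders": set(),
}

# Hypothetical policy: classification tag -> personas permitted to query tagged data.
TAG_POLICY = {
    "PII": {"admin"},
}


def may_query(persona, dataset):
    tags = CATALOG_TAGS.get(dataset)
    if tags is None:
        return False  # dataset unknown to the catalog: fail closed
    # Every tag on the dataset must permit this persona; unknown tags permit no one.
    return all(persona in TAG_POLICY.get(tag, set()) for tag in tags)


def execute_query(persona, dataset, run_sql):
    """Consult the catalog at execution time, then run the query."""
    if not may_query(persona, dataset):
        raise PermissionError(f"persona {persona!r} may not query {dataset}")
    return run_sql()


before = may_query("analyst", "sales.orders")   # untagged, so allowed
CATALOG_TAGS["sales.orders"].add("PII")         # a steward tags the dataset
after = may_query("analyst", "sales.orders")    # denied on the next call
```

Because the check reads the catalog on each call, a tag change takes effect on the next query with no synchronization step between systems.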

Session-aware workflow enforcement adds a behavioral layer to governance. The platform tracks whether an agent called discovery tools before query tools, whether it checked for curated queries before writing SQL, and whether it verified data quality signals before returning results. Agents that skip required steps receive escalating warnings. This is not hard blocking but guided compliance that improves agent behavior over time.
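The "escalating warnings, not hard blocking" behavior can be sketched as a per-session tracker that observes tool calls and returns advisory text. This is a minimal sketch under assumptions: which tools count as discovery tools, the three warning levels, and the WorkflowTracker name are all hypothetical, and only the discovery-before-query rule is modeled.

```python
from dataclasses import dataclass

# Assumed grouping of tools, for illustration only.
DISCOVERY_TOOLS = {"datahub_search", "trino_describe_table"}
QUERY_TOOLS = {"trino_query"}


@dataclass
class WorkflowTracker:
    discovered: bool = False
    skipped: int = 0

    def observe(self, tool):
        """Return a warning if a query skipped discovery, else None. Never blocks."""
        if tool in DISCOVERY_TOOLS:
            self.discovered = True
            return None
        if tool in QUERY_TOOLS and not self.discovered:
            self.skipped += 1
            if self.skipped == 1:
                level = "notice"
            elif self.skipped == 2:
                level = "warning"
            else:
                level = "strong warning"
            return f"{level}: query issued before discovery (count={self.skipped})"
        return None


tracker = WorkflowTracker()
w1 = tracker.observe("trino_query")      # first skip: notice
w2 = tracker.observe("trino_query")      # second skip: warning
tracker.observe("datahub_search")        # discovery satisfies the requirement
w3 = tracker.observe("trino_query")      # no warning once discovery has happened
```

The warning would be attached to the tool response rather than replacing it, which is what makes this guidance instead of enforcement.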

The alternative, governance in one system and execution in another, requires continuous synchronization between the systems, agreement on identity representation across systems, and monitoring to detect when policies are not being enforced. This synchronization overhead is the hidden cost of the multi-vendor data stack. Unifying governance and execution eliminates it.

