The catalog-execution gap
A data catalog says this dataset contains PII. A column is tagged as "confidential." An ownership record identifies the data steward responsible for access decisions. A classification policy requires audit logging for every query against sensitive tables. These are governance policies, and in most enterprises they exist only in the catalog.
The query engine that actually executes SQL against the data has no knowledge of these policies. It receives a query, checks database-level permissions, and returns results. Whether the query came from a junior analyst, a production pipeline, or an AI agent makes no difference to the query engine. The governance policies in the catalog and the access controls at the execution layer are disconnected systems maintained by different teams.
This disconnection has always been a problem. AI agents make it a crisis. An agent can discover in the catalog that a dataset is classified as PII, then query it through a separate, ungoverned SQL connection. The catalog told the agent the data is sensitive. Nothing prevented the agent from accessing it.
The gap
Catalog: knows the policies
- PII classification tags
- Ownership + stewards
- Access requirements
- Audit requirements
The gap: no shared enforcement
Query engine: executes queries
- DB-level permissions only
- No catalog awareness
- No classification check
- No audit correlation
How agents exploit governance gaps
AI agents are optimization engines. Given a goal and a set of available tools, they will find the most efficient path to results. If the most efficient path bypasses governance controls, the agent will take it. This is not malicious behavior. It is the predictable result of providing an agent with tools that have inconsistent access controls.
A typical multi-tool agent deployment includes an MCP server for the data catalog, a separate MCP server for the query engine, and possibly a third for object storage. The catalog MCP server enforces its own access controls. The query engine MCP server enforces different access controls. The agent can read metadata about datasets it cannot query, and it can query datasets it cannot see metadata for. The governance model has gaps at every seam between tools.
Inconsistent enforcement within a session is another gap. An agent may check permissions at the start of a session but not on subsequent queries. It may verify that the user has access to a catalog entity without verifying that the same user has access to the underlying data through the query engine. These inconsistencies create windows where governance policies are not enforced.
The fail-closed model
A fail-closed security model denies access when the system cannot determine whether access should be allowed. Missing credentials, expired tokens, unrecognized roles, and configuration errors all result in denial, never bypass. This is the opposite of the fail-open models common in development tools, where missing configuration defaults to full access.
Fail-closed is the only viable security posture for AI agents accessing enterprise data. An agent that receives full access by default when authentication fails is an agent that will eventually access data it should not. The probability approaches certainty as the number of agents, sessions, and configuration changes increases.
The fail-closed model extends to authorization. No persona assigned means zero tool access. The default state for a new user is no access. Access must be explicitly granted through persona configuration that maps identity provider roles to tool allow patterns. This default-deny posture ensures that misconfiguration results in denied access, not unauthorized access.
The only viable posture
Fail-open (typical dev default)
- Missing credentials default to full access
- New users start with broad permissions
- Config errors degrade to permissive behavior
- Probability of breach approaches certainty at scale
Fail-closed (Plexara default)
- Missing credentials produce denial
- New users start with zero tool access
- No persona assigned means no capability
- Misconfiguration results in denied access, never unauthorized access
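The default-deny behavior described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the platform's implementation: the token shape (a dict with `roles` and `expired`), the `ROLE_TO_PERSONA` mapping, and the `authorize` function are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical mapping from identity-provider roles to personas.
ROLE_TO_PERSONA = {"data-analyst": "analyst", "data-steward": "steward"}


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


def resolve_persona(token: Optional[dict]) -> Optional[str]:
    """Return a persona name, or None when identity cannot be established."""
    # A token with no explicit "expired" field is treated as expired.
    if not token or token.get("expired", True):
        return None
    for role in token.get("roles", []):
        if role in ROLE_TO_PERSONA:
            return ROLE_TO_PERSONA[role]
    return None  # unrecognized roles resolve to no persona


def authorize(token: Optional[dict], tool: str, allowed_tools: dict) -> Decision:
    """Grant access only when every step positively establishes it."""
    try:
        persona = resolve_persona(token)
        if persona is None:
            return Decision(False, "no persona resolved")
        if tool in allowed_tools[persona]:
            return Decision(True, f"granted via persona {persona}")
        return Decision(False, "tool not in allow list")
    except Exception as exc:
        # Configuration errors (for example a persona with no tool list)
        # end in denial rather than bypass.
        return Decision(False, f"error treated as denial: {type(exc).__name__}")
```

The design choice worth noting is that there is exactly one code path that returns an allowed decision. Every other path, including unexpected exceptions, returns a denial.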
Persona-based tool filtering
Persona configuration maps roles from an identity provider to sets of allowed and denied tools. An analyst persona might allow all query and catalog read tools but deny write operations and administrative tools. A steward persona might allow catalog write tools but deny direct data query tools. An admin persona might allow everything.
Tool filtering serves two purposes simultaneously. The first is security: agents operating under a persona cannot invoke tools outside their allow pattern. The second is efficiency: agents only see tools they are authorized to use. An analyst agent receives a tool list that includes query and catalog tools. It does not see admin tools, write tools, or tools for services the analyst role cannot access. Fewer visible tools means fewer tokens consumed by tool descriptions in the agent context window.
The wildcard pattern syntax supports precise control. A pattern like "trino_*" allows all Trino tools. Adding "datahub_create" as a deny pattern alongside a broad allow pattern such as "datahub_*" blocks catalog creation while leaving all other catalog operations available. Deny rules take precedence over allow rules, so broad access can be granted and specific operations excluded.
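Deny precedence and tool-list filtering can be illustrated with a short sketch. It assumes the wildcard syntax behaves like shell-style globbing, which Python's standard `fnmatch` module provides. The `PERSONAS` structure and the function names are hypothetical, and the tool name `s3_list` is invented for the example.

```python
from fnmatch import fnmatchcase

# Hypothetical persona definitions. Pattern names follow the examples in the text.
PERSONAS = {
    "analyst": {
        "allow": ["trino_*", "datahub_*"],
        "deny": ["*_write", "*_admin", "datahub_create"],
    },
    "admin": {"allow": ["*"], "deny": []},
}


def is_allowed(persona, tool):
    """Deny rules are checked first, so a deny match always wins."""
    rules = PERSONAS.get(persona)
    if rules is None:
        return False  # no persona assigned means no tool access
    if any(fnmatchcase(tool, pattern) for pattern in rules["deny"]):
        return False
    return any(fnmatchcase(tool, pattern) for pattern in rules["allow"])


def visible_tools(persona, all_tools):
    """Return only the tools the persona may call, preserving input order."""
    return [tool for tool in all_tools if is_allowed(persona, tool)]
```

Calling `visible_tools("analyst", ["trino_query", "datahub_create", "s3_list"])` keeps only `trino_query`: `datahub_create` matches a deny pattern and `s3_list` matches no allow pattern. The same function serves both purposes named above, since the filtered list is what the agent would be shown.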
Persona-based tool filtering
analyst
- Allow: trino_query, trino_describe_table, datahub_search
- Deny (precedence): *_write, *_admin
steward
- Allow: datahub_*, trino_describe_table
- Deny (precedence): trino_query
admin
- Allow: *
- Deny (precedence): none
Audit logging with full provenance
Every tool call is logged with the user identity, the resolved persona, the connection used, the tool name, the parameters provided, the duration, and the outcome (success or failure). This audit trail captures not just what happened but who did it, under what authority, and through which data path.
The audit log is stored in PostgreSQL with configurable retention. In regulated industries, retention requirements may extend to years. The log is queryable, enabling compliance teams to answer questions like "which users accessed this dataset in the last 90 days" or "how many queries did this agent execute against PII-classified tables" without parsing application logs.
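A compliance question such as "which users accessed this dataset in the last 90 days" reduces to a single query once the log is a table. The sketch below uses Python's built-in `sqlite3` as a stand-in for PostgreSQL so that it runs without a server. The table name, the column names (including a dedicated `dataset` column), and the sample rows are assumptions made for illustration, not the platform's actual schema.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

# In-memory SQLite stands in for PostgreSQL; the schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE audit_log (
           ts TEXT, user_id TEXT, persona TEXT, connection TEXT,
           tool TEXT, dataset TEXT, outcome TEXT)"""
)


def iso(dt):
    """Fixed-width UTC timestamps sort correctly as plain strings."""
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")


now = datetime(2026, 4, 21, tzinfo=timezone.utc)  # fixed so the example is repeatable
rows = [
    (iso(now - timedelta(days=5)), "alice", "analyst", "warehouse", "trino_query", "sales.customers", "success"),
    (iso(now - timedelta(days=40)), "bob", "analyst", "warehouse", "trino_query", "sales.customers", "success"),
    (iso(now - timedelta(days=200)), "carol", "analyst", "warehouse", "trino_query", "sales.customers", "success"),
    (iso(now - timedelta(days=3)), "dave", "analyst", "warehouse", "trino_query", "sales.orders", "success"),
]
conn.executemany("INSERT INTO audit_log VALUES (?, ?, ?, ?, ?, ?, ?)", rows)


def users_who_accessed(dataset, days, as_of):
    """Distinct users with a logged call against the dataset inside the window."""
    cutoff = iso(as_of - timedelta(days=days))
    cursor = conn.execute(
        "SELECT DISTINCT user_id FROM audit_log "
        "WHERE dataset = ? AND ts >= ? ORDER BY user_id",
        (dataset, cutoff),
    )
    return [row[0] for row in cursor.fetchall()]
```

With the sample rows, `users_who_accessed("sales.customers", 90, now)` returns alice and bob. Carol's access falls outside the window and dave queried a different dataset.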
Audit logging at the platform level captures tool calls that would be invisible at the database level. A database log shows that a query was executed by a service account. The platform audit log shows which human user initiated the session, which persona they were operating under, which agent client they used, and which tool call triggered the query. This provenance chain is essential for regulatory compliance and incident investigation.
Audit trail with provenance
plexara.audit / 2026-04-21T14:03:22Z
- user: “[email protected]”
- persona: “analyst”
- connection: “acme-warehouse-prod”
- tool: “trino_query”
- duration_ms: “218”
- outcome: “success”
stored in PostgreSQL · queryable · retention configurable by policy
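One way to guarantee that every tool call produces a record is to route calls through a single wrapper. The following is a minimal sketch with hypothetical names (`AuditRecord`, `audited_call`, and an in-memory `AUDIT_LOG` list standing in for the database table). The fields mirror the ones listed in this section.

```python
import time
from dataclasses import dataclass


@dataclass
class AuditRecord:
    user: str
    persona: str
    connection: str
    tool: str
    parameters: dict
    duration_ms: int
    outcome: str


# Stand-in for the audit table; a real deployment would write to the database.
AUDIT_LOG = []


def audited_call(user, persona, connection, tool, fn, **parameters):
    """Run fn(**parameters) and always append an audit record, even on error."""
    start = time.perf_counter()
    outcome = "failure"
    try:
        result = fn(**parameters)
        outcome = "success"
        return result
    finally:
        # The finally block runs on both return and exception, so no call
        # can complete without leaving a record behind.
        duration_ms = int((time.perf_counter() - start) * 1000)
        AUDIT_LOG.append(
            AuditRecord(user, persona, connection, tool, parameters, duration_ms, outcome)
        )
```

Because the record is written in `finally`, failed calls are logged with the same provenance as successful ones, and the original exception still propagates to the caller.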
When governance and execution are unified
When the governance layer and the execution layer are the same platform, enforcement is inherent rather than aspirational. A PII classification tag in the catalog directly affects which personas can query the tagged dataset. An ownership change in the catalog immediately updates who can approve access. A deprecation warning in the catalog surfaces in every query response for that dataset.
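As a rough sketch of what "inherent" enforcement could look like, the check below consults catalog metadata before any query runs. The `CATALOG` and `PERSONA_CLEARED_TAGS` structures, the dataset names, and the `check_query` function are hypothetical. They illustrate the idea, not the platform's data model.

```python
# Hypothetical catalog metadata: classification tags and deprecation status.
CATALOG = {
    "sales.customers": {"tags": {"PII"}, "deprecated": False},
    "sales.orders": {"tags": set(), "deprecated": True},
}

# Hypothetical mapping of personas to the classification tags they may query.
PERSONA_CLEARED_TAGS = {"analyst": set(), "steward": {"PII"}}


def check_query(persona, dataset):
    """Decide whether a persona may query a dataset, using catalog metadata."""
    entry = CATALOG.get(dataset)
    if entry is None:
        return {"allowed": False, "warnings": ["dataset not in catalog"]}
    if persona not in PERSONA_CLEARED_TAGS:
        return {"allowed": False, "warnings": ["unknown persona"]}
    uncleared = entry["tags"] - PERSONA_CLEARED_TAGS[persona]
    if uncleared:
        return {
            "allowed": False,
            "warnings": [f"persona lacks clearance for: {sorted(uncleared)}"],
        }
    # Deprecation does not block the query but is surfaced with the response.
    warnings = ["dataset is deprecated"] if entry["deprecated"] else []
    return {"allowed": True, "warnings": warnings}
```

Because the decision reads the catalog entry at call time, retagging `sales.customers` changes the outcome of the next query with no synchronization step in between.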
Session-aware workflow enforcement adds a behavioral layer to governance. The platform tracks whether an agent called discovery tools before query tools, whether it checked for curated queries before writing SQL, and whether it verified data quality signals before returning results. Agents that skip required steps receive escalating warnings. This is not hard blocking but guided compliance that improves agent behavior over time.
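The escalating-warning behavior can be sketched as a small piece of per-session state. The tool sets, the three warning levels, and the `Session` class are assumptions made for this example. The source does not specify how escalation is graded.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical classification of tools into workflow stages.
DISCOVERY_TOOLS = {"datahub_search"}
QUERY_TOOLS = {"trino_query"}


@dataclass
class Session:
    discovery_done: bool = False
    skipped_steps: int = 0

    def record(self, tool: str) -> Optional[str]:
        """Track a tool call and return a warning, or None if the order is fine.

        The call itself is never blocked; this only produces guidance.
        """
        if tool in DISCOVERY_TOOLS:
            self.discovery_done = True
            return None
        if tool in QUERY_TOOLS and not self.discovery_done:
            self.skipped_steps += 1
            if self.skipped_steps == 1:
                level = "notice"
            elif self.skipped_steps == 2:
                level = "warning"
            else:
                level = "strong warning"
            return f"{level}: call a discovery tool before querying"
        return None
```

A session that queries three times without discovery sees the message escalate from notice to warning to strong warning. Once a discovery tool has been called, later queries return no warning.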
The alternative, governance in one system and execution in another, requires continuous synchronization between the systems, agreement on identity representation across systems, and monitoring to detect when policies are not being enforced. This synchronization overhead is the hidden cost of the multi-vendor data stack. Unifying governance and execution eliminates it.
