What incumbents offer
Every major data warehouse and lakehouse vendor now has an AI assistant and MCP server support. These are deeply integrated with their own platforms. They access the vendor catalog, query the vendor compute engine, and use the vendor semantic layer. The integration is smooth because everything is controlled by one company.
Some vendors offer among the most mature managed MCP server implementations on the market. Others have comprehensive agent frameworks with supervisor agents for orchestration. At least one has signed a $200M AI partnership for model hosting. These are substantial products backed by substantial investment.
For enterprises that keep all their data within a single vendor ecosystem, these AI assistants may be sufficient. The vendor assistant knows the catalog, understands the semantic layer, and can execute queries with full context. Within its own walls, the experience is compelling.
The walled garden
Warehouse vendor assistant
Deep integration. Polished UX. Full context, but only inside.
Data that lives outside
Unreachable by the vendor assistant
- Legacy Oracle (on-prem)
- Salesforce (SaaS)
- MongoDB cluster (ops)
- S3 buckets (object storage)
- Mainframe exports (legacy)
- Marketing data lake (separate cloud)
Reaching them means a second assistant, a third assistant, and so on. Each with its own config, security model, and limits.
The lock-in calculus
AI features that deepen dependency on a single vendor create a specific kind of lock-in. When the AI assistant is the primary way analysts interact with data, switching the underlying platform means rebuilding every workflow. Semantic definitions tied to one vendor cannot be ported to another. Agent instructions tuned for one platform do not transfer.
This is not incidental. It is the business model. AI features create usage patterns that are harder to migrate than data. Data can be exported. Workflows, prompts, and institutional knowledge about how to use the AI assistant cannot. Each productive session deepens the dependency.
The timing is deliberate. Vendors are adding AI features precisely when enterprises should be preserving optionality. The AI infrastructure being built now will be evaluated against open standards within two to three years. Enterprises that build on proprietary AI assistants will face re-platforming costs when that evaluation happens.
Proprietary ecosystem: vendor AI assistant
- Semantic definitions tied to one vendor, not portable
- Agent instructions tuned to one framework
- Per-token LLM charges bundled with warehouse compute
- Re-platforming cost grows with every productive session
Plexara: open protocols + BYO model
- Federated SQL across warehouses, lakes, operational DBs
- Change model provider without changing data infrastructure
- Transparent pricing; LLM cost is between you and the model provider
- Optionality preserved for the next evaluation cycle
The multi-source reality
Most enterprises have data across three or more systems: a primary data warehouse, a legacy database that has not been migrated, SaaS applications with their own data stores, a data lake for unstructured data, and object storage for files and exports. No single vendor assistant can reach all of this.
When the vendor assistant cannot reach data outside its ecosystem, the enterprise needs a second AI integration for the external data. And a third for the data outside both. Each integration has its own configuration, its own security model, and its own limitations. The agent experience is fragmented across multiple tools that do not share context.
Building on a federated platform avoids this fragmentation. A single MCP endpoint connected to Trino queries data across all sources through standard SQL. The agent interacts with one tool regardless of where the data lives. The security model is unified. The metadata is centralized.
What enterprise data actually looks like
One MCP endpoint, reaching:
- Warehouse A
- Warehouse B
- Legacy DB
- Data lake
- Object storage
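To make the federated pattern concrete, here is a minimal sketch of a single SQL statement that joins tables living in two different systems. It relies on Trino's catalog.schema.table naming. The catalog, schema, table, and column names are hypothetical, and the `run` helper assumes any DB-API 2.0 cursor (for example, one obtained from the `trino` Python client) rather than describing how any particular product wires this up.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Table:
    """A fully qualified Trino table: catalog.schema.table."""
    catalog: str
    schema: str
    name: str

    def fqn(self) -> str:
        return f"{self.catalog}.{self.schema}.{self.name}"


# Hypothetical catalogs: each one maps to a different backing system.
ORDERS = Table("warehouse_a", "sales", "orders")    # cloud warehouse
CUSTOMERS = Table("legacy_db", "crm", "customers")  # on-prem database


def revenue_by_region_sql(orders: Table, customers: Table) -> str:
    """Build one SQL statement that joins tables from two catalogs."""
    return (
        "SELECT c.region, SUM(o.amount) AS revenue\n"
        f"FROM {orders.fqn()} AS o\n"
        f"JOIN {customers.fqn()} AS c ON o.customer_id = c.customer_id\n"
        "GROUP BY c.region\n"
        "ORDER BY revenue DESC"
    )


def run(cursor, sql: str) -> list:
    """Execute the statement on any DB-API 2.0 cursor and return all rows."""
    cursor.execute(sql)
    return cursor.fetchall()


if __name__ == "__main__":
    print(revenue_by_region_sql(ORDERS, CUSTOMERS))
```

The agent sees one statement and one endpoint. Which physical system serves each table is decided by the catalog prefix, not by the tool the agent calls.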
Cost opacity
Incumbent AI features create opaque cost layers. Per-token LLM charges are billed through the vendor at a markup. Compute credits for AI functions are priced separately from base compute. Agent pricing may be per-session, per-query, or per-user, with the specific pricing model varying by vendor and changing between billing cycles.
This cost opacity makes TCO difficult to predict. An enterprise evaluating AI data access cannot easily compare the cost of the vendor AI assistant against an alternative because the vendor pricing is bundled with other services and subject to negotiated discounts that vary by account.
A platform built on open protocols with a bring-your-own-model approach eliminates this opacity. The LLM cost is between the enterprise and the model provider. The platform cost is the platform cost. There are no hidden per-token charges, no compute credit markups, and no AI feature surcharges.
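As a rough illustration of what separated cost lines look like, the sketch below models a monthly bill as two independent items: a platform fee and an LLM charge paid directly to the model provider. Every figure and field name here is a hypothetical placeholder chosen for easy arithmetic, not a real price or a real billing schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MonthlyCost:
    platform_fee: float       # flat platform charge, USD (hypothetical)
    tokens_millions: float    # LLM tokens consumed, in millions (hypothetical)
    price_per_million: float  # rate paid directly to the model provider, USD

    @property
    def llm_cost(self) -> float:
        # Billed by the model provider, with no intermediary markup.
        return self.tokens_millions * self.price_per_million

    @property
    def total(self) -> float:
        return self.platform_fee + self.llm_cost


# Hypothetical month: 120M tokens at $3 per million, plus a $5,000 platform fee.
march = MonthlyCost(platform_fee=5000.0, tokens_millions=120.0, price_per_million=3.0)

if __name__ == "__main__":
    print(f"platform={march.platform_fee:.2f} llm={march.llm_cost:.2f} total={march.total:.2f}")
```

Because each line item has a single owner and a single rate, changing model providers changes only `price_per_million`. The platform line is unaffected.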
When incumbent AI is sufficient and when it is not
Incumbent AI assistants are sufficient when the enterprise keeps all queryable data within a single vendor platform, the vendor semantic layer covers all business metric definitions, cost opacity is acceptable, and lock-in risk is within tolerance. For many organizations, this describes their current state accurately.
Incumbent AI assistants are insufficient when data spans multiple systems (the common case), when the enterprise needs to federate queries across warehouses, lakes, and operational databases, when cost predictability matters, or when the long-term strategy includes preserving the ability to change AI model providers without changing data infrastructure.
The decision is not about which vendor has the better AI assistant. It is about whether the enterprise wants AI data access tied to a specific vendor or built on open protocols that work across all of them.
