Protocols outlast products

The protocol vs. product distinction

SQL outlasted every proprietary query language. HTTP outlasted every proprietary network protocol. SMTP outlasted every proprietary messaging system. The pattern is consistent across five decades of computing: open protocols survive, proprietary alternatives get acquired, sunset, or abandoned.

The economics drive the pattern. Protocols accumulate investment from many participants. Each implementation strengthens the ecosystem. Each integration raises the switching cost away from the protocol, not toward any single vendor. Products accumulate investment from one vendor. When that vendor changes strategy, gets acquired, or fails, every customer pays the re-platforming cost.

Enterprise technology decisions made today will be evaluated against the same pattern. The AI infrastructure being built now will either rest on durable open standards or on proprietary foundations that will require replacement within a product cycle.

Forty years of pattern

SMTP (1982)
Displaced: MCI Mail, CompuServe mail, proprietary X.400
Closed messaging systems were absorbed into an open standard no one owned.

HTTP (1989)
Displaced: Gopher, WAIS, proprietary on-line services
The web won because HTTP was implementable by anyone; nothing else was.

SQL (1986)
Displaced: QUEL, proprietary query DSLs
Forty years in and SQL is still the lingua franca of data. Each vendor's DSL is not.

MCP (2024, live now)
Displaced: Per-framework tool interfaces
Before MCP, integrating a data source with each AI framework meant a separate project. Now it is one.

Proprietary alternatives get acquired, sunset, or abandoned. Protocols accumulate investment from many participants until the cost of not supporting them exceeds the cost of supporting them.

MCP as USB-C for AI

The Model Context Protocol standardizes how AI agents interact with external tools and data sources. Before MCP, every agent framework defined its own tool interface. Integrating a data source with Claude, GPT, Gemini, and Llama required four separate implementations. MCP collapses this to one.
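The arithmetic behind "four implementations collapse to one" can be sketched in a few lines. This is a minimal illustration, not a measurement: it counts only the work on the data-source side and assumes each framework already ships its own MCP client. The function name and parameters are hypothetical.

```python
def integrations_needed(frameworks: int, data_sources: int, shared_protocol: bool) -> int:
    """Count the data-source-side implementations required.

    Without a shared protocol, every data source needs one adapter per framework.
    With a shared protocol, every data source needs exactly one server,
    assuming each framework already includes a client for that protocol.
    """
    if shared_protocol:
        return data_sources
    return frameworks * data_sources


if __name__ == "__main__":
    # One data source, four frameworks: the case described in the text.
    print(integrations_needed(4, 1, shared_protocol=False))
    print(integrations_needed(4, 1, shared_protocol=True))
    # The gap widens as data sources are added.
    print(integrations_needed(4, 10, shared_protocol=False))
    print(integrations_needed(4, 10, shared_protocol=True))
```

The per-framework count grows multiplicatively with every new data source, while the protocol count grows by one.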

The analogy to USB-C is precise. Before USB-C, every device manufacturer chose its own connector. Users accumulated drawers full of incompatible cables. USB-C did not make better cables. It made cables interchangeable. MCP does the same for AI tool integration: any MCP-compatible client can connect to any MCP server. The client and server evolve independently.
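What makes clients and servers interchangeable is a shared message format. MCP messages are JSON-RPC 2.0, and the specification defines methods such as `tools/list` and `tools/call`. The sketch below builds those two requests with the standard library only. The tool name `run_sql` and its arguments are hypothetical, and transport, initialization, and error handling are left out.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

# Monotonic request ids; JSON-RPC uses the id to match a response to its request.
_request_ids = count(1)


def mcp_request(method: str, params: Optional[Dict[str, Any]] = None) -> str:
    """Serialize a JSON-RPC 2.0 request of the kind an MCP client sends."""
    message: Dict[str, Any] = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
    }
    if params is not None:
        message["params"] = params
    return json.dumps(message)


def list_tools() -> str:
    """Ask a server which tools it offers."""
    return mcp_request("tools/list")


def call_tool(name: str, arguments: Dict[str, Any]) -> str:
    """Invoke one tool by name with JSON arguments."""
    return mcp_request("tools/call", {"name": name, "arguments": arguments})


if __name__ == "__main__":
    print(list_tools())
    # "run_sql" is a hypothetical tool name used only for illustration.
    print(call_tool("run_sql", {"query": "SELECT 1"}))
```

Any client that can emit these messages can talk to any server that understands them, which is the sense in which the two sides evolve independently.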

MCP adoption has grown from zero to tens of thousands of server implementations in under a year. Claude Desktop, Cursor, and custom agents built with SDKs in every major language all speak MCP. This adoption velocity matches historical patterns for protocols that reach critical mass. Once a protocol crosses the threshold where the cost of not supporting it exceeds the cost of supporting it, adoption accelerates and becomes self-reinforcing.

Figure: The USB-C analogy. AI clients (Claude Desktop, Cursor, a custom Python agent, a custom TypeScript agent) connect through MCP ("one protocol, any pair") to MCP servers (Plexara, GitHub, Filesystem, Slack, …thousands more). MCP does not make better tool calls; it makes them interchangeable. Client and server evolve independently.

Trino's federation model

Trino federates SQL execution across data sources that were never designed to work together. A single query can join a PostgreSQL table with an Elasticsearch index, an Iceberg lakehouse, and a Cassandra cluster. Trino does not require data to be moved, copied, or transformed. It queries data where it lives.

This federation model has a specific architectural consequence for AI agents: they do not need to know where data is physically stored. An agent writes standard SQL. Trino routes the query to the correct data source, executes it, and returns results in a uniform format. The agent interacts with one query engine regardless of how many underlying data sources exist.
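A federated query might look like the sketch below. Trino addresses tables as `catalog.schema.table`, so the only location detail the agent handles is a catalog name. The catalog, schema, table, and column names here are hypothetical, and `run_query` accepts any Python DB-API cursor rather than assuming a particular client.

```python
from typing import Any, List, Tuple


def qualified(catalog: str, schema: str, table: str) -> str:
    """Build a fully qualified Trino table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


# Hypothetical catalogs: one assumed to be backed by PostgreSQL, one by Iceberg.
CUSTOMERS = qualified("postgres_crm", "public", "customers")
ORDERS = qualified("iceberg_lake", "sales", "orders")

# One statement joins tables that live in two different systems.
FEDERATED_SQL = f"""
SELECT c.region, sum(o.amount) AS revenue
FROM {ORDERS} AS o
JOIN {CUSTOMERS} AS c ON o.customer_id = c.id
GROUP BY c.region
ORDER BY revenue DESC
"""


def run_query(cursor: Any, sql: str) -> List[Tuple]:
    """Execute SQL through a DB-API cursor and return all rows."""
    cursor.execute(sql)
    return list(cursor.fetchall())
```

In practice the cursor could come from the third-party `trino` Python package, for example `trino.dbapi.connect(host=..., port=..., user=...).cursor()`; that package and a reachable cluster are assumptions outside this sketch.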

Trino supports over 40 connectors. It is used in production at scale by organizations that process petabytes of data daily. The connector ecosystem continues to expand because Trino is an open source project with a permissive license (Apache 2.0): anyone can build a connector, and each new connector is available to every Trino user.

Figure: Federation, not migration. An agent sends standard SQL to Trino, which performs federated execution across PostgreSQL, Iceberg, Elasticsearch, Cassandra, Kafka, MySQL, Snowflake, and others (…40+). A single SQL query can join a PostgreSQL table, an Iceberg lakehouse, an Elasticsearch index, and a Cassandra cluster. Data is queried where it lives.

DataHub's metadata graph

DataHub models metadata as a graph: entities (datasets, dashboards, users, glossary terms), relationships (lineage, ownership, containment), and aspects (schema, descriptions, tags, quality signals). This graph structure captures the full context of an enterprise data estate in a way that flat catalogs cannot.

The graph model matters for AI agents because it supports traversal. An agent can start with a dataset, follow lineage to upstream sources, check quality signals on those sources, find the glossary terms that define the business concepts, and identify the data steward responsible for accuracy. This traversal provides the deep context that prevents incorrect queries.
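The traversal described above can be illustrated with a toy in-memory graph. This is not DataHub's API or entity model; the `Dataset` class, its fields, and the sample datasets are all hypothetical and exist only to show the lineage walk.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Dataset:
    """Toy stand-in for a metadata entity (hypothetical, not DataHub's schema)."""
    name: str
    owner: str
    glossary_terms: List[str] = field(default_factory=list)
    quality_ok: bool = True
    upstream: List[str] = field(default_factory=list)  # lineage edges, by name


def upstream_context(graph: Dict[str, Dataset], start: str) -> List[Dataset]:
    """Breadth-first walk over lineage: the start dataset, then everything upstream."""
    seen = {start}
    queue = deque([start])
    ordered: List[Dataset] = []
    while queue:
        current = graph[queue.popleft()]
        ordered.append(current)
        for parent in current.upstream:
            if parent not in seen:  # visit each dataset once, even in diamond lineage
                seen.add(parent)
                queue.append(parent)
    return ordered


GRAPH = {
    "orders_raw": Dataset("orders_raw", owner="ingest-team", quality_ok=False),
    "orders_clean": Dataset("orders_clean", owner="data-eng", upstream=["orders_raw"]),
    "revenue_daily": Dataset(
        "revenue_daily",
        owner="finance-analytics",
        glossary_terms=["Net Revenue"],
        upstream=["orders_clean"],
    ),
}

if __name__ == "__main__":
    for ds in upstream_context(GRAPH, "revenue_daily"):
        status = "ok" if ds.quality_ok else "FAILING"
        print(f"{ds.name}: owner={ds.owner}, quality={status}, terms={ds.glossary_terms}")
```

An agent that sees the failing quality signal two hops upstream can warn the user, or contact the owner, before it writes a query against `revenue_daily`.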

DataHub is the most widely adopted open metadata platform, with contributions from organizations across industries. Its entity model is extensible: custom entity types, structured properties, and data contracts can capture domain-specific metadata without forking the platform. Building on DataHub means building on a metadata standard, not a vendor catalog.


Why each was chosen

MCP was chosen because it is the only protocol-level standard for AI tool integration with meaningful adoption. Trino was chosen because it is the only open federated query engine that supports the breadth of data sources enterprises actually use. DataHub was chosen because it is the only open metadata platform with a graph model rich enough to support bidirectional semantic enrichment.

Each choice was evaluated against proprietary alternatives. Proprietary query engines offer deeper integration with their own storage but cannot federate across sources from other vendors. Proprietary catalogs offer polished interfaces but lock metadata into closed ecosystems. Proprietary AI frameworks offer convenience but tie agents to specific model providers.

The common thread is portability. An enterprise that builds on MCP, Trino, and DataHub can change its LLM provider without changing its data infrastructure. It can add new data sources without re-architecting its agent layer. It can replace any individual component without disrupting the others. This modularity is how enterprises survive technology transitions.
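The portability argument can be expressed as code that depends on interfaces rather than vendors. The sketch below uses `typing.Protocol` for structural typing. Every class and method name is hypothetical; the stubs stand in for an LLM provider, a query engine, and a metadata catalog, and do not represent any real SDK.

```python
from typing import List, Protocol, Tuple

Row = Tuple[str, int]


class SqlWriter(Protocol):
    """Anything that turns a question plus context into SQL (an LLM provider)."""
    def to_sql(self, question: str, context: str) -> str: ...


class QueryEngine(Protocol):
    def run(self, sql: str) -> List[Row]: ...


class Catalog(Protocol):
    def describe(self, question: str) -> str: ...


class Agent:
    """Depends only on the three interfaces above, never on a concrete vendor."""

    def __init__(self, llm: SqlWriter, engine: QueryEngine, catalog: Catalog) -> None:
        self.llm, self.engine, self.catalog = llm, engine, catalog

    def answer(self, question: str) -> List[Row]:
        context = self.catalog.describe(question)
        return self.engine.run(self.llm.to_sql(question, context))


# Hypothetical stubs used only to show that components swap independently.
class ProviderA:
    def to_sql(self, question: str, context: str) -> str:
        return f"SELECT region, count(*) FROM {context} GROUP BY region"


class ProviderB:
    def to_sql(self, question: str, context: str) -> str:
        return f"SELECT region, count(*) AS n FROM {context} GROUP BY 1"


class StaticCatalog:
    def describe(self, question: str) -> str:
        return "lake.sales.orders"


class RecordingEngine:
    def __init__(self) -> None:
        self.seen: List[str] = []

    def run(self, sql: str) -> List[Row]:
        self.seen.append(sql)
        return [("emea", 3)]
```

Replacing `ProviderA()` with `ProviderB()` when constructing `Agent` changes the generated SQL but leaves the engine and catalog objects untouched, which mirrors changing an LLM provider without changing data infrastructure.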

The tradeoff

Single-vendor stack: built on proprietary products

Deep integration inside one ecosystem, breakdown at the edges
Connector surface area controlled by one vendor
AI assistant tied to vendor compute and semantic layer
Re-platforming cost on every strategic pivot

MCP · Trino · DataHub: built on open protocols

Change LLM providers without changing data infrastructure
Add data sources by adding a Trino connector
Replace any component without disrupting the others
Switching cost is paid toward the protocol, not the vendor

What happens when you build on proprietary alternatives

Every major data warehouse vendor has shipped an AI assistant that works well within its own ecosystem. These assistants access the vendor catalog, query the vendor compute engine, and use the vendor semantic layer. The integration is seamless because everything is controlled by one company.

The problem emerges when the enterprise has data outside that ecosystem, which every enterprise does: a second warehouse, a legacy database, SaaS applications, object storage in a different cloud. The vendor assistant cannot reach this data. The enterprise now needs a second AI integration for the data outside the primary vendor, and a third for the data outside both. Each integration is proprietary, with its own configuration, its own security model, and its own limitations.

Building on protocols avoids this fragmentation. A single MCP server connected to a federated query engine accesses all data sources through one interface. The security model is unified. The metadata is centralized. The agent experience is consistent. When the enterprise adds a new data source, it adds a Trino connector. The rest of the stack is unchanged.


