What you will take away from this lesson
In 209 - MCP Resources: templates and examples, we closed out the three MCP primitives. Every subsystem in a Plexara MCP has now had its own lesson. This final 200-series lesson is the capstone: a single realistic question walked end to end, with a callout for every subsystem that fires along the way, plus an appendix of every tool a Plexara MCP exposes for when you need it.
Nothing here is new. The point is to see the pieces working together in one session rather than one at a time. If a step surprises you, the lesson that introduced that piece is linked right on the step.
Learning Objectives
- 01 Walk a realistic Plexara question end to end, naming which subsystem fires at each turn.
- 02 Map every piece of the 200 series back to where it showed up in a real session: platform_info, DataHub discovery, Trino execution, semantic enrichment, assets, memory, knowledge, governance, and the MCP primitives (tools, prompts, resources).
- 03 Pick from three styles of prompting (high-level, prompt-by-name, tool-by-name) and know when each is appropriate.
- 04 Use the tool directory at the end as the reference it is: every tool the Plexara MCP exposes, grouped by toolkit, with badges that flag writes, admin-only operations, and the mandatory first call.
- 05 Know what to do next: revisit any lesson whose subsystem surprised you during the trace, or start applying the curriculum on your own deployment.
Where we are in the curriculum
If any term in this lesson feels unfamiliar, the 100 series is one click back. The 200 series assumes that mental model.
100 Series: the foundation
- 101 What is a Large Language Model? Brilliant at language, blind about your data. Tokens, hallucination, the grounding problem.
- 102 Tokens and your budget: Subscription-plan economics, session limits, and Plexara enrichment dedup.
- 103 Context, compression, and memory: The keep / compress / clear playbook and how memory carries across sessions.
- 104 Frontier models, specialized models, and why enterprise AI uses both: Three knowledge sources (training, web search, tools). MCP as the exposure protocol.
- 105 What is an AI agent? The think/call-tool/observe loop. Professor's knowledge, child's literalism.
- 110 Is MCP just an API wrapper? MCP as an application layer. Spectrum from thin wrapper to full application server.
Every row above is a direct link.
A single realistic question
Every lesson in the 200 series has described one subsystem. That is a sensible way to teach, but it is not how the subsystems show up in a real session. This capstone picks one ordinary analyst question and watches the subsystems work together turn by turn.
Turn by turn: what happens inside Plexara
The ten steps below walk through what happens between the moment the user hits enter and the moment a dashboard link comes back. Governance, platform_info, discovery, curated queries, Trino, enrichment, memory, assets, resources, and knowledge capture are all present in this trace.
The question names ACME Corp, which is the two-word entity cue that tells the agent this particular MCP is the right place to look. Without it, the agent might produce a generic answer about retail sales at large.
The agent sees the Plexara MCP in its tool list. Every other tool is refused until platform_info has been invoked, so the agent calls it first. The response loads the ACME operating manual: the mandatory discover-query-enrich workflow, the three-catalog data estate, the OpenSearch raw_query rule, and the prompt library.
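The gating behavior described in this step can be sketched in a few lines. This is a minimal illustration, not Plexara's implementation: the SessionGate class, the SessionGateError exception, and the session identifiers are hypothetical names introduced here. Only the rule itself (every tool is refused until platform_info has been invoked in the current session) comes from the lesson.

```python
class SessionGateError(RuntimeError):
    """Raised when a tool is called before platform_info in a session."""


class SessionGate:
    """Hypothetical gate that tracks which sessions have called platform_info."""

    def __init__(self):
        self._unlocked = set()

    def check(self, session_id: str, tool_name: str) -> None:
        # platform_info is always allowed and unlocks the session.
        if tool_name == "platform_info":
            self._unlocked.add(session_id)
            return
        # Every other tool is refused until the session is unlocked.
        if session_id not in self._unlocked:
            raise SessionGateError(f"{tool_name} refused: call platform_info first")


gate = SessionGate()

try:
    gate.check("s1", "datahub_search")  # too early: the session is still locked
    refused = False
except SessionGateError:
    refused = True

gate.check("s1", "platform_info")   # unlocks session s1
gate.check("s1", "datahub_search")  # now allowed, no exception raised
```

The gate is keyed by session, so unlocking one session says nothing about any other. That matches the lesson's wording that the call is required in the current session.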
The operating manual is explicit: discover before querying. The agent calls datahub_search('Q3 sales by region') and gets back catalog hits with descriptions, owners, tags, glossary terms, and lineage attached automatically. The top hit is os_acme_transactions.
A datahub_get_queries call on the matched dataset returns a curated raw_query template for monthly revenue by region, pre-benchmarked. The agent uses it as the basis for the Q3 query rather than authoring from scratch.
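The discover-then-reuse pattern in the last two steps can be sketched as follows. The tool names datahub_search and datahub_get_queries, the dataset name, and the regional_revenue template name come from this lesson. Everything else is an assumption made for illustration: the call_tool callable, the argument keys ("query", "urn"), the response shapes, and the placeholder URN string.

```python
def plan_query(call_tool, question):
    """Discover first, then prefer a curated template over free-form authoring."""
    hits = call_tool("datahub_search", {"query": question})
    if not hits:
        return None  # nothing discovered, so there is nothing to query yet
    top = hits[0]
    templates = call_tool("datahub_get_queries", {"urn": top["urn"]})
    return {
        "dataset": top["name"],
        # None signals that the agent would have to author a query itself.
        "template": templates[0] if templates else None,
    }


# Canned responses standing in for a live deployment (illustrative only).
def fake_call_tool(name, args):
    if name == "datahub_search":
        return [{"urn": "urn:example:os_acme_transactions",  # placeholder URN
                 "name": "os_acme_transactions"}]
    if name == "datahub_get_queries":
        return [{"name": "regional_revenue", "kind": "raw_query"}]
    raise ValueError(f"unexpected tool: {name}")


plan = plan_query(fake_call_tool, "Q3 sales by region")
print(plan["dataset"], plan["template"]["name"])
```

Passing the tool caller in as a parameter keeps the sketch testable without a real MCP connection.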
Because the aggregation runs against an OpenSearch index, the agent uses opensearch.system.raw_query with native OpenSearch Query DSL rather than standard SQL. The result returns in roughly a second. The query is subject to the analyst's row-limit and timeout caps and stays comfortably under both.
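To make "native Query DSL inside raw_query" concrete, here is a hedged sketch of what such a request might look like. The index name os_acme_transactions and the column amt appear in this lesson. The fields region and order_date, the schema name default, and the calendar-quarter date range are assumptions, and the real curated template may differ. The table-function call follows the general shape Trino documents for raw_query; check it against your deployment.

```python
import json

# Hypothetical field names; the real index mapping may differ.
q3_by_region = {
    "size": 0,  # aggregation only, return no raw documents
    "query": {
        # Assumes calendar Q3 2025 (July through September).
        "range": {"order_date": {"gte": "2025-07-01", "lt": "2025-10-01"}}
    },
    "aggs": {
        "by_region": {
            "terms": {"field": "region"},
            "aggs": {"revenue": {"sum": {"field": "amt"}}},
        }
    },
}

# json.dumps emits double quotes only, so the DSL can sit inside a
# single-quoted SQL string literal without extra escaping.
dsl = json.dumps(q3_by_region)

sql = (
    "SELECT * FROM TABLE(opensearch.system.raw_query("
    "schema => 'default', "
    "index => 'os_acme_transactions', "
    f"query => '{dsl}'))"
)
print(sql)
```

Setting size to 0 is what keeps the response small: only the per-region buckets come back, not the matching transactions.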
Step 06: Enrichment attached automatically; dedup kicks in (semantic enrichment with dedup; token economics of dedup: 102)
The table metadata comes back attached to the result (descriptions, tags, glossary bindings). When the agent follows up with a Q2 query a moment later, dedup sends only the incremental context rather than re-enriching the same table.
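One way to picture the dedup behavior is a per-session record of what has already been sent for each table. This is a sketch under stated assumptions, not the platform's mechanism: the EnrichmentDedup class and the metadata keys and values are invented for illustration.

```python
class EnrichmentDedup:
    """Hypothetical tracker that sends only metadata a session has not seen yet."""

    def __init__(self):
        self._sent = {}  # (session_id, table) -> metadata already delivered

    def attach(self, session_id, table, metadata):
        seen = self._sent.setdefault((session_id, table), {})
        # Keep only the fields that are new or changed since the last delivery.
        delta = {k: v for k, v in metadata.items() if seen.get(k) != v}
        seen.update(delta)
        return delta


meta = {
    "description": "ACME transactions",  # illustrative values
    "tags": ["sales"],
}

dedup = EnrichmentDedup()
first = dedup.attach("s1", "os_acme_transactions", meta)   # Q3 query: full metadata
second = dedup.attach("s1", "os_acme_transactions", meta)  # Q2 query: nothing new
print(len(first), len(second))
```

The second call returns an empty delta, which is the token saving: the Q2 follow-up does not pay again for context the model already holds.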
With Q3 and Q2 numbers in hand the agent computes the delta per region and identifies the region with the biggest change. It writes a short narrative summary for the chat but does not dump the full table there.
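The arithmetic in this step is small enough to show directly. The region names and revenue figures below are made up for illustration; they are not results from the trace.

```python
# Illustrative figures only, not real ACME data.
q2 = {"North": 120.0, "South": 95.0, "West": 140.0}
q3 = {"North": 132.0, "South": 90.0, "West": 141.0}


def quarter_over_quarter(prev, curr):
    """Return the change per region for regions present in both quarters."""
    return {region: curr[region] - prev[region] for region in curr if region in prev}


deltas = quarter_over_quarter(q2, q3)

# "Biggest change" is taken here as the largest absolute movement, so a
# steep drop counts as much as a steep rise.
biggest = max(deltas, key=lambda region: abs(deltas[region]))
print(biggest, deltas[biggest])
```

Whether "biggest change" means absolute or percentage movement is a judgment call the agent has to make or ask about; the sketch picks absolute change.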
The user asked for a dashboard, not just an answer. The agent calls save_artifact with an interactive HTML dashboard. The ACME brand theme and logo resources load automatically because the dashboard-building prompt embeds those references. The dashboard goes to the portal; the chat gets a link.
The analyst, reviewing the dashboard, mentions that they prefer fiscal-quarter comparisons for this kind of analysis. The agent calls memory_manage('remember') to save that preference as a user-scoped memory. The next relevant question from this analyst (in this session or a future one) can retrieve this via memory_recall.
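The remember-then-recall cycle can be sketched with a toy store. The MemoryStore class and its keyword-overlap matching are assumptions made for illustration; the platform's recall strategies listed in the appendix are richer than this. Only the user scoping and the Preferences dimension come from the lesson.

```python
class MemoryStore:
    """Toy user-scoped memory; recall uses naive keyword overlap."""

    def __init__(self):
        self._by_user = {}

    def remember(self, user, dimension, text):
        self._by_user.setdefault(user, []).append(
            {"dimension": dimension, "text": text}
        )

    def recall(self, user, question):
        words = set(question.lower().split())
        # A memory matches when it shares at least one word with the question.
        return [
            m for m in self._by_user.get(user, [])
            if words & set(m["text"].lower().split())
        ]


store = MemoryStore()
store.remember("analyst", "Preferences",
               "prefers fiscal-quarter comparisons for sales analysis")

matches = store.recall("analyst", "compare sales by quarter")
print(len(matches))
```

Because memories are keyed by user, a different user asking the same question gets nothing back, which mirrors the user-scoped behavior in this step.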
Early in the trace the agent noticed a column named amt that looked like cents rather than dollars. It captured that as an insight with source=agent_discovery. The insight is queued for admin review. If approved, the catalog description for that column gets updated and every future session benefits.
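The capture-then-review flow can be sketched as a queue with an administrator-only apply step. The tool names capture_insight and apply_knowledge, the source value agent_discovery, and the column amt come from this lesson. The Insight dataclass, the function signatures, the status strings, and the starting catalog description are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Insight:
    target: str
    text: str
    source: str
    status: str = "pending"  # captured insights wait for admin review


# Illustrative starting description, not the real catalog entry.
catalog = {"os_acme_transactions.amt": "Transaction amount"}
queue = []


def capture_insight(target, text, source):
    """Open to every user: records the observation, never touches the catalog."""
    insight = Insight(target, text, source)
    queue.append(insight)
    return insight


def apply_knowledge(insight, is_admin):
    """Administrator-only: writes an approved insight back to the catalog."""
    if not is_admin:
        raise PermissionError("apply path is administrator-only")
    catalog[insight.target] = insight.text
    insight.status = "applied"


note = capture_insight(
    "os_acme_transactions.amt",
    "Transaction amount; values appear to be in cents, not dollars",
    "agent_discovery",
)
before = catalog["os_acme_transactions.amt"]  # unchanged until an admin applies
apply_knowledge(note, is_admin=True)
```

Separating the two functions is the point of the design: capture is cheap and open, while the catalog only changes after a reviewed apply.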
The user only sees the answer, the dashboard link, and the brief conversation. Every step above happened in support of that single exchange.
Three styles of prompting
The trace above used a high-level prompt: the user described the goal and the agent worked out the steps. That is the right default for most work. Two other styles are worth knowing about: invoking a named prompt from the library, and naming specific tools directly. Each has a place.
- 01 High-level: describe the goal
  When: Most of the time. The agent picks the right tools, respects the operating manual, and produces a grounded answer.
  Example: "For ACME Corp, what were Q3 2025 sales by region, and which region saw the biggest change from Q2? Save the analysis as a dashboard."
- 02 Mid-level: name a prompt
  When: A repeatable workflow already exists in the prompt library. Shorter to type; the output shape is predictable.
  Example: "Run create-interactive-dashboard for Q3 sales by region." The prompt takes a topic and applies the template consistently.
- 03 Low-level: name specific tools
  When: You already know exactly what you want. Forces a specific path that the agent would otherwise have to reason its way to.
  Example: "Call datahub_get_queries for the os_acme_transactions URN. Then use opensearch raw_query to run the regional_revenue template for Q3 2025. Save the result to CSV via trino_export."
There is no “best” style. The right level is the one that matches what you already know. A new user benefits from high-level prompting and lets the agent figure it out; a power user reaches for tool-by-name only when they want to short-circuit the agent's reasoning.
Closing the loop
That is the curriculum. Ten lessons in the 100 series, ten in the 200 series, and one worked example that tied them together. The subsystems are not exotic individually; the compounding value comes from having all of them running at the same time, under governance, with memory and knowledge closing the loop each session.
Appendix: the tool directory
The rest of this lesson is reference material. Plexara groups its tools into seven toolkits. Every tool name below is the exact identifier the agent would invoke. Most tools are read-oriented and side-effect-free; writes and administrator-only operations are flagged.
You do not need to memorize this directory. The point of the trace above was that the agent chooses tools on your behalf. Scan this section when you want to know what is possible, or when you need to name a specific tool in a prompt.
- Toolkits: 7
- Tools: 29
- Write-capable: 9
- Mandatory first call: 1
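The four counts above can be cross-checked against the directory that follows. The sketch below transcribes the tool names from this appendix and tallies them. It assumes that "write-capable" covers every tool badged Write, Writes, Writes asset, or Writes catalog.

```python
# Tool names per toolkit, transcribed from the directory in this lesson.
DIRECTORY = {
    "Platform": ["platform_info", "list_connections"],
    "DataHub": [
        "datahub_search", "datahub_browse", "datahub_get_entity",
        "datahub_get_schema", "datahub_get_lineage", "datahub_get_glossary_term",
        "datahub_get_queries", "datahub_get_data_product",
        "datahub_create", "datahub_update", "datahub_delete",
    ],
    "Trino": ["trino_query", "trino_browse", "trino_describe_table",
              "trino_explain", "trino_export"],
    "S3": ["s3_list_buckets", "s3_list_objects", "s3_get_object",
           "s3_get_object_metadata", "s3_presign_url"],
    "Memory": ["memory_manage", "memory_recall"],
    "Knowledge": ["capture_insight", "apply_knowledge"],
    "Portal": ["save_artifact", "manage_artifact"],
}

# Assumption: any badge beginning with "Write" marks a write-capable tool.
WRITE_CAPABLE = {
    "datahub_create", "datahub_update", "datahub_delete", "trino_export",
    "memory_manage", "capture_insight", "apply_knowledge",
    "save_artifact", "manage_artifact",
}
MANDATORY_FIRST = {"platform_info"}

all_tools = [tool for tools in DIRECTORY.values() for tool in tools]
summary = {
    "toolkits": len(DIRECTORY),
    "tools": len(all_tools),
    "write_capable": sum(tool in WRITE_CAPABLE for tool in all_tools),
    "mandatory_first_call": sum(tool in MANDATORY_FIRST for tool in all_tools),
}
print(summary)
# {'toolkits': 7, 'tools': 29, 'write_capable': 9, 'mandatory_first_call': 1}
```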
Platform
Core platform tools: deployment info and connection listing. The platform_info tool is mandatory as the first call in every session and is backed by a runtime session gate (202).
platform_info [Mandatory first call] [Session-gated]: Returns the deployment description, tags, toolkits, feature flags, persona, portal URL, prompts library, and the agent_instructions operating manual.
Runtime-enforced by a session gate that refuses every other tool in the deployment until platform_info has been invoked in the current session.
list_connections: Lists the configured data connections across toolkits (Trino catalogs, DataHub endpoints, S3 buckets) with their name, kind, and type.
Useful when the agent needs to know which backends are reachable before planning a query path across catalogs.
DataHub
Catalog search, schema, lineage, glossary, and curated query retrieval. This is the agent's primary discovery surface (203). The write tools at the end are used inside the knowledge-apply flow (206) rather than directly by end users.
datahub_search: First call for any data question. Searches the catalog by topic, keyword, tag, column name, glossary term, or domain. Returns entity URNs, descriptions, owners, tags, lineage hints, and curated-query counts.
datahub_browse: Lists catalog contents by category (tags, domains, data products). Useful for orientation rather than targeted lookups.
datahub_get_entity: Fetches a DataHub entity by URN. Returns the full metadata record, including custom properties.
datahub_get_schema: Returns the column schema for a dataset entity. Use when the agent needs column types and descriptions before writing a query.
datahub_get_lineage: Upstream and downstream lineage for a dataset. Used to understand where data comes from and what depends on it.
datahub_get_glossary_term: Business glossary lookup. The agent uses this to disambiguate terms like "revenue," "active customer," or "net amount" against the organization's canonical definitions.
datahub_get_queries: Retrieves curated, pre-benchmarked query templates for a dataset. The fast path the operating manual tells the agent to prefer over free-form queries.
datahub_get_data_product: Returns data-product metadata (a named grouping of related datasets).
datahub_create [Write] [Admin]: Writes a new catalog entity. Not typically invoked directly by end users; used inside the apply_knowledge flow.
datahub_update [Write] [Admin]: Updates an existing catalog entity (description, tags, glossary terms, etc.). Called by apply_knowledge when applying approved insights.
datahub_delete [Write] [Admin]: Removes a catalog entity. Restricted to administrator sessions.
Trino
SQL execution, discovery, plan inspection, and large-result export against any configured Trino catalog (204). Read-only enforcement is applied at the platform layer when a connection is pinned that way (207).
trino_query [Read-only enforced]: Executes read-only SQL against any configured Trino catalog. Write statements are refused at the platform layer.
trino_browse: Lists catalogs, schemas, and tables. Used for discovery when DataHub search has not already pinpointed a dataset.
trino_describe_table: Returns the column schema for a Trino table, with an option to include sample rows.
trino_explain: Returns the Trino execution plan for a query. Used when the agent needs to reason about why a query is slow or how it will be executed.
trino_export [Writes asset]: Runs a query and writes the result to a persisted asset (CSV, JSON, or Markdown) instead of returning rows in the conversation.
Use when the result set is large enough that putting it in the context window would be wasteful.
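The idea of refusing write statements before they reach Trino can be illustrated with a deliberately conservative allowlist. This is a toy sketch, not how the platform layer enforces read-only access: it looks only at the first keyword rather than parsing SQL, and the ALLOWED_KEYWORDS set and is_read_only function are hypothetical. It errs on the side of refusing, so some harmless statements (for example, one that starts with a comment or a parenthesis) are rejected too.

```python
# Hypothetical allowlist of leading keywords treated as read-only.
ALLOWED_KEYWORDS = {"SELECT", "WITH", "SHOW", "DESCRIBE"}


def is_read_only(sql: str) -> bool:
    """Return True only when the statement's first keyword is on the allowlist."""
    tokens = sql.strip().split(None, 1)
    if not tokens:
        return False  # an empty statement is refused
    return tokens[0].upper() in ALLOWED_KEYWORDS


for statement in ("SELECT region FROM sales", "DROP TABLE sales"):
    print(statement, "->", "allowed" if is_read_only(statement) else "refused")
```

An allowlist is the safer default for a sketch like this: an unknown statement type is refused rather than waved through.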
S3
Object storage access with configured prefix ACLs and size caps. Every tool here is restricted to the paths the deployment has explicitly allowed (207).
s3_list_buckets: Enumerates buckets configured on the deployment, within the configured prefix ACLs.
s3_list_objects [Prefix-restricted]: Lists objects inside a bucket. The platform layer enforces the prefix restriction; the agent cannot browse outside allowed paths.
s3_get_object [Size-capped]: Reads object content. Subject to the configured maximum file-size limit.
s3_get_object_metadata: Reads object metadata (size, last modified, content type) without downloading the content.
s3_presign_url: Returns a time-limited signed URL for a specific object. Useful when a dashboard or report needs to link to raw data.
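The two S3 guardrails named above, prefix restriction and a size cap, can be sketched as a pair of checks run before any read. The bucket name, the allowed prefixes, the 10 MiB cap, and the function names are all hypothetical; a real deployment sets its own values in configuration.

```python
# Hypothetical configuration for illustration.
ALLOWED_PREFIXES = {"acme-data": ["exports/", "reports/2025/"]}
MAX_OBJECT_BYTES = 10 * 1024 * 1024  # 10 MiB


def is_key_allowed(bucket: str, key: str) -> bool:
    """A key is allowed only when it starts with a configured prefix for its bucket."""
    return any(key.startswith(p) for p in ALLOWED_PREFIXES.get(bucket, []))


def check_read(bucket: str, key: str, size_bytes: int) -> None:
    """Raise before any download if the read would break a guardrail."""
    if not is_key_allowed(bucket, key):
        raise PermissionError(f"{bucket}/{key} is outside the allowed prefixes")
    if size_bytes > MAX_OBJECT_BYTES:
        raise ValueError(f"{key} exceeds the {MAX_OBJECT_BYTES}-byte cap")


check_read("acme-data", "exports/q3.csv", 2048)  # passes both checks
print(is_key_allowed("acme-data", "private/payroll.csv"))  # False
```

Checking the size from metadata first means an oversized object is refused without ever being downloaded.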
Memory
Per-user memory organized along LOCOMO dimensions with four recall strategies. Memory persists across sessions for the same user and is retrieved via memory_recall when the agent decides prior-session context would help (206).
memory_manage [Writes]: Create, update, forget, or list memories for the current user. Supported commands: remember, update, forget, list, review_stale.
Memories are organized along LOCOMO dimensions: Knowledge, Events, Entities, Relationships, Preferences.
memory_recall: Retrieves memories relevant to a question using one of four strategies: entity lookup, semantic similarity, lineage-graph traversal, or an auto mode that combines all three.
The agent typically calls recall implicitly. Direct invocation is useful when debugging what the platform remembers.
Knowledge
Domain-knowledge capture and admin-reviewed catalog write-back. The capture path is open to every user; the apply path is administrator-only (206, 207).
capture_insight [Writes]: Records a domain observation for review. Supports three sources: user-provided, agent-discovered, and enrichment-gap.
Not written to the catalog automatically. Flows into the admin review pipeline described in 206.
apply_knowledge [Writes catalog] [Admin]: The administrator-facing side. Enumerates pending insights, supports bulk or per-entity review, synthesizes related insights into cohesive changes, and writes approved changes back to DataHub as a tracked, reversible changeset.
Portal (assets)
Artifact persistence and collection management (205). Dashboards, reports, charts, and exports live here as first-class, shareable objects rather than as transient chat output.
save_artifact [Writes asset]: Persists generated content (HTML, JSX, SVG, Markdown, JSON, CSV) as a named, shareable asset in the portal.
Use this path whenever the agent would otherwise return a large block of content into the conversation.
manage_artifact [Writes]: List, get, update, delete, or revert existing assets. Edits go through update rather than regenerating from scratch, which preserves provenance.
Collection management lives here too: the agent can group assets (dashboards, reports, markdown, CSVs) into sections within a named collection, and collections are shareable as a single unit.
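Why updating in place preserves provenance, while regenerating does not, is easier to see with a version history. The Artifact class below is a hypothetical sketch and says nothing about how the portal stores assets. It assumes that a revert is itself recorded as a new version rather than erasing history.

```python
class Artifact:
    """Hypothetical asset with an append-only version history."""

    def __init__(self, name: str, content: str):
        self.name = name
        self.versions = [content]

    @property
    def current(self) -> str:
        return self.versions[-1]

    def update(self, content: str) -> None:
        # Edits append a version; earlier versions stay available.
        self.versions.append(content)

    def revert(self) -> None:
        if len(self.versions) < 2:
            raise ValueError("nothing to revert to")
        # Reverting re-appends the previous content, so the history
        # also records that a revert happened.
        self.versions.append(self.versions[-2])


dashboard = Artifact("q3-sales-by-region", "<html>v1</html>")
dashboard.update("<html>v2</html>")
dashboard.revert()
print(dashboard.current, len(dashboard.versions))
```

Regenerating from scratch would create a new, unrelated asset with a history of length one; the update path keeps every step traceable under one name.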
Key terms
Three terms unique to this lesson. Most of the vocabulary across the curriculum has been covered in earlier key-terms sections; this is the short list of what the capstone adds.
- Capstone: An end-to-end walkthrough lesson. This one threads a single realistic question through every Plexara subsystem the 200 series covered, so the pieces are visible working together in context rather than in isolation.
- Prompting style: The level of specificity in a prompt. High-level prompts describe the goal; prompt-by-name prompts invoke a library prompt by identifier; tool-by-name prompts force a specific tool path. Each is right in different situations.
- Tool directory: The appendix at the end of this lesson: every tool a Plexara MCP exposes, grouped by toolkit, with badges that flag writes, admin-only operations, session-gated calls, and the single mandatory first call.
