Skip to main content
Product10 min read

206 - Knowledge: from memory to insights

Plexara ships session-coupled memory and a distinct insights pipeline that, once admin-reviewed, promotes observations into org-wide catalog documentation.

What you will take away from this lesson

In 205 - Assets: dashboards, reports, and data, we covered how useful session output persists as assets in the portal. This lesson is about the other side of persistence: memory and knowledge. What an individual user or agent should remember across sessions, and what the organization should know across users.

Every lesson in this series has referred back to this one. 102 mentioned memory dedup as the reason follow-up questions are cheap. 103 explained the context-is-scratch-paper problem that memory solves. 202 noted that memories carry forward between sessions. 204 pointed out that agent-written queries become curated templates through the pipeline covered here. This is the lesson that ties it all together.

Learning Objectives

  1. 01Distinguish the three layers where knowledge can live: client-side LLM memory, Plexara memory, and organization-wide knowledge in the catalog.
  2. 02Know when to ask the agent to remember something vs when to capture it as an insight, and why getting that call right matters.
  3. 03Name the four steps of the insight pipeline (Capture, Review, Synthesize, Apply) and what happens at each step.
  4. 04Recognize how agent-authored queries, user corrections, and agent-detected enrichment gaps all feed the same pipeline.
  5. 05See the knowledge flywheel: one user's correction becomes the entire organization's better answer on the next question.

Three places where knowledge can live

Frontier clients like Claude.ai and ChatGPT ship their own memory features that persist across conversations at the provider level. That memory is user-scoped, broad, and generic. It remembers preferences, not facts about your data.

Plexara implements its own memory alongside the client's. Plexara memory is scoped to the workspace and rooted in the data estate. It remembers that a specific user prefers fiscal-calendar views of sales, or that a specific agent has previously confirmed a column definition. The memories are stored as semantically searchable records and the agent_instructions guide the agent to call memory_recall when prior-session context could help the current question.

The third layer is organization-wide knowledge that lives in the catalog itself. Descriptions, tags, glossary terms, curated queries, quality flags. Applies on the first question of any session for any user. Getting a fact into this layer requires passing through an admin review pipeline, because changes here affect everyone.

Three places where knowledge can live

  • Client-side LLM memory

    Scope:
    Per user, inside one AI client (Claude.ai, ChatGPT, Gemini, and similar).
    Remembers:
    Generic preferences and long-running facts about the user that hold across conversations: preferred writing style, favorite topics, the user's role.
    Lifetime:
    Persists as long as the client retains it. User-editable in the client.

    "I prefer markdown summaries." "I work in healthcare analytics." Useful but not specific to any data estate.

  • Plexara memory

    Scope:
    Per user, scoped to a Plexara workspace and its data.
    Remembers:
    Preferences and facts tied to this deployment's data: how this user defines sales, which fiscal calendar to use, prior corrections about column meaning, which datasets the user usually works with.
    Lifetime:
    Persists across sessions inside the Plexara workspace. Stored as semantically searchable records; the agent is prompted to call memory_recall when prior-session context might help, and may also proactively search on its own without needing the user to name a specific memory.

    "This user prefers fiscal-quarter comparisons, not calendar-quarter." "This user has confirmed that product_cost_cents is in cents."

  • Organization-wide knowledge

    Scope:
    Everyone in the workspace. Written into the catalog.
    Remembers:
    Facts about the data, the business, the conventions, and the definitions that should apply for everyone. Catalog descriptions, glossary definitions, tags, curated queries, quality flags, deprecation markers.
    Lifetime:
    Permanent until updated. Applies on the first question of any session, for any user.

    "The transactions table stores amounts in cents." "Revenue means gross revenue, not net." These become catalog documentation so datahub_search surfaces them to every future agent.

The three layers do not compete. They cover different audiences and different time horizons. Getting the right layer for a given fact is what makes a Plexara workspace accumulate useful memory without becoming noise.

Memory or insight: which layer do I want?

In practice the choice is usually clear once the question is framed correctly: is this fact about me, or is this fact about the data? Personal preferences and per-user disambiguations are memory. Facts about the data or the business that anyone should benefit from are insights.

Getting this right matters for two reasons. First, it keeps memory focused so the recall stays fast and relevant. Second, it keeps insights flowing into the catalog so the organization's documentation compounds over time rather than sitting in one user's personal memory.

Memory vs insight: picking the right layer

  • A personal preference or user-specific disambiguation

    Memory

    Only relevant for this user. Other users may have different preferences. Belongs in Plexara memory; retrieved via memory_recall when the agent decides the current question benefits from prior-session context.

  • A fact about the data that every user should know

    Insight

    A column definition, a table convention, a business-rule clarification, a data-quality warning. Capture as an insight so the whole workspace benefits once the admin approves it.

  • A curated query worth saving for reuse

    Insight

    Agent-authored queries that solve a common question can be promoted into catalog-level curated templates through the same pipeline.

  • A correction to something the agent said in this session

    Insight

    If the correction is specific to one user's framing it can live in memory; if the correction exposes an organizational gap (the table description was wrong, the glossary term was misdefined) it should be captured as an insight.

The insight pipeline: Capture, Review, Synthesize, Apply

Insights do not become catalog documentation automatically, and the reason is governance. A catalog update that affects every future user deserves a human in the loop. Plexara enforces that with a four-step pipeline that every insight passes through.

The pipeline also serves an audit purpose. Every change in DataHub points back to the insights that produced it, who captured them, who reviewed them, and when they were applied. Nothing about the catalog is mystery-authored.

The insight pipeline: Capture → Review → Synthesize → Apply

  1. 01

    Capture

    User or agent, during a session

    Three sources feed capture: a user correcting the agent ("that column is gross margin, not revenue"), an agent observing a pattern ("amt values look like cents rather than dollars"), or the semantic enrichment middleware flagging a gap in catalog metadata. Each produces a structured insight tied to the relevant entity.

  2. 02

    Review

    Administrator or knowledge steward, in the portal

    Pending insights are queued for review. Nothing writes back to the catalog automatically. The reviewer sees the context the insight was captured in, the entity it applies to, and who captured it. Insights can be accepted, rejected, or edited.

  3. 03

    Synthesize

    Administrator, often with agent assistance

    Related insights about the same entity are merged into a single coherent change. Three separate captures about the same column become one clean documentation update rather than three partial ones.

  4. 04

    Apply

    Administrator, via apply_knowledge

    Approved changes are written back to DataHub as a tracked, reversible changeset with full provenance. Description edits, tag changes, glossary updates, curated query additions, deprecation markers, and more are all supported. From that moment, every subsequent datahub_search reflects the change.

Nothing writes to the catalog without administrator approval. The pipeline is auditable end to end. Every change in DataHub points back to the insights and the reviewer that produced it.

Three sources feed the pipeline

A common misconception is that the pipeline only runs when a user says "capture this." In practice three independent sources keep it fed. First, user corrections: when the agent says something wrong and the user pushes back, the correction can be captured. Second, agent-authored observations: the agent notices a pattern (amounts look like cents; a column description appears stale; an expected value is missing) and captures the observation with lower confidence for review. Third, enrichment gaps: the semantic middleware automatically notes places where catalog metadata was requested and came back empty.

All three flow into the same Capture step. The reviewer sees the source of each insight and can weigh it accordingly.

The knowledge flywheel

None of this is impressive on its own. One insight applied to one dataset is a small win. The payoff is in the compounding: week-over-week, the catalog gets better, which means datahub_search surfaces more useful context, which means the agent gets more questions right on the first try, which means more of those sessions produce new insights worth capturing.

Where this leads

Memory and knowledge persist across users and sessions. That power makes governance the load-bearing question. Who can see which data, who can approve an insight, who can write to the catalog: these are not paperwork, they determine what the agent can actually do. The next lesson covers that layer.

Key terms

Seven terms cover the vocabulary you will encounter when the subject turns to memory, insights, or catalog updates. Two tool names (capture_insight, apply_knowledge) are the ones you will see in tool-call logs most often.

Key Terms

Memory (Plexara sense)
A workspace-scoped, per-user recall layer that persists across sessions. Semantically searchable so the next session automatically retrieves relevant prior memories. Distinct from client-side LLM memory (which is provider-hosted and not connected to your data estate).
memory_recall
The tool the agent calls to surface prior memories relevant to the current question. Usually invoked automatically by the agent; rarely called by name from a prompt.
memory_manage
The tool the agent calls to create, update, or remove memory entries. 'Remember that I prefer fiscal-quarter comparisons' routes through this tool.
Insight
A structured observation, tied to a catalog entity, captured during a session. Sources include user corrections, agent-detected patterns, and automatic enrichment-gap flags. Insights do not become catalog documentation without administrator review.
capture_insight
The tool the agent calls to record an insight. The insight is tagged with its source, the entity it applies to, and the context it was captured in.
apply_knowledge
The administrator-facing tool that drives the Review, Synthesize, and Apply steps of the pipeline. Enumerates pending insights, merges related ones, and writes approved changes back to DataHub as a tracked, reversible changeset. Covered again, briefly, in the governance context of 207.
Changeset
A batch of catalog updates applied in one reviewed, reversible operation. Descriptions, tags, glossary terms, curated queries, deprecation markers, and more travel through the same changeset mechanism.