Skip to main content
Philosophy12 min read

101 - What is a Large Language Model?

LLMs predict the next token from patterns learned across trillions of training examples. What that means, why it produces fluent reasoning, and its limits.

What you will take away from this lesson

You do not need to understand how LLMs work to use Plexara. The platform is built specifically to empower frontier models with the tools and knowledge they need to work effectively with your enterprise data, and it does most of that work on your behalf.

A working mental model still helps. This lesson gives you one that is focused entirely on what matters when you sit down with a Plexara MCP connected to Claude, Cursor, or any other MCP-compatible client.

Learning Objectives

  1. 01Describe what LLMs do well (language, reasoning, pattern-matching) and what they cannot do on their own (know your data).
  2. 02Recognize the kinds of hallucination you will encounter when an LLM is asked about your business without help.
  3. 03Understand tokens as both the unit of cost and the unit of context, and estimate how much text fits in a prompt.
  4. 04Articulate the division of labor: what the model supplies vs what Plexara supplies.
  5. 05Apply three practical prompting tactics: name the entity, ask the agent to show its work, trust catalog-backed answers more than schema-only ones.

Brilliant at language, blind about your data

The single most important thing to know about a large language model is that it is exceptional at language, reasoning, and pattern-matching, and has no idea what your company does, what your data looks like, or what your metrics mean.

Ask a frontier model a general question and it will draw on trillions of tokens of public training data to produce a coherent, useful answer. Ask the same model a question about your company and, without help, it will either refuse to commit or it will invent. The model is blind to everything specific to your organization. That blindness is not a bug in the model; it is the default condition of any LLM until you give it the means to see.

Plexara exists to give it the means. Once the model has a Plexara MCP connected, it can reach your catalog, your schemas, your metric definitions, your operational conventions, and your prior conversations. Without that connection, the brilliance of the model is pointed at nothing.

Tokens: the unit of cost and the unit of context

Every interaction with an LLM is measured in tokens. A token is a small chunk of text, usually three or four characters of English. Your pricing, your rate limits, and the amount of text a model can hold at once are all counted in tokens rather than characters or words.

Two practical consequences. First, you pay per token. Every character the model reads and every character it writes costs money, which is why pasting your whole database schema into a prompt when you only needed one table is genuinely wasteful. Second, every conversation has a ceiling. Once you hit the model's context window, older turns get dropped or summarized. The model "forgets" the beginning of the conversation. Understanding the token unit is prerequisite for understanding every other constraint in the rest of this curriculum.

What a token is, in practice

  • Input: Understanding

    Under/standing

    13 characters, 2 tokens. Longer or less common words get split into subwords.

  • Input: ·the·quick·brown·fox

    ·the/·quick/·brown/·fox

    20 characters, 4 tokens. Common words, with their leading space, usually fit into one token each.

  • Input: ·2026

    ·2026

    5 characters, 1 token. Frequent 4-digit numbers are a single token; long random numbers are not.

  • Input: ·antidisestablishmentarianism

    ·anti/dis/est/ablish/mentar/ian/ism

    29 characters, roughly 7 tokens. Rare long words fragment into many pieces.

  • Input: 🧠

    0xF0/0x9F/0xA7/0xA0

    1 visible character, typically 3 to 4 tokens. Emoji and other multi-byte characters are encoded as their underlying UTF-8 bytes, each of which can be its own token.

A token can be a fragment of one word, a whole word (usually with its leading space), a whole number, or a single byte of a multi-byte character. In specialized tokenizers, very common multi-word phrases can even be compressed into a single token. The main takeaway is that token count does not track character count or word count in any intuitive way, which is why both budget and context limits have to be estimated and measured rather than guessed.

How much fits in a token budget

Useful to calibrate against real content. A Plexara session starts by loading the platform_info payload, which is the operating manual the model needs in order to use the MCP effectively. The table below is a rule-of-thumb reference for anything else you might include in a prompt.

Tokens, to scale

10–20

A short sentence

Most chat turns start here.

~500

One page of prose

Useful rule of thumb when pasting content.

2K–10K

A typical database schema dump

Why you should not paste your whole schema.

~6K

The ACME demo platform_info payload

Loaded automatically at the start of every session.

200K–1M

Current frontier context windows

Opus 4.7 reaches 1M; most tiers are lower.

What an LLM gets wrong on your data

To see why Plexara matters, look at what a raw LLM does when asked about a real business it knows nothing about. Imagine asking a retail analyst agent a common question: "What were Q3 sales for Store 42?" The model has never seen your transactions table. It has no idea whether Store 42 exists, what the sales table is called, or which column holds the dollar amount. What it produces is the most plausible-sounding answer given the question, which is not the same thing as a correct answer.

Ask an LLM about your data without help. Here is what you get.

  • What the model says

    A confident dollar amount for "Store 42 Q3 sales."

    What is actually true

    The model has never seen your transactions table. Any specific number it produces is a guess that looks authoritative.

  • What the model says

    A reference to a `transactions` table.

    What is actually true

    Your actual table might be `os_acme_transactions` or `system_sale`. The model defaults to the most common name from its training data.

  • What the model says

    A column named `revenue`.

    What is actually true

    Your actual column might be `total_amount_cents`, storing values in cents rather than dollars. Math on the wrong column produces wrong numbers.

  • What the model says

    "Q3" interpreted as calendar Q3.

    What is actually true

    Your fiscal calendar may not align with the calendar year. The model will not ask; it will just pick one.

The failure mode has a name

The industry term is hallucination or confabulation. The name is less important than the pattern: when the model does not know, it does not abstain. It guesses confidently, and the guesses look exactly like knowledge. Every team that adopts an AI assistant hits this failure mode in their first week. Recognizing it is half the battle.

The one question that keeps you honest

A single habit is worth adopting immediately: for every specific claim the model makes about your data, ask where it came from.

What the model supplies vs what Plexara supplies

A useful way to think about a Plexara-connected session is as a division of labor between two capable partners. The model brings its general competencies. Plexara brings everything specific to your organization. Neither partner can do the job alone.

The model supplies

  • Grammar, syntax, vocabulary.
  • General reasoning and pattern recognition.
  • Common-sense world knowledge.
  • Fluency with SQL, Python, and common business vocabulary.
  • The ability to invoke tools when given ones to use.

Plexara supplies

  • Your schemas and the semantics of each column.
  • Your metric definitions, glossary terms, and business rules.
  • Your entity relationships and data lineage.
  • Your operational constraints (fiscal calendar, tenants, read-only enforcement).
  • Your memory of prior conversations and captured insights.
The division of labor. When you connect a Plexara MCP to a frontier model, you are giving the model the right-hand column. The model already has the left-hand column from its training.

Three prompting tactics that work

You do not need to master prompt engineering to get useful work out of a Plexara session. You do need three habits. Each maps to a concrete problem you will encounter the first time you use the platform.

01

Name the entity.

A named entity in the question is a direct signal to the agent that a tool call is warranted.

"How are ACME Corp sales doing?" beats "How are sales doing?"

02

Ask the agent to show its work.

Specific numbers need specific sources. An agent that cannot point to a table, a column, or a query behind its answer is probably guessing.

"Which table did you use?" "Where did that number come from?" "Show me the query."

03

Trust catalog-backed answers more than schema-only ones.

If the agent cites a DataHub description, a glossary term, or a curated query, trust it more. If it is working from column names alone, trust it less.

Where this leads

That is the foundation. The rest of the 100 series builds on it: tokens and your budget in 102, context behavior in 103, how frontier and specialized models fit together in 104, what an agent actually is in 105, and how MCP relates to the traditional API world in 110. The 200 series then covers the Plexara MCP itself in depth.

Key terms

Six terms cover almost all of the vocabulary you will encounter across this curriculum. Internalizing these now pays off immediately.

Key Terms

Token
A chunk of text, typically three to four characters in English, produced by a tokenizer that splits input into a fixed vocabulary. Every billable request is counted in tokens, and every context window is sized in tokens.
Context window
The maximum number of tokens a model can process in a single request. Everything the model considers must fit inside this budget: system prompt, prior turns, retrieved documents, the current user message, and the model's own generated output.
HallucinationConfabulation
The tendency of an LLM to produce confident but factually incorrect output. On your data this shows up as invented table names, made-up column names, and specific numbers that do not exist.
Grounding
The act of tying a model response to a specific tool call or retrieved document, so the answer can be traced back to a source. A grounded answer can be verified. An ungrounded answer is a guess.
Agent
An LLM running in a loop that can call tools, observe their results, and plan the next step. An agent is what actually uses a Plexara MCP. The next lesson on AI agents covers this in detail.
MCPModel Context Protocol
The open standard through which agents discover and invoke tools. Plexara exposes its capabilities to any MCP-compatible agent through this protocol.