
Snowflake Summit 2026 Was About Context. The Conversation Has a Blind Spot
The thing I kept hearing at Snowflake Summit 2026, in nearly every conversation, on the keynote stage and in the breakouts and at almost every booth on the show floor, was context. Catalog companies, semantic layers, observability tools, ingestion platforms, agent platforms: every category had its version of the same pitch. Snowflake itself put real engineering behind it. Cortex Sense automatically gathers context for agents before they answer questions and lifted accuracy on Snowflake's own evaluation from 24% to 83%. Horizon Catalog now treats agent identity, intent-driven governance, and data movement policies as first-class concepts. Cloud Agents is being built as the future surface for agent-to-agent composition inside the platform. The platform itself is becoming context-aware in real, well-engineered ways.
The convergence is happening for a real reason. The accuracy of an AI agent answering a business question on enterprise data depends almost entirely on what it knows about the business, the data, and the relationship between them. Without that, the agent is guessing. The vendors who landed on context as the headline at Summit, and Snowflake itself, are correct that this is the gating factor for accurate AI on enterprise data.
What I want to spend the rest of this post on is a kind of context that I think almost everyone on the floor was skipping past. The kind that ends up mattering most, and that the conversation at Summit barely touched.
The Context That Decides Whether an Agent Is Useful
The conversation at Summit focused, almost entirely, on context that can be written down: catalogs that hold metadata about your tables, semantic layers that map business terms to physical columns, glossaries that document what your metrics mean, data dictionaries that flag PII. All of this is real, valuable work.
The part I want to highlight is what most engineers I know would call tribal knowledge or, in the language some data architects use, tacit knowledge: the kind that lives in practice rather than in documents.
A few examples of what I mean:
- Why you always use customer_id_v2 instead of customer_id, even though both exist (because v1 had a join bug in 2022 that nobody fully removed and you can't trust the keys on it).
- Why marketing reports use calendar weeks starting Monday but finance uses Sunday-start (because the HR payroll system locked finance into Sunday boundaries years ago and changing it would invalidate every comparison since).
- Why nobody on the team joins the revenue table to the opportunity table directly (because deal_id was renamed to opportunity_id in 2023 in some sources but not others, so the team's workaround is to go through a bridge table that quietly fixes it).
- Why the "high-value account" definition excludes companies under 100 employees in Q2 through Q4 but includes them in Q1 (because of a campaign decision made in 2021 that became permanent through inertia, and nobody on the current team can fully explain why).
- Why the gold layer is allowed to do certain kinds of joins that the silver layer is not (because the warehouse cost model worked out that way three years ago, and rebuilding the conventions would be too disruptive).
Almost none of this is in any catalog, any semantic layer, or any wiki. It lives in the head of the senior engineer or analytics engineer who has been at the company for three or four years. When she leaves, a meaningful portion of it walks out the door with her.
And this is the context that decides whether your AI agent gives the right answer or the wrong one. The agent might find the right table, use the right named metric, and apply the right join, and the answer can still be wrong because it joined on customer_id when it should have used customer_id_v2, or used Sunday weeks when the question was about marketing performance, or hit the gold layer when the question was actually about historical trends. Tribal knowledge is what separates "technically right" from "actually useful."
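To make the week-boundary case concrete, here is a minimal Python sketch, assuming the two conventions described above (Monday-start weeks for marketing, Sunday-start weeks for finance). The function name `week_start`, the convention labels, and the sample date are illustrative assumptions, not taken from any real system.

```python
from datetime import date, timedelta

def week_start(d: date, first_day: str) -> date:
    """Return the first day of the week that contains d.

    first_day is "monday" or "sunday" (hypothetical convention labels).
    """
    if first_day == "monday":
        offset = d.weekday()            # Monday=0 ... Sunday=6
    elif first_day == "sunday":
        offset = (d.weekday() + 1) % 7  # Sunday=0 ... Saturday=6
    else:
        raise ValueError(f"unknown convention: {first_day}")
    return d - timedelta(days=offset)

order_date = date(2024, 3, 3)  # a Sunday
marketing_week = week_start(order_date, "monday")  # 2024-02-26
finance_week = week_start(order_date, "sunday")    # 2024-03-03
```

The same Sunday order lands in the week of 26 February under one convention and the week of 3 March under the other, so a weekly total is only correct relative to the team that asked for it.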
Why Context Retrieval Has a Structural Ceiling
The accuracy problem in enterprise AI is structurally different from the accuracy problem in general AI. A general-purpose agent can be wrong about facts because its training data was wrong or because it misunderstood the question. An enterprise agent has a third failure mode that often dominates the other two. It can be technically right in a way that doesn't match how your team thinks about the question.
The agent that returns last quarter's revenue from the canonical revenue table is technically right and useless if the team trusts a silver-layer view that the catalog doesn't flag as the operative one. The agent that counts customers from customer_id has run a technically valid query and is still off by twelve percent, because everyone on the team knows you have to use customer_id_v2.
This is where context retrieval as a capability has a structural ceiling. Even the best-engineered retrieval system can only find context that has been encoded: schema, lineage, metric definitions, semantic relationships. The piece that's harder to find, because most teams haven't yet codified it anywhere a system could read, is the senior engineer's mental model of why the team does things the way they do. Cortex Sense and the equivalent investments other vendors are making at this layer are doing the right job, but it is a job that lives downstream of codification.
Until tribal knowledge is captured somewhere an agent can read, the same agent will keep giving technically correct, business-wrong answers in roughly the cases where tribal knowledge would have changed the answer.
Why Tribal Knowledge in Data Engineering Is So Hard to Capture
The reason this isn't being solved well isn't that nobody has tried. Wikis exist. Catalogs exist. Semantic layers exist. The reason is that tribal knowledge in data engineering has some specific properties that make it hard to fit in any of those.
It is situational, and the situation is half of what it means. Almost every rule the team follows is a response to a specific historical event. The customer_id_v2 rule exists because of a 2022 bug. Finance's Sunday-start week exists because of an HR payroll system constraint. Without the why, the rules look arbitrary, and a wiki page that lists rules without their reasons leaves the next engineer with no way to evaluate whether they still apply.
It is contested. Tribal knowledge isn't usually one person's view. It's a set of conventions the team has converged on through arguments. Capturing it requires actually resolving those arguments, not just transcribing positions.
It evolves constantly. Every new business question, every new source, every new edge case adds to the tribal knowledge stack. A snapshot taken in March is stale by May.
It is interlinked. The silver-versus-gold layer rule depends on warehouse cost decisions made years ago. The deal_id rename rule depends on a CRM migration in 2023. Capture one rule without its dependencies and the next engineer can't actually use it.
It crosses functional boundaries. Engineering tribal knowledge often depends on business tribal knowledge: finance's decision about how to count revenue, sales's working definition of a deal, ops's view of an active customer. The engineering rule is downstream of a business rule that nobody bothered to write down either.
And it is invisible until it's wrong. People don't realize they're applying tribal knowledge until they watch someone else not apply it. A new hire asks "wait, why are we doing X?" and the senior engineer realizes there's no documented answer because there's never needed to be one.
Wikis weren't built for this. Catalogs weren't built for this. Semantic layers weren't built for this. The industry has been using tools designed for stable, codifiable, batch-reviewable knowledge to try to capture knowledge that is none of those things.
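One way to picture what the situational and interlinked properties ask of a capture format is a record that keeps each rule together with its reason and its dependencies. The following is a minimal Python sketch under that assumption; the `Convention` class, its field names, and the dependency labels are hypothetical and only reuse the examples from earlier in this post.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Convention:
    name: str
    rule: str
    reason: str                        # the "why" that a bare rule list loses
    depends_on: tuple[str, ...] = ()   # historical events or decisions behind the rule

# Hypothetical entries based on the examples above.
CONVENTIONS = [
    Convention(
        name="customer_key",
        rule="use customer_id_v2, never customer_id",
        reason="customer_id had a join bug in 2022 that was never fully removed",
        depends_on=("customer_id_join_bug_2022",),
    ),
    Convention(
        name="revenue_opportunity_bridge",
        rule="join revenue to opportunity through the bridge table",
        reason="deal_id was renamed to opportunity_id in some sources only",
        depends_on=("crm_migration_2023",),
    ),
    Convention(
        name="finance_week",
        rule="finance weeks start on Sunday",
        reason="the HR payroll system locked finance into Sunday boundaries",
        depends_on=("hr_payroll_system",),
    ),
]

def affected_by(event: str, conventions: list[Convention]) -> list[str]:
    """Names of conventions to re-review when a dependency changes."""
    return sorted(c.name for c in conventions if event in c.depends_on)
```

With the reason and dependencies stored next to the rule, a question like "the 2023 CRM migration has finally been cleaned up, which rules can we retire?" becomes a lookup, `affected_by("crm_migration_2023", CONVENTIONS)`, rather than an interview with the most senior engineer.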
Why This Is the Part the Conversation Keeps Missing
There are a few reasons the Summit conversation, and most vendor conversations more broadly, skip past tribal knowledge.
The most obvious is that it's invisible to outsiders. When you watch a senior data engineer work, you don't see the tribal knowledge in action. You see them producing answers. The judgment is bundled into the work, and only shows when someone less experienced gets the same task and produces something subtly wrong.
A second reason is that the vendor incentive is to focus on what can be productized. If you're selling a catalog, you focus on the parts of context that fit in a catalog. If you're selling a semantic layer, you focus on the parts that fit in a semantic layer. The tribal-knowledge piece of context is harder to fit in any product because it isn't really a thing you can sell. It is a quality of how the team works, codified in a format an agent can read.
A third reason is the eighty-twenty illusion. The codifiable parts of context (schema, lineage, metric definitions) are roughly eighty percent of what an agent technically needs. So the working assumption tends to be that capturing eighty percent gets you most of the accuracy. The problem is that the twenty percent that's tribal often determines whether the answer is right or wrong in the cases where it matters most. You can have eighty percent of the context and zero accuracy on the questions that depend on the tribal twenty percent.
A fourth reason is historical. The early AI pitch was that agents would figure things out from the data alone. Only recently has the industry accepted that agents need human judgment encoded somewhere they can read. Tribal knowledge has gone from being a "nice to have" to being the gating factor for production-scale agents faster than most product roadmaps have responded.
The result is that almost every vendor at Summit is focused on the eighty percent they can codify, and the twenty percent that determines whether the agent is actually useful in production is left for the customer to figure out, by hand, in a way that doesn't keep up as the team grows or the data evolves.
What Good Capture Looks Like
Capturing tribal knowledge in a way that's actually usable for an agent has a few requirements that come out of the difficulties above.
It has to happen as a side effect of the work, not as a separate documentation project. Nobody on a real data team has time to maintain a wiki of why they do things. The codification has to happen when the engineer is doing the actual work, in a way that produces a machine-readable artifact as a byproduct.
It has to be readable by every agent that operates on the team's data, not just one vendor's product. If the captured knowledge sits in a vendor-specific API that only that vendor's products can read, you've solved the problem for that tool and not for the rest of your stack. Tribal knowledge has to live somewhere general enough that any agent (Snowflake's, yours, the customer's own) can access it through composition.
It has to be enforced at the moment of use, not as a post-hoc check. The catalog that says "use customer_id_v2" doesn't help if the pipeline gets built with customer_id and then runs for a month before someone notices. The enforcement has to happen at build time, when the agent or the engineer is producing the artifact in the first place.
It has to regenerate as the team and the data evolve. Static capture goes stale fast in data engineering because the underlying truth is moving. A system that captures tribal knowledge has to keep up with the rate at which the team adds new conventions, retires old ones, and resolves new edge cases.
This is a meaningfully different shape from a catalog or a documentation system. It is closer to a continuously maintained operating model for the data team, one that every agent (human or otherwise) reads on every job.
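The build-time enforcement requirement can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of how any product implements it: `BANNED_COLUMNS`, `check_sql`, and `build_pipeline` are hypothetical names, and the check is a naive text scan that would also flag matches inside comments or string literals, where a real implementation would more likely parse the SQL.

```python
import re

# Hypothetical encoded standard: banned identifier -> (replacement, reason).
BANNED_COLUMNS = {
    "customer_id": ("customer_id_v2", "2022 join bug left unreliable keys"),
}

def check_sql(sql: str) -> list[str]:
    """Return one message per banned column referenced in the SQL text."""
    problems = []
    for banned, (replacement, reason) in BANNED_COLUMNS.items():
        # "_" counts as a word character, so \b does not match inside
        # customer_id_v2 and the approved column is not flagged.
        pattern = rf"\b{re.escape(banned)}\b"
        if re.search(pattern, sql, flags=re.IGNORECASE):
            problems.append(f"use {replacement} instead of {banned}: {reason}")
    return problems

def build_pipeline(sql: str) -> str:
    """Refuse to build when the SQL violates an encoded convention."""
    problems = check_sql(sql)
    if problems:
        raise ValueError("; ".join(problems))
    return sql  # stand-in for the real build step
```

The design point is where the check runs: `build_pipeline` fails before anything is deployed, and the error message carries the reason, so whoever (or whatever) wrote the query learns the convention at the moment it matters instead of a month later.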
How Maia and Snowflake Compose to Solve Both Halves
This is what we are working on at Maia. Our Context Engine is the place where the team's standards (medallion conventions, naming rules, business definitions, governance policies, the why behind every convention) get encoded once. Every Maia agent reads the Context Engine on every job. When the design agent builds a pipeline, when the transformation agent generates a semantic model, when the governance agent applies a policy, the team's encoded standards are part of the input. The codification happens as a side effect of the work the team is already doing, and the standards get enforced at build time.
For the Summit conversation specifically, this is where the better-together story with Snowflake gets sharp. Snowflake is investing seriously in context retrieval and application within the warehouse: Cortex Sense, Horizon governance, Cloud Agents composition. All of it is moving in the right direction for agents that operate on data once it is already in Snowflake, and once the underlying context has been codified.
What the Context Engine adds, that platform-level retrieval doesn't yet, is the team's specific tribal knowledge in a format Snowflake's own agents can read through composition. When a CoCo session needs to know which join rule to use, the encoded standards have the answer because the team put it there. When a CoWork session asks a business question, the underlying data product was built against the same standards. The team's tribal knowledge becomes part of the operating model, and every agent in the customer's stack (Snowflake's, ours, eventually others) reads it consistently.
That is what end-to-end agentic data engineering actually looks like at production scale. Snowflake provides the retrieval and application infrastructure inside the warehouse. Maia captures and enforces the team's standards at the moment of build. The two halves compose into agents that are accurate not just because they found the right data, but because they applied the team's actual rules to it.
What I'm Coming Away From Summit With
There were several big themes I took away from Snowflake Summit 2026. The agentic enterprise has graduated from a vendor pitch to a CEO-level industry consensus. The bar has moved from experimental POCs to agents that have to work in production. Composition between agents (MCP, ACP, Cloud Agents) is going to be the architectural shape this all sits on. And context is the gating factor for whether any of it actually works at production scale, which is why the vendor conversation converged on it so heavily.
Where I think that conversation has the most room to deepen is on the kind of context I've been arguing about here. The tacit, tribal, situational kind that lives in the team's practice rather than in any document. The portion of context that decides whether agents are useful or merely impressive. The part that almost nobody at Summit was treating as a first-class problem yet.
The customers who win agentic data work over the next few years will be the ones who treat their team's tribal knowledge as a first-class artifact in their data engineering practice. Not something to write down occasionally, but something the practice is structured to keep current and make every agent read. That, I think, is the next move after the investments Snowflake announced at Summit.
It's what we're spending our time on at Maia.
