May 8, 2026

RAG Is Not a Knowledge Architecture

Seventy percent of enterprise RAG implementations fail to meet their stated objectives. Nine out of ten RAG applications that work in demos break in production. When RAG fails, the failure point is the retrieval layer 73% of the time — not the model, not the prompt, not the generation. The retrieval layer.

This is not a performance problem. It is an architecture problem. RAG was built to retrieve relevant text. It was not built to represent how a company thinks.

What RAG actually does

RAG works by converting documents into vector embeddings, storing them in a database, and — when a query arrives — retrieving the chunks whose embeddings are most similar to the query embedding. The retrieved text is then passed to a language model as context.
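The retrieval step described above can be sketched in a few lines. This is a minimal illustration, not a production pipeline: the `embed` function is a toy word-count stand-in for a learned embedding model, and the chunk texts and function names are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    # Toy stand-in for a learned embedding model: lowercase word counts.
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: str, chunks: list[str], k: int = 1) -> list[str]:
    # Rank every stored chunk by similarity to the query and keep the top k.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Hypothetical corpus, already split into chunks.
chunks = [
    "Discounts above 20 percent require approval from the VP of Sales.",
    "The office kitchen is restocked every Monday morning.",
    "ARR is defined as annualized recurring subscription revenue.",
]
question = "Who must approve a discount above 20 percent?"
top = retrieve(question, chunks, k=1)

# The retrieved text is pasted into the prompt as context for the model.
prompt = "Context:\n" + "\n".join(top) + "\n\nQuestion: " + question
```

Everything the model sees is whatever `retrieve` happened to rank first. Nothing in this loop knows whether the chunk is current, complete, or contradicted elsewhere.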

This is, at its core, a sophisticated search engine. The quality of the answer depends entirely on whether the right chunk was retrieved and whether that chunk contains the answer in a form the model can use.

The failure modes follow directly from this design:

Chunking destroys context. Documents are split into fixed-size pieces. A policy exception that spans two pages becomes two orphaned fragments. A decision that references a prior decision has no connection to it. The chunk contains words but not meaning.

Embeddings cannot represent structure. The semantic distance between "our Q3 revenue definition" and "how we define ARR for fiscal reporting" may be large in embedding space even though, inside a given company, they refer to the same thing. General-purpose embeddings are trained on general text, so they often miss the specialized vocabulary of a specific company.

Contradiction is invisible. When two documents contradict each other — last year's pricing policy and this year's revision — RAG has no mechanism to resolve the conflict. It retrieves both, or retrieves the wrong one, and the model generates a confident answer from whichever it received.

Staleness is silent. A document indexed six months ago looks identical to one indexed this morning. RAG has no concept of currency. An agent asking about current policy may receive an answer grounded in superseded policy, with no indication that anything is wrong.
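The first of these failure modes is easy to reproduce. The sketch below assumes a naive chunker that splits on a fixed word count; the policy text and the chunk size are made up for illustration.

```python
def fixed_size_chunks(text: str, words_per_chunk: int) -> list[str]:
    # Split on whitespace and regroup into fixed-size runs of words,
    # ignoring sentence and section boundaries entirely.
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


# Hypothetical policy: a rule followed by its exception.
policy = (
    "Standard discount cap is 20 percent for all customers. "
    "Exception: strategic accounts may receive up to 35 percent "
    "with written approval from the CFO."
)

pieces = fixed_size_chunks(policy, words_per_chunk=12)
for piece in pieces:
    print(repr(piece))

# No single chunk says both who the exception applies to and what it grants.
orphaned = not any(
    "strategic accounts" in p and "35 percent" in p for p in pieces
)
```

With this input the exception's subject lands in the first chunk and its terms in the second, so retrieving either one alone gives the model half a rule.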

The question RAG cannot answer

Consider what an enterprise agent actually needs to do. It needs to answer: Why did we make this pricing exception for this customer? It needs to know: What is the approved process for this type of decision, and who has the authority to override it? It needs to determine: What does this contract clause mean in the context of how our legal team has interpreted similar clauses in the past?

None of these questions are retrieval problems. They are reasoning problems over structured knowledge — knowledge that has relationships, provenance, and context. No amount of retrieval improvement closes this gap. You cannot retrieve your way to a company ontology.

The evidence is in the numbers. The a16z analysis of enterprise agent deployments identified the core failure: agents were given access to data systems but not to the business context that makes that data meaningful. Semantic layers were outdated. Tribal knowledge lived nowhere a retrieval system could reach. Revenue definitions had been modified by someone who has since left, and the documentation was never updated.

RAG would have retrieved the old document. It would not have known it was wrong.

What a knowledge architecture requires

The difference between a retrieval system and a knowledge architecture is structure, provenance, and time.

Structure means entities, relationships, and typed connections — not text. A knowledge architecture does not store a document that says "the CEO approved the Q3 exception." It stores a decision record: type: approval, subject: pricing-exception-Q3, authorized-by: [person], date: [date], clearance: [level], supersedes: [prior-record]. An agent can traverse this. It can reason over it. It can check whether the person who approved it still holds that authority.
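A decision record of this shape might look like the following. This is a sketch under stated assumptions, not Dominir's actual schema: the class, the field types, the `lineage` and `still_authorized` helpers, and all sample data are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class DecisionRecord:
    record_id: str
    type: str
    subject: str
    authorized_by: str
    decided_on: date
    clearance: str
    supersedes: Optional[str] = None  # id of the prior record, if any


def lineage(records: dict[str, DecisionRecord], record_id: str) -> list[DecisionRecord]:
    # Follow `supersedes` links from a record back to the original decision.
    chain, seen = [], set()
    current_id: Optional[str] = record_id
    while current_id is not None and current_id in records and current_id not in seen:
        seen.add(current_id)  # guard against accidental cycles
        record = records[current_id]
        chain.append(record)
        current_id = record.supersedes
    return chain


def still_authorized(record: DecisionRecord, authority: dict[str, set[str]]) -> bool:
    # True if the approver currently holds the clearance the record required.
    return record.clearance in authority.get(record.authorized_by, set())


# Hypothetical data.
records = {
    "dec-001": DecisionRecord("dec-001", "approval", "pricing-exception-Q2",
                              "alice", date(2025, 4, 2), "vp"),
    "dec-002": DecisionRecord("dec-002", "approval", "pricing-exception-Q3",
                              "bob", date(2025, 7, 9), "ceo", supersedes="dec-001"),
}
current_authority = {"alice": {"vp"}, "bob": set()}  # bob no longer holds "ceo"

history = [r.record_id for r in lineage(records, "dec-002")]
q3_valid = still_authorized(records["dec-002"], current_authority)
```

The point of the typed fields is that questions like "what did this replace?" and "does the approver still have that authority?" become lookups rather than guesses over prose.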

Provenance means every record knows where it came from and when. Not because auditors require it, but because agents require it. An agent operating on company knowledge needs to distinguish between what was true then and what is true now. A retrieval system stores text. A knowledge architecture stores history.

Time means records are not deleted — they are superseded. The Q3 pricing exception is not overwritten when Q4 policy changes. Both exist. An agent operating on a historical contract can be told: use the knowledge state as of the contract date. The answers it gives are then auditable, not just plausible.
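One way to express "use the knowledge state as of the contract date" is an as-of lookup over records that are closed out rather than deleted. The sketch below is illustrative only; the record fields, dates, and values are assumptions.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class PolicyRecord:
    subject: str
    value: str
    source: str                            # provenance: where the record came from
    effective_from: date
    superseded_on: Optional[date] = None   # None means still in force


def as_of(history: list[PolicyRecord], subject: str, when: date) -> Optional[PolicyRecord]:
    # Return the record that was in force for `subject` on the given date.
    for record in history:
        if record.subject != subject:
            continue
        started = record.effective_from <= when
        not_yet_replaced = record.superseded_on is None or when < record.superseded_on
        if started and not_yet_replaced:
            return record
    return None


# Hypothetical history: the Q3 exception is superseded, not overwritten.
history = [
    PolicyRecord("max-discount", "35 percent", "pricing-memo-q3",
                 date(2025, 7, 1), superseded_on=date(2025, 10, 1)),
    PolicyRecord("max-discount", "20 percent", "pricing-policy-q4",
                 date(2025, 10, 1)),
]

at_signing = as_of(history, "max-discount", date(2025, 8, 15))
today = as_of(history, "max-discount", date(2025, 11, 1))
```

Because each answer carries its `source` and validity window, an agent can report not just the value but which record it relied on and when that record applied.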

Why RAG became the default

RAG is fast to implement. Drop in a vector database, write an ingestion pipeline, connect to the LLM of your choice. The demo works. The demo will always work, because demos involve clean questions and cooperative documents.

Production is different. Production involves legacy documentation, contradictory policies, information that was never written down, and decisions made by people who left the company three years ago. Production is where the 70% find out.

The companies that are serious about enterprise agent deployment are building past RAG. Sequoia's investment thesis for context infrastructure states the problem directly: when you drop a general-purpose AI into an enterprise environment, it starts from zero. The work of getting it up to speed is slow, expensive, and has to be redone every time a process changes.

That is the RAG loop. Ingest, retrieve, fail, re-ingest, repeat.

The correct framing

RAG is a component. It is a reasonable way to surface unstructured text when that text is the answer. For many narrow applications — search over a document corpus, answering FAQ queries, surfacing relevant research — it works.

It is not a company knowledge layer. It is not a substitute for structured representation of how an organization operates. It is not the infrastructure that enterprise agents need.

The companies that ship reliable agents in production will have both: retrieval for surface-level lookup, and a structured ontology for the knowledge that actually drives decisions. The ontology is not built on top of RAG. The ontology is the foundation. RAG, where it remains useful, runs on top of it.
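What "RAG runs on top of the ontology" could mean in practice: the structured layer decides which documents are still in force, and retrieval only searches those. The sketch below is one possible arrangement, with a plain keyword match standing in for vector search; every name and record in it is hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DocumentRecord:
    doc_id: str
    superseded_by: Optional[str] = None  # set when a newer document replaces this one


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    text: str


def retrieve_current(
    query_terms: set[str],
    chunks: list[Chunk],
    registry: dict[str, DocumentRecord],
) -> list[Chunk]:
    # Step 1: the structured registry filters out superseded or unknown sources.
    live = [
        c for c in chunks
        if c.doc_id in registry and registry[c.doc_id].superseded_by is None
    ]
    # Step 2: retrieval (here a simple keyword overlap) runs only on what is left.
    return [c for c in live if query_terms & set(c.text.lower().split())]


# Hypothetical registry and corpus with two conflicting pricing documents.
registry = {
    "pricing-2024": DocumentRecord("pricing-2024", superseded_by="pricing-2025"),
    "pricing-2025": DocumentRecord("pricing-2025"),
}
chunks = [
    Chunk("pricing-2024", "discount cap is 15 percent"),
    Chunk("pricing-2025", "discount cap is 20 percent"),
]

hits = retrieve_current({"discount", "cap"}, chunks, registry)
```

Both chunks match the query equally well on text alone. Only the registry knows that one of them has been replaced.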

This is the architecture Dominir is built on. The Kernel stores decisions, relationships, and approvals as typed records — not documents — with provenance, clearance levels, and immutable history. Agents traverse the graph. They do not retrieve chunks and hope the right one surfaces.

Retrieval is the better-understood half of the problem. The knowledge architecture is the work that remains.

Dominir is building the knowledge infrastructure layer: a company ontology that structures decisions, relationships, and context in a form agents can reason over.