Retrieval-augmented generation has become the standard approach for building knowledge-intensive AI applications. The formula is straightforward: embed documents, retrieve relevant chunks when queried, inject them into a language model's context. It works well enough that it's become the default.
But 'well enough' isn't good enough for serious knowledge work. The assumptions underlying RAG may be fundamentally misaligned with how humans actually use knowledge.
Assumption 1: Relevance Is About Similarity
RAG retrieves based on semantic similarity. But when you're working on a strategic decision, the most valuable information often isn't similar to your query—it's complementary, contradictory, or contextual in ways that similarity search can't capture.
The insight that changes your thinking is rarely the one that matches your search terms. It's the unexpected connection, the adjacent idea, the historical parallel you didn't know to look for.
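As a minimal sketch of the failure mode above: similarity search ranks by geometric closeness of embeddings, so a document that merely restates the query's topic outranks a complementary or contradictory one. The vectors below are toy stand-ins, not real model embeddings.

```python
import math

def cosine_sim(a, b):
    """Cosine similarity: the ranking function behind most RAG retrieval."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy 3-d "embeddings" (illustrative only).
query      = [1.0, 0.0, 0.0]   # "Should we enter market X?"
restates   = [0.9, 0.1, 0.0]   # a chunk that restates the question's topic
contrarian = [0.1, 0.9, 0.2]   # a post-mortem of a similar failed entry

# Top-k retrieval surfaces the restatement, not the post-mortem,
# even though the post-mortem may be the more decision-relevant text.
assert cosine_sim(query, restates) > cosine_sim(query, contrarian)
```

The point is not that cosine similarity is wrong, but that it optimizes for topical overlap, which is only a proxy for usefulness.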
Assumption 2: Chunks Are Independent
Chunking documents for embedding treats knowledge as modular. But meaning is often distributed—an insight in paragraph 3 only makes sense in light of context from paragraph 1. When we retrieve chunks in isolation, we lose the connective tissue that gives them meaning.
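A minimal sketch makes the lost "connective tissue" concrete. The document and chunker below are invented for illustration; real pipelines split on token counts rather than paragraphs, but the failure is the same: a retrieved chunk can carry a reference whose antecedent lives in a chunk that was not retrieved.

```python
def chunk_by_paragraph(doc: str) -> list[str]:
    """Naive chunker: each paragraph becomes an independent retrieval unit."""
    return [p.strip() for p in doc.split("\n\n") if p.strip()]

doc = (
    "Project Alpha uses a two-phase rollout.\n\n"
    "Pilot metrics were strong in Q2.\n\n"
    "The approach failed outright in region B."
)

chunks = chunk_by_paragraph(doc)

# Retrieved alone, the third chunk's "the approach" has no antecedent:
# the reader (or model) cannot tell it means the two-phase rollout.
assert "two-phase" not in chunks[2]
```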
Assumption 3: More Context Is Better
As context windows grow, the temptation is to retrieve more. But attention is a limited resource, for models as well as for the humans reading their output. Flooding a prompt with marginally relevant information doesn't improve decision quality; it degrades it by burying what matters.
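One simple alternative to "retrieve more" is to select under a budget: rank retrieved chunks, drop low-relevance ones, and stop adding once a token allowance is spent. The function and thresholds below are a hedged sketch, not a prescription.

```python
def select_context(chunks, budget_tokens, min_score=0.3):
    """Pick high-relevance chunks within a token budget, instead of
    stuffing everything retrieved into the prompt.

    chunks: list of (text, relevance_score, token_count) tuples.
    """
    picked, used = [], 0
    for text, score, n_tokens in sorted(chunks, key=lambda c: -c[1]):
        if score < min_score:
            continue  # marginally relevant: leave it out entirely
        if used + n_tokens > budget_tokens:
            continue  # too big for the remaining budget; try smaller chunks
        picked.append(text)
        used += n_tokens
    return picked

candidates = [
    ("core analysis", 0.9, 50),
    ("tangent",       0.2, 10),   # below min_score: excluded
    ("background",    0.8, 100),  # relevant but over budget here
]
assert select_context(candidates, budget_tokens=120) == ["core analysis"]
```

The scoring itself can come from any re-ranker; the point is that the selection step is explicit rather than "everything the retriever returned."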
Toward Better Architectures
We're exploring alternatives that address these limitations:
- Graph-structured retrieval that preserves relationships between concepts
- Hierarchical representations that maintain document coherence
- Attention-aware retrieval that surfaces what will actually be processed
- Active retrieval that iteratively refines based on reasoning needs
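The first of these alternatives can be sketched in a few lines: store text on nodes of a concept graph, and when a node matches, return it together with its linked neighbors so related context travels with the hit. This is a toy prototype under invented names, not a production design.

```python
from collections import defaultdict

class GraphRetriever:
    """Sketch of graph-structured retrieval: a hit returns the matched
    node plus everything within `hops` edges of it."""

    def __init__(self):
        self.text = {}
        self.edges = defaultdict(set)

    def add(self, node_id, text):
        self.text[node_id] = text

    def link(self, a, b):
        # Undirected relationship between two concepts.
        self.edges[a].add(b)
        self.edges[b].add(a)

    def retrieve(self, node_id, hops=1):
        seen, frontier = {node_id}, {node_id}
        for _ in range(hops):
            frontier = {n for f in frontier for n in self.edges[f]} - seen
            seen |= frontier
        return [self.text[n] for n in sorted(seen)]

g = GraphRetriever()
g.add("rollout", "Project Alpha uses a two-phase rollout.")
g.add("failure", "The approach failed outright in region B.")
g.add("budget", "FY25 budget planning notes.")
g.link("rollout", "failure")

# Matching "failure" also surfaces the rollout it refers to;
# the unlinked budget note stays out.
assert g.retrieve("failure") == [
    "The approach failed outright in region B.",
    "Project Alpha uses a two-phase rollout.",
]
```

In practice the match would come from an embedding or keyword index over the nodes; the graph's job is only to carry the relationships that flat chunk retrieval discards.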
The goal isn't to replace RAG, but to understand when it works and when we need something more sophisticated. For casual question-answering, RAG is often sufficient. For strategic thinking that requires synthesis across disparate sources, we need architectures that mirror how human cognition actually handles knowledge.
