The Limits of RAG for Knowledge Work

Retrieval-augmented generation has become the default approach for knowledge-intensive tasks. But its assumptions about how humans organize and use knowledge may be fundamentally flawed.

Shep Bryan
Founder

Retrieval-augmented generation has become the standard approach for building knowledge-intensive AI applications. The formula is straightforward: embed documents, retrieve relevant chunks when queried, inject them into a language model's context. It works well enough that it's become the default.

But 'well enough' isn't good enough for serious knowledge work. The assumptions underlying RAG may be fundamentally misaligned with how humans actually use knowledge.

Assumption 1: Relevance Is About Similarity

RAG retrieves based on semantic similarity. But when you're working on a strategic decision, the most valuable information often isn't similar to your query—it's complementary, contradictory, or contextual in ways that similarity search can't capture.

The insight that changes your thinking is rarely the one that matches your search terms. It's the unexpected connection, the adjacent idea, the historical parallel you didn't know to look for.
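To make the failure mode concrete, here is a minimal sketch of similarity-only retrieval using toy 3-dimensional "embeddings" (real systems use learned vectors of hundreds of dimensions, but the geometry is the same). The vectors and the `retrieve` helper are illustrative, not any particular library's API:

```python
import numpy as np

def cosine_sim(a, b):
    # Cosine similarity between two vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_vec, chunk_vecs, k=2):
    # Rank chunks purely by similarity to the query vector.
    scores = [cosine_sim(query_vec, v) for v in chunk_vecs]
    return sorted(range(len(chunk_vecs)), key=lambda i: scores[i], reverse=True)[:k]

# Toy embeddings: chunks 0 and 1 nearly paraphrase the query;
# chunk 2 is orthogonal -- the complementary or contrarian source.
query = np.array([1.0, 0.0, 0.0])
chunks = [np.array([0.9, 0.1, 0.0]),   # paraphrase of the query
          np.array([0.7, 0.3, 0.0]),   # close variant
          np.array([0.0, 1.0, 0.0])]   # the adjacent idea

print(retrieve(query, chunks))  # → [0, 1]
```

The orthogonal chunk never surfaces, no matter how large `k` grows relative to the paraphrases: by construction, similarity ranking rewards restatements of the question over the unexpected connection.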

Assumption 2: Chunks Are Independent

Chunking documents for embedding treats knowledge as modular. But meaning is often distributed—an insight in paragraph 3 only makes sense in light of context from paragraph 1. When we retrieve chunks in isolation, we lose the connective tissue that gives them meaning.
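A naive fixed-size chunker shows how that connective tissue gets severed. This is a deliberately simplified sketch (production pipelines split on tokens and add overlap, which mitigates but does not eliminate the problem):

```python
def chunk(text, size=40):
    # Naive fixed-size chunking: split every `size` characters,
    # with no regard for sentence or paragraph boundaries.
    return [text[i:i + size] for i in range(0, len(text), size)]

doc = ("The pilot program failed in Q3. "
       "Leadership cut the budget. "
       "This decision is why the team now avoids similar bets.")

for piece in chunk(doc):
    print(repr(piece))
```

Retrieved in isolation, the final chunk's "This decision" has lost its antecedent: the embedding can match a query about risk appetite, but the chunk itself can no longer say which decision it refers to.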

Assumption 3: More Context Is Better

As context windows grow, the temptation is to retrieve more. But cognitive science tells us that human attention is limited. Flooding a prompt with marginally relevant information doesn't improve decision quality—it degrades it by obscuring what matters.
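One pragmatic response is to budget context rather than fill it. The sketch below, with a hypothetical relevance floor and chunk budget, drops marginally relevant material instead of injecting it; the specific threshold values are assumptions for illustration:

```python
def select_context(scored_chunks, budget=3, floor=0.6):
    # Keep only chunks above a relevance floor, capped by a small budget,
    # rather than packing the window with everything retrieved.
    ranked = sorted(scored_chunks, key=lambda pair: pair[1], reverse=True)
    keep = [text for text, score in ranked if score >= floor]
    return keep[:budget]

scored = [("core finding", 0.91), ("tangent", 0.55),
          ("key counterpoint", 0.78), ("background", 0.45)]
print(select_context(scored))  # → ['core finding', 'key counterpoint']
```

The point is not the particular cutoff but the inversion of the default: relevance must earn its way into the prompt, because every marginal chunk competes for the model's (and the reader's) attention.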

Toward Better Architectures

We're exploring alternatives that address these limitations:

  • Graph-structured retrieval that preserves relationships between concepts
  • Hierarchical representations that maintain document coherence
  • Attention-aware retrieval that surfaces what will actually be processed
  • Active retrieval that iteratively refines based on reasoning needs
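As a sketch of the first idea, graph-structured retrieval can start from a similarity-based seed set and then expand along explicit links between chunks, pulling in connected material that similarity alone would miss. The link structure here is a hypothetical strategy-memo graph, invented for illustration:

```python
from collections import deque

def graph_retrieve(seed_ids, edges, hops=1):
    # Breadth-first expansion from similarity-retrieved seeds:
    # follow explicit concept links up to `hops` steps out.
    found = set(seed_ids)
    frontier = deque((s, 0) for s in seed_ids)
    while frontier:
        node, depth = frontier.popleft()
        if depth == hops:
            continue
        for nbr in edges.get(node, []):
            if nbr not in found:
                found.add(nbr)
                frontier.append((nbr, depth + 1))
    return found

# Hypothetical links between chunks of a strategy memo.
edges = {"q3-results": ["budget-cut"], "budget-cut": ["risk-posture"]}
print(graph_retrieve({"q3-results"}, edges, hops=2))
```

A query about Q3 results would surface only the seed chunk under plain similarity search; the graph walk also recovers the downstream budget and risk-posture chunks that give the result its strategic meaning.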

The goal isn't to replace RAG, but to understand when it works and when we need something more sophisticated. For casual question-answering, RAG is often sufficient. For strategic thinking that requires synthesis across disparate sources, we need architectures that mirror how human cognition actually handles knowledge.

Research by

Shep Bryan
Founder

Shep is the founder of Penumbra, building knowledge systems that transform how teams capture, connect, and leverage institutional intelligence for strategic decisions.
