
Karpathy's AI Wiki vs Structured Databases: Which Memory System Is Right for You?

Karpathy's wiki compiles knowledge at write time. Structured databases query at runtime. Here's when to use each and how to combine both approaches.

MindStudio Team

Two Different Bets on Where Intelligence Should Live

Every AI system has a memory problem. The model itself knows a lot, but it doesn’t know your stuff — your processes, your codebase, your customers, your domain expertise. Something has to bridge that gap.

Two approaches have emerged as the most practical answers. The first is Andrej Karpathy’s AI wiki pattern: compile your knowledge into structured markdown at write time, so agents can load it directly into context. The second is a structured database approach — SQL, key-value stores, or vector databases — where knowledge lives in queryable records and agents pull what they need at runtime.

These aren’t just different tools. They represent different bets about when intelligence should happen. Understanding that distinction is what makes the difference between an AI memory system that works and one that frustrates you.

This article breaks down how each approach works, where each one wins, and when to run both together.


What Karpathy’s Wiki Actually Does

Andrej Karpathy proposed something deceptively simple: instead of building retrieval infrastructure so an agent can look things up, write down what the agent needs to know in plain markdown — and load that markdown directly into the context window.

The result is an LLM knowledge base built around the compiler analogy: the wiki is pre-processed, distilled knowledge that an agent can consume immediately without any lookup step.

Here’s what makes this different from a notes folder:

  • The wiki is written for an LLM, not for a human reader
  • Entries are regularly reviewed and updated by an agent that consolidates new information
  • The entire knowledge base (or a relevant subset) gets loaded into the prompt at the start of each session
  • There’s no retrieval step — the agent reads it all up front

The Karpathy LLM wiki pattern typically involves a small number of well-maintained markdown files covering what an agent needs to do its job: conventions, domain facts, recurring patterns, known edge cases. The agent that uses the wiki is also the agent that helps maintain it — adding new entries, merging duplicates, pruning stale content.

This creates a compounding knowledge loop: the more the agent works, the better the wiki gets, and the better the wiki gets, the more useful the agent becomes.
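Session start stays deliberately simple under this pattern. A minimal sketch, assuming wiki entries live as markdown files in a `wiki/` directory (the directory layout and function name are illustrative, not part of Karpathy's proposal):

```python
from pathlib import Path

def load_wiki(wiki_dir: str = "wiki") -> str:
    """Concatenate every markdown entry into a single context block.

    There is no retrieval step: the agent reads the whole wiki up front.
    """
    sections = []
    for path in sorted(Path(wiki_dir).glob("*.md")):
        sections.append(f"## {path.stem}\n\n{path.read_text()}")
    return "\n\n".join(sections)

# Prepended to the system prompt at session start, e.g.:
# system_prompt = load_wiki() + "\n\n" + task_instructions
```

Everything interesting happened before this code runs: the wiki entries were already synthesized at write time, so the load step is just concatenation.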


What Structured Databases Do

Structured databases — whether that’s a SQL table, a key-value store, or a vector database — take the opposite approach. Knowledge lives in records. The agent queries those records at runtime based on what it needs.

The most common form of this in AI systems is RAG (Retrieval-Augmented Generation). RAG works by embedding documents into vectors and then retrieving the most semantically similar chunks when a query arrives. The retrieved content gets injected into the context alongside the query.
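At its core, the runtime step is a nearest-neighbor search over embeddings. A minimal sketch of that step with cosine similarity written out by hand; the `embed` call that produces the vectors is assumed to be an external embedding model and is not shown:

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query_vec: list[float],
             index: list[tuple[str, list[float]]],
             k: int = 3) -> list[str]:
    """index holds (chunk_text, embedding) pairs produced at ingestion time."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# The top-k chunks are injected into the prompt alongside the query:
# prompt = "Context:\n" + "\n---\n".join(retrieve(embed(q), index)) + "\n\nQuestion: " + q
```

Production systems replace the linear scan with an approximate nearest-neighbor index, but the shape of the operation is the same: rank chunks by similarity, take the top k, splice them into the prompt.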

Structured databases are powerful when:

  • The knowledge base is too large to fit in a context window
  • You need exact lookups (user IDs, transaction records, product SKUs)
  • Information changes frequently and needs to be current at query time
  • You’re serving many different users whose data needs to be isolated
  • You need filtered or conditional queries (e.g., “all orders over $500 placed this week”)

The trade-off is complexity. You need an embedding pipeline and a retrieval system, and you have to trust that the right chunks get surfaced at the right time. Retrieval quality varies, especially for complex multi-hop questions where the answer spans several documents.


The Core Difference: Write Time vs Runtime

The clearest way to understand the distinction is this:

Karpathy’s wiki: Intelligence happens at write time. A human (or an agent) decides what matters, synthesizes it, and writes it down in a form the LLM can use directly. At query time, the agent just reads.

Structured database: Intelligence happens at runtime. Raw data sits in records. At query time, the system has to figure out what’s relevant, retrieve it, and integrate it with the query.

This maps to the compiler analogy Karpathy uses: the wiki is compiled knowledge, ready to execute. A database is more like source code that still needs to be interpreted on every run.

Neither is objectively better. They’re suited to different problems.

The wiki approach wins when:

  • Your knowledge is relatively stable
  • The corpus is small enough to fit in context (a few hundred pages at most)
  • You want the agent to have full situational awareness, not fragmented chunks
  • You want predictable, consistent behavior across sessions

The database approach wins when:

  • Your data is large, growing, or highly dynamic
  • You need precise lookups, not fuzzy semantic matching
  • Different queries need different subsets of information
  • You’re building multi-tenant systems where data needs to be partitioned

When the Wiki Is the Right Choice

The wiki pattern shines in scenarios where depth of understanding matters more than breadth of data.

Coding agents and codebase context

If you’re running a coding agent on a project, the wiki can hold: architecture decisions, naming conventions, common patterns, known bugs, deployment steps. This is exactly the kind of knowledge that’s stable, bounded, and needs to be understood holistically rather than retrieved piecemeal.

For internal codebase memory, the wiki makes a compelling case over RAG: coding agents that have the full context loaded upfront make fewer mistakes than agents that retrieve fragments and try to patch things together.

Personal assistant agents

For a personal AI assistant that needs to know your preferences, communication style, and ongoing projects, a wiki is often the right fit. The data is personal, relatively small, and benefits from being held in a coherent, readable form rather than distributed across records.

Domain expertise for specialist agents

If you’re building an agent that acts as a specialist — a legal researcher, a financial analyst, a compliance checker — the wiki can encode the conceptual framework the agent needs. Rules, definitions, typical patterns, known exceptions. This is knowledge that benefits from synthesis, not raw retrieval.

The index file as navigation layer

One refinement of the wiki pattern is the LLM knowledge base index file — a master document that maps what’s in each wiki file. Instead of loading every file, the agent loads the index, identifies what’s relevant, and pulls only the needed files. This lets the wiki scale past the raw context window size without needing a vector retrieval system.
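A sketch of that two-step flow, assuming a hypothetical `INDEX.md` that summarizes each wiki file in one line (file names and layout are illustrative):

```python
from pathlib import Path

def load_index(wiki_dir: str) -> str:
    """Step 1: the agent reads only the index, which maps each
    wiki file to a one-line summary of its contents."""
    return Path(wiki_dir, "INDEX.md").read_text()

def load_selected(wiki_dir: str, filenames: list[str]) -> str:
    """Step 2: pull in just the files the agent named as relevant."""
    return "\n\n".join(Path(wiki_dir, name).read_text() for name in filenames)
```

The selection step in between is an LLM call: the agent reads the index, names the files it needs for the task at hand, and only those files are loaded into context.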


When Structured Databases Are the Right Choice

Structured databases (including vector-based RAG) make sense when the problem can’t be bounded.

Large, dynamic document corpora

If you’re building a customer support bot that needs to search a knowledge base of thousands of articles that change weekly, a vector database is the right tool. The wiki pattern would require constant manual curation at a scale that isn’t feasible.

User-specific data in multi-tenant systems

When different users have different data — order histories, account details, personal preferences — you need a database. You can’t load everyone’s data into a single context window, and you don’t want to.

Exact lookups and filters

If your agent needs to answer “show me all invoices over $10,000 from Q1 2025,” that’s a SQL query. Semantic similarity search won’t help here. You need structured data and precise filtering.
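A sketch with SQLite (in-memory, with an illustrative schema) of the kind of exact predicate that no similarity search can express:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE invoices (id INTEGER PRIMARY KEY, amount REAL, issued TEXT)")
conn.executemany(
    "INSERT INTO invoices (amount, issued) VALUES (?, ?)",
    [(12000.0, "2025-02-10"),   # in range
     (9500.0, "2025-01-15"),    # too small
     (15000.0, "2025-06-01")],  # wrong quarter
)

# "All invoices over $10,000 from Q1 2025" as a precise filter:
rows = conn.execute(
    "SELECT id, amount FROM invoices "
    "WHERE amount > 10000 AND issued BETWEEN '2025-01-01' AND '2025-03-31'"
).fetchall()
# rows -> [(1, 12000.0)]
```

The query either matches a record or it doesn't; there is no ranking, no similarity threshold, and no chance of a near-miss being silently included.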

Real-time and high-frequency updates

If the data changes constantly — stock prices, live inventory, system metrics — a wiki becomes stale almost immediately. A database that agents query at runtime stays accurate.

It’s worth noting that some newer approaches to agent memory infrastructure blend structured storage with semantic retrieval in ways that address the limitations of pure RAG — handling the temporal and contextual aspects of memory that vector search alone misses.


The Failure Modes of Each Approach

Understanding where each approach breaks helps you decide when to use what.

Where the wiki breaks down

  • Scale: Once your knowledge base grows past what fits in a context window, you need either an index layer or a retrieval system. The AI agent memory wall is a real problem for agents operating over long time horizons — eventually the context fills up.
  • Currency: If information changes frequently, a wiki becomes a maintenance burden. Stale entries mislead rather than help.
  • Retrieval precision: Asking “what was the customer’s exact order on March 3rd?” isn’t a wiki question. That’s a database query.

Where structured databases break down

  • Chunking artifacts: RAG retrieves chunks, not documents. The most relevant answer sometimes lives in the relationship between paragraphs, not within a single chunk. Whether large context windows eventually replace RAG is an open question — but chunk-level retrieval still introduces noise.
  • Semantic mismatch: Embedding similarity isn’t the same as relevance. A query about “reducing customer churn” might not surface a document titled “account retention strategy” if the embeddings don’t align.
  • Operational complexity: A vector database requires an embedding pipeline, indexing, tuning, and ongoing maintenance. For small-scale knowledge, this is significant overhead for uncertain gain.

How to Combine Both

The most robust AI memory systems use both approaches in their proper domains.

A practical architecture:

  • Wiki layer: Contains stable knowledge that the agent needs in every session — conventions, domain expertise, known patterns, workflow instructions. Loaded into context at session start.
  • Database layer: Contains dynamic, large, or user-specific data. Queried at runtime when the agent needs specific records.
  • Orchestration: The agent uses the wiki to understand what to do and how to think about problems. It queries the database to get specific facts and data it needs to act.

This separation of concerns makes each layer easier to maintain. The wiki evolves through review and refinement. The database evolves through normal CRUD operations.
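A sketch of how the two layers can meet in a single context builder; the table name, schema, and directory layout are hypothetical:

```python
import sqlite3
from pathlib import Path

def build_context(wiki_dir: str, db: sqlite3.Connection, user_id: int) -> str:
    # Wiki layer: stable knowledge, loaded wholesale at session start.
    wiki = "\n\n".join(p.read_text() for p in sorted(Path(wiki_dir).glob("*.md")))

    # Database layer: user-specific records, pulled by exact query at runtime.
    rows = db.execute(
        "SELECT item, status FROM orders WHERE user_id = ?", (user_id,)
    ).fetchall()
    orders = "\n".join(f"- {item}: {status}" for item, status in rows)

    return f"{wiki}\n\n## Current orders for this user\n{orders}"
```

The wiki portion is identical for every session; only the database portion varies per user and per query, which is exactly the partition the two approaches are each good at.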

Building a self-evolving memory system takes this further — using hooks and agents to automatically update the wiki based on new experiences, while keeping operational data in a proper database.

For professional contexts where relationship management matters, this hybrid approach works well too: the wiki holds general knowledge about a person or account, while a structured system tracks specific interactions, dates, and action items.


Where Remy Fits

Remy takes a position on this that’s worth stating directly: the spec is the source of truth, and the spec is a form of compiled knowledge.

When you describe your application in a Remy spec — the data types, the business rules, the edge cases, the behavior you want — you’re doing what Karpathy’s wiki does for agent knowledge. You’re capturing understanding in a form that’s precise, readable, and can be recompiled into something that runs.

The backend Remy generates includes a real SQL database (SQLite with WAL journaling and automatic schema migrations). That database handles what databases are good at: storing user data, processing queries, maintaining state across sessions. The spec handles what the wiki is good at: encoding the logic, rules, and structure that make the database meaningful.

You’re not choosing between spec-driven development and a real database. You get both, at the right level of abstraction for each.

If you’re building an AI-powered app that needs persistent memory — user preferences, interaction history, domain knowledge alongside dynamic data — Remy’s architecture handles the full stack. You describe what you want; it compiles the infrastructure.

You can try Remy at mindstudio.ai/remy.


Frequently Asked Questions

What is Karpathy’s AI wiki pattern?

Karpathy’s wiki pattern is an approach to AI memory where knowledge is compiled into structured markdown files at write time, then loaded directly into an LLM’s context window at runtime. Rather than building a retrieval system, you write knowledge down in a form the agent can read holistically. An agent (or human) maintains and updates the wiki over time, keeping it accurate and useful. The pattern is particularly well-suited to coding agents, personal assistants, and specialist domain agents where stable, bounded knowledge is more important than large-scale retrieval.

What’s the difference between a wiki-based knowledge base and RAG?

The key difference is when the work happens. In a wiki approach, knowledge is synthesized and structured at write time — a human or agent does the hard work of deciding what matters and how to express it. At runtime, the agent just reads. With RAG (Retrieval-Augmented Generation), raw documents are stored in a vector database and retrieved based on semantic similarity at runtime. RAG scales to much larger corpora but introduces retrieval quality risks and requires more infrastructure.

When should I use a structured database instead of a wiki?

Use a structured database when your data is large, dynamic, or user-specific. If you’re storing thousands of records that change frequently, need exact lookups (filtering, sorting, joining), or need to isolate data per user in a multi-tenant system, a database is the right tool. The wiki pattern is better suited to stable, bounded knowledge that benefits from synthesis and holistic understanding rather than precise record retrieval.

Can you use both a wiki and a structured database together?

Yes, and for most production systems you should. The wiki handles stable knowledge the agent needs in every session — conventions, domain rules, how to reason about problems. The database handles dynamic, large, or user-specific data the agent queries as needed. The two layers complement each other and are easier to maintain when kept separate.

How do I keep a wiki from becoming stale?

The most effective approach is to make wiki maintenance part of the agent’s workflow. After each session, the agent can propose updates, flag outdated entries, or merge new patterns it encountered. Some teams run a dedicated review step where an agent reads through the wiki and identifies gaps or contradictions. The AutoResearch loop pattern applies this idea to ongoing knowledge refinement — the agent continuously improves its own knowledge base over time.

Does a large context window make RAG unnecessary?

Larger context windows reduce (but don’t eliminate) the need for retrieval. If your entire knowledge base fits in the context window, RAG adds complexity without clear benefit. But for corpora that span millions of tokens, or for systems where data changes in real time, retrieval is still necessary. The context window vs RAG tradeoff depends heavily on your corpus size, update frequency, and precision requirements.


Key Takeaways

  • Karpathy’s wiki compiles knowledge at write time. Structured databases query at runtime. That’s the fundamental difference.
  • The wiki wins for stable, bounded knowledge where holistic understanding matters. Databases win for large, dynamic, or user-specific data.
  • Most production AI systems benefit from both: a wiki layer for agent intelligence and a database layer for operational data.
  • The main failure mode of the wiki is scale and staleness. The main failure mode of RAG is retrieval quality and operational complexity.
  • Building wiki maintenance into the agent’s workflow is what keeps the knowledge base from going stale.

If you’re building an AI application that needs to reason about both stable domain knowledge and dynamic user data, try Remy — it handles both layers in a single spec-driven architecture that compiles into a full-stack application with a real database backend.

Presented by MindStudio
