What Is the Karpathy LLM Wiki Pattern? How to Build a Personal Knowledge Base Without RAG
Andrej Karpathy's LLM Wiki uses plain text files instead of vector databases, reportedly 70x more efficient than RAG for agent-accessible knowledge bases.
The Problem With How Most People Build AI Knowledge Bases
Ask most developers how to give an AI agent access to a personal knowledge base, and they’ll point you toward RAG — retrieval-augmented generation. Set up a vector database, chunk your documents, generate embeddings, run similarity searches, then pass retrieved chunks to your model. It works, but it’s a lot of infrastructure for what often amounts to a few hundred text files.
Andrej Karpathy — former director of AI at Tesla, founding member of OpenAI, and one of the most influential voices in practical machine learning — has proposed something simpler. The Karpathy LLM Wiki pattern treats your knowledge base as exactly what it is: a collection of plain text files that an agent reads and writes directly, no vector store required.
This approach reportedly runs up to 70x more efficiently than traditional RAG pipelines for agent-accessible knowledge, and it’s considerably easier to build, maintain, and understand. Here’s how it works and why it’s worth taking seriously.
What the Karpathy LLM Wiki Pattern Actually Is
The LLM Wiki pattern is a method for building a persistent, agent-readable knowledge base using nothing more than plain text files organized like a wiki.
Instead of a vector database with embeddings and approximate nearest-neighbor search, you maintain a directory of Markdown or plain text files. Each file covers a specific topic, person, project, or concept. An AI agent can read these files directly, synthesize information across them, and — crucially — write updates back to them as it learns new things.
The “wiki” framing matters. A wiki has:
- Flat or lightly hierarchical structure (no deep nesting)
- Human-readable, self-contained pages
- Cross-references between related pages
- Consistent formatting that makes pages easy to scan
When you apply that structure to an LLM’s working memory, you get something that’s both accessible to the AI and useful to you as a human reviewing it.
Why Plain Text and Not a Database
Karpathy’s broader philosophy consistently favors understandable systems over clever ones. His position, expressed across lectures and public writing, is that the overhead of RAG — chunking strategies, embedding models, vector DBs, retrieval tuning — introduces fragility and complexity that’s often unnecessary.
Plain text files have almost no failure modes. They don’t require a running service. They’re version-controllable with Git. Any LLM can read them without special tooling. And for knowledge bases up to several hundred pages, modern context windows are large enough that selective loading beats probabilistic retrieval in both accuracy and speed.
Why RAG Has More Overhead Than People Realize
RAG made a lot of sense when context windows were small — say, 4,000 to 8,000 tokens. If you can only fit a few pages of text into a single inference call, you need a way to find the right pages. Vector search was the obvious answer.
But the context window landscape has changed significantly. Models from Anthropic, Google, and OpenAI now routinely offer 128,000-token to 1-million-token windows. The original constraint that motivated RAG is largely gone for personal-scale knowledge bases.
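To make that concrete, here is a back-of-envelope check of how little of a modern window a selective load actually consumes. The figures are illustrative assumptions, not benchmarks:

```python
# Back-of-envelope check: does selective loading fit in a modern
# context window? All figures below are illustrative assumptions.
AVG_TOKENS_PER_PAGE = 700     # assume ~500 words per wiki page
PAGES_PER_QUERY = 10          # index plus a handful of relevant pages
CONTEXT_WINDOW = 128_000      # a common window size for current models

used = AVG_TOKENS_PER_PAGE * PAGES_PER_QUERY
print(f"tokens per query: {used}")                         # 7000
print(f"fraction of window: {used / CONTEXT_WINDOW:.1%}")  # 5.5%
```

Even loading ten full pages per query leaves the vast majority of the window free, which is why deterministic file loading can replace probabilistic retrieval at this scale.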
The Hidden Costs of RAG Infrastructure
Even when RAG is the right tool, it comes with real costs:
- Chunking is non-trivial. How you split documents affects retrieval quality in ways that are hard to predict. Too short and you lose context. Too long and you lose precision.
- Embeddings drift. If you switch models or update your data, you may need to re-embed everything.
- Retrieval is probabilistic. Vector similarity search returns probably relevant chunks, not definitely relevant ones. This introduces subtle errors that are hard to debug.
- The pipeline has more moving parts. Each additional component is something that can break, go stale, or behave unexpectedly.
- It’s opaque. You can’t easily read a vector database to understand what your agent “knows.”
For enterprise-scale use cases with millions of documents, RAG is the right answer. But for personal knowledge management — meeting notes, research, personal CRM, project documentation — it’s often overkill.
How the LLM Wiki Pattern Works in Practice
The basic setup is deliberately simple. Here’s what a functional implementation looks like.
File Structure
A typical LLM Wiki directory might look like this:
```
/wiki
  /people
    andrej-karpathy.md
    sam-altman.md
  /projects
    project-alpha.md
    project-beta.md
  /concepts
    transformer-architecture.md
    rag-vs-wiki.md
  /meetings
    2025-01-15-q1-planning.md
  index.md
```
Each file is self-contained and written in plain Markdown. The index.md serves as a table of contents that the agent reads first to orient itself.
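A sketch of how an agent-side script might enumerate such a directory; the layout and `pathlib` usage here are one reasonable approach, not a prescribed one:

```python
from pathlib import Path

def list_wiki_pages(root: str) -> dict[str, list[str]]:
    """Group Markdown pages by their top-level wiki category
    (e.g. 'people', 'projects'); top-level files go under 'root'."""
    pages: dict[str, list[str]] = {}
    base = Path(root)
    for path in sorted(base.rglob("*.md")):
        category = path.parent.name if path.parent != base else "root"
        pages.setdefault(category, []).append(path.name)
    return pages
```

Because the whole knowledge base is just files on disk, this kind of inventory is a few lines of standard-library code, with no service to run or schema to maintain.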
What Goes in Each File
Good wiki pages have a consistent structure:
- Title and one-line summary at the top
- Key facts or bullet points — the most important information, scannable
- Longer narrative context below if needed
- Cross-references to related pages using [[page-name]] or standard Markdown links
- Last updated date — helps the agent reason about information freshness
For example, a people file might look like:
```
# Andrej Karpathy

Former director of AI at Tesla, founding member of OpenAI.
Currently: independent researcher and educator.

## Key Facts
- Created the micrograd and nanoGPT teaching projects
- Advocates for simple, understandable AI systems
- Known for preferring plain text over complex pipelines

## Context
Met at [event] on [date]. Discussed [topic].
Follow-up: [action item].

Last updated: 2025-01-10
```
How the Agent Uses It
The agent workflow is straightforward:
1. Read the index to understand what's available
2. Load relevant files based on the user's query or task
3. Synthesize and respond using the loaded context
4. Write updates back to the appropriate files after learning new information
The write-back step is what makes this a living knowledge base rather than a static reference. After a meeting, the agent can update the relevant people and project files. After a research session, it can add new concept pages. The knowledge base grows and stays current without manual maintenance.
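The steps above can be sketched in a few lines of Python. The `llm` callable and the page-selection logic are placeholders for whatever model and heuristic you actually use:

```python
from pathlib import Path

def answer(root: Path, query: str, relevant: list[str], llm) -> str:
    """One turn of the loop: orient via the index, load the chosen
    pages, and synthesize. `llm` is any callable(prompt) -> str and
    `relevant` is the list of page paths picked from the index --
    both are placeholders for your real model and selection logic."""
    index = (root / "index.md").read_text()                        # orient
    pages = "\n\n".join((root / p).read_text() for p in relevant)  # load
    prompt = f"Index:\n{index}\n\nPages:\n{pages}\n\nQuestion: {query}"
    return llm(prompt)                                             # respond

def write_back(root: Path, page: str, note: str) -> None:
    """Append what was learned so the wiki stays current."""
    with (root / page).open("a", encoding="utf-8") as f:
        f.write(f"\n{note}\n")
```

Note that retrieval here is just file reads: deterministic, inspectable, and trivially debuggable compared to a vector pipeline.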
LLM Wiki vs. RAG: When Each Makes Sense
Neither approach is universally better. The right choice depends on the scale and nature of your knowledge.
| Factor | LLM Wiki Pattern | RAG |
|---|---|---|
| Document count | Up to ~500–1,000 pages | 1,000+ documents |
| Context window fit | Selective loading works | Too large for context |
| Update frequency | Frequent writes needed | Mostly static corpus |
| Retrieval precision | Deterministic (file-level) | Probabilistic (chunk-level) |
| Setup complexity | Low — just files | High — DB, embeddings, pipeline |
| Human readability | High | Low (stored as vectors) |
| Debugging | Easy | Difficult |
For a personal assistant, research companion, or team knowledge base covering hundreds of documents, the LLM Wiki pattern is almost certainly the simpler and more reliable choice.
For a company trying to make a 10-million-document corpus searchable, RAG (or a hybrid approach) remains necessary.
The Hybrid Middle Ground
Some implementations combine both approaches. The wiki files serve as structured, high-priority knowledge that always loads, while a RAG layer handles larger archives when needed. The agent checks the wiki first, then falls back to vector search only if the wiki doesn’t contain what it needs.
This preserves the clarity and reliability of the wiki pattern while extending the reach to larger corpora.
Step-by-Step: Building Your Own LLM Wiki
Here’s a practical guide to building a personal LLM Wiki from scratch.
Step 1: Define Your Knowledge Domains
Before creating any files, decide what categories of knowledge you need. Common domains for personal use:
- People — contacts, colleagues, collaborators
- Projects — ongoing work, status, decisions
- Concepts — ideas, frameworks, research topics
- Meetings — notes, action items, decisions
- Resources — tools, references, bookmarks
Keep the structure flat. Deep hierarchies slow the agent down and make the wiki harder to maintain.
Step 2: Create Your Index
The index is the agent’s entry point. It should list all available pages with one-line descriptions:
```
# Knowledge Base Index

## People
- andrej-karpathy.md — Researcher, educator, AI practitioner
- [other people]

## Projects
- project-alpha.md — Q1 product initiative, owner: [name]

## Concepts
- rag-vs-wiki.md — Comparison of retrieval approaches
```
An agent given this index can decide which files to load without reading everything.
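If you keep the index format consistent, the agent (or a helper script) can also parse it mechanically. A small sketch, assuming the "- page.md — description" convention shown above:

```python
import re

def parse_index(index_text: str) -> dict[str, str]:
    """Map each listed page to its one-line description, assuming the
    '- page.md — description' line convention used in the index."""
    entries: dict[str, str] = {}
    for line in index_text.splitlines():
        m = re.match(r"\s*-\s*(\S+\.md)\s*[—-]\s*(.+)", line)
        if m:
            entries[m.group(1)] = m.group(2).strip()
    return entries
```

This is another payoff of consistency: the same file doubles as a human table of contents and a machine-readable routing table.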
Step 3: Establish a Writing Convention
Consistency matters more than perfection. Pick a simple template for each domain and stick to it. The more predictable the format, the easier it is for the agent to parse.
For concepts:
```
# [Concept Name]

[One-line definition]

## How It Works
[Core explanation]

## When to Use It
[Practical guidance]

## Related
- [[related-concept-1]]
- [[related-concept-2]]
```
Step 4: Build the Agent Logic
The agent needs three capabilities:
- Read files — load individual wiki pages or scan the index
- Search content — basic text search across files (not vector search)
- Write/update files — append to or rewrite pages
For a basic implementation, you don’t need anything exotic. A system prompt that instructs the agent to read the index first, load relevant pages before responding, and write updates after completing tasks is enough to get started.
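As an illustration, here is what those three capabilities might look like as tool definitions for a tool-calling model. The names and JSON shape are hypothetical, loosely following the common OpenAI-style function format; any agent framework with read, search, and write tools works the same way:

```python
# Hypothetical tool schema for a tool-calling model. The names and
# JSON shape below are illustrative, not tied to a specific framework.
TOOLS = [
    {"name": "read_page",
     "description": "Return the full text of one wiki page",
     "parameters": {"type": "object",
                    "properties": {"path": {"type": "string"}},
                    "required": ["path"]}},
    {"name": "search_pages",
     "description": "Plain-text search across all pages; returns matching paths",
     "parameters": {"type": "object",
                    "properties": {"query": {"type": "string"}},
                    "required": ["query"]}},
    {"name": "write_page",
     "description": "Create or overwrite a wiki page with new content",
     "parameters": {"type": "object",
                    "properties": {"path": {"type": "string"},
                                   "content": {"type": "string"}},
                    "required": ["path", "content"]}},
]
```

Three tools and a system prompt are the entire agent surface; everything else is the model deciding which files to read and when to write.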
Step 5: Build a Capture Habit
A knowledge base is only as good as what goes into it. Build lightweight capture routines:
- After each meeting, ask the agent to update relevant people and project files
- When you learn something worth keeping, add it to the appropriate concept page
- Weekly, run a “maintenance pass” where the agent reviews and de-duplicates
The goal is a system that grows without becoming a burden.
How to Build an LLM Wiki Agent With MindStudio
If you want to run this pattern as an actual working agent — not just a local script — MindStudio is a practical starting point.
MindStudio is a no-code platform for building AI agents and automated workflows. You can connect it to a file storage system (Google Drive, Notion, Airtable, or a custom data store), define the read/write logic visually, and deploy an agent that maintains your LLM Wiki without writing server code.
A few ways to set this up:
- Google Drive integration — Store your wiki files in a Drive folder. The agent reads and writes files using MindStudio’s built-in Google Workspace integration. No database required.
- Notion as your wiki layer — Use Notion pages as wiki entries. MindStudio’s Notion integration lets the agent read page content and append updates after each session.
- Background agents on a schedule — Run a weekly maintenance agent that reviews your wiki, identifies stale pages, and prompts you to update them.
The build typically takes under an hour for a basic version. MindStudio supports 200+ AI models out of the box, so you can run the same wiki agent on Claude, GPT-4o, or Gemini depending on what you’re using it for.
You can try MindStudio free at mindstudio.ai — no credit card required for the basic tier.
Frequently Asked Questions
What is the Karpathy LLM Wiki pattern?
It’s a method for building an AI-accessible knowledge base using plain text or Markdown files instead of a vector database. The pattern draws on Andrej Karpathy’s philosophy of favoring simple, readable systems over complex pipelines. An AI agent reads and writes these files directly, using them as persistent memory across sessions.
Is the LLM Wiki pattern better than RAG?
For personal or small-team knowledge bases (up to a few hundred documents), the LLM Wiki pattern is generally simpler, more reliable, and easier to debug than RAG. For large-scale enterprise document search with thousands or millions of files, RAG remains the more practical choice. The right answer depends on scale and use case.
How large can an LLM Wiki get before performance degrades?
This depends on your model’s context window and how you structure the agent’s loading logic. With selective file loading (read the index first, then load relevant pages), you can scale to several hundred pages without hitting context limits. For larger wikis, a hybrid approach — wiki for structured knowledge, RAG for archives — works well.
Can an AI agent write back to the wiki automatically?
Yes. This is one of the key advantages of the pattern. Because wiki files are plain text, any agent with file write access can update them. After a meeting or research session, the agent can append new information, update existing facts, or create new pages. Over time, the knowledge base becomes a growing record of everything the agent has learned.
Do I need to know how to code to build an LLM Wiki?
Not necessarily. The file structure itself requires no coding. Building the agent that reads and writes those files can be done with a no-code tool like MindStudio, or with a simple Python script using the file system and an LLM API. The conceptual complexity is low — the main work is in defining your file structure and writing conventions.
What’s the difference between an LLM Wiki and a traditional personal knowledge management tool?
Traditional PKM tools (Obsidian, Notion, Roam) are designed for human reading and navigation. An LLM Wiki is specifically structured for machine reading — with consistent formatting, explicit cross-references, and information density that helps an AI parse content quickly. In practice, the two overlap significantly, and tools like Obsidian (which uses plain Markdown files) can double as an LLM Wiki with minimal modification.
Key Takeaways
- The Karpathy LLM Wiki pattern replaces vector databases with plain text files, making agent-accessible knowledge bases simpler and more maintainable for personal or small-team use.
- RAG adds real infrastructure overhead — chunking, embeddings, probabilistic retrieval — that’s often unnecessary when modern context windows can handle selective file loading directly.
- A functional LLM Wiki needs a clear index, consistent page templates, and agent logic that can read, search, and write files.
- The pattern scales to several hundred pages before needing augmentation; beyond that, a hybrid wiki-plus-RAG approach handles larger corpora.
- Tools like MindStudio let you build and deploy this kind of agent without writing backend code — connecting to Google Drive, Notion, or other file sources through pre-built integrations.
If you’re building AI workflows and want persistent memory that actually works — and that you can read and understand without a vector database console — the LLM Wiki pattern is worth trying. MindStudio makes it straightforward to build the agent layer on top of it.