How to Build an AI Second Brain Knowledge Base: Step-by-Step with Claude

The Problem With How We Store Information Today

Most people are drowning in information they can’t find when they need it. Notes scattered across Notion, Obsidian, Google Docs, email threads, Slack messages, browser bookmarks — and none of it talks to each other.

Building an AI second brain knowledge base using Claude changes that. Instead of a passive archive you have to search manually, you get a system that processes new information automatically, surfaces relevant context on demand, and stays current through scheduled updates. This guide walks you through how to build one from scratch — step by step.

What an AI Second Brain Actually Is

The concept of a “second brain” comes from productivity researcher Tiago Forte, whose framework describes an external system for capturing, organizing, and retrieving knowledge so your biological brain can focus on thinking rather than remembering.

An AI second brain takes that idea further. Instead of manually tagging and filing notes, you connect an AI model — in this case, Claude — to a knowledge store. The AI handles processing, summarization, and retrieval. You just feed it information and ask it questions.

A well-built system has four components:

Ingestion — pulling in documents, articles, notes, emails, or any text
Processing — Claude summarizes, tags, and extracts key ideas
Storage — structured records in a database (Airtable, Notion, Supabase, etc.)
Retrieval — semantic or keyword search that Claude uses to answer your questions

The magic happens when all four are automated. You stop managing the system and start using it.

Prerequisites: What You Need Before You Start

Before building, get these in place:

An Anthropic API key or Claude access through a platform. You’ll need Claude to handle the processing and retrieval logic. You can use the Anthropic API directly, or use a platform like MindStudio that gives you access to Claude (and 200+ other models) without managing API keys yourself.

A database or knowledge store. Your processed memories need to live somewhere searchable. Good options:

Airtable — easy to set up, good API, works well for structured records
Notion — familiar for most users, has a decent API
Supabase — better for vector search if you want semantic retrieval
Google Sheets — lowest barrier, but less scalable

A data source. What are you feeding the system? Common sources:

Saved articles or web pages
Meeting notes
Email digests
Research documents
Personal journal entries
Slack or Discord exports

An automation layer. Unless you’re writing custom code, you’ll need a way to trigger workflows: when a new document arrives, process it; at 9 AM every day, ingest the overnight queue. Zapier, Make, and MindStudio all work here.

Step 1: Design Your Knowledge Schema

Before writing any prompts or building any automations, decide what shape your knowledge records should take. This is the most important step — a bad schema leads to a system you can’t query effectively.

Define Your Record Fields

A good knowledge record includes:

Title — short name for the piece of content
Source — where it came from (URL, document name, email subject)
Date added — when it entered the system
Raw content — the original text (or a cleaned version)
Summary — a 2–5 sentence Claude-generated summary
Key points — 3–7 bullet points Claude extracted
Tags — topic labels Claude assigned (e.g., “AI”, “productivity”, “research”)
Category — a top-level bucket (e.g., “Work”, “Personal”, “Research”)
Related records — links to other items Claude found similar
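
As a rough sketch, the fields above could be modelled as a Python dataclass. The class name `KnowledgeRecord`, the field names, and the default category are illustrative assumptions, not something any particular database requires; map them to whatever column names your store uses.

```python
from dataclasses import dataclass, field, asdict
from datetime import date


@dataclass
class KnowledgeRecord:
    """One item in the knowledge base (hypothetical field names)."""
    title: str
    source: str                      # URL, document name, or email subject
    date_added: date
    raw_content: str
    summary: str = ""                # filled in by Claude during processing
    key_points: list[str] = field(default_factory=list)
    tags: list[str] = field(default_factory=list)
    category: str = "Reference"      # assumed default bucket
    related: list[str] = field(default_factory=list)  # titles or IDs of similar records


record = KnowledgeRecord(
    title="Notes on retrieval-augmented generation",
    source="https://example.com/rag-notes",
    date_added=date(2024, 5, 1),
    raw_content="(original text goes here)",
    tags=["ai", "rag"],
)
row = asdict(record)  # plain dict, ready to map onto database columns
```

Fields that Claude fills in later start empty, so a record can be created at ingestion time and completed during processing.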

Keep It Queryable

The whole point is retrieval. Every field should answer a question someone might ask. “What did I save about machine learning last month?” needs date and tags to work. “What are the main arguments in this paper?” needs key points.
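
To make that concrete, here is a minimal sketch of the first question expressed as a filter over in-memory records. The dictionary keys (`tags`, `date_added`) and the helper name `saved_about` are assumptions for illustration; in practice the same filter would be written in your database's own query or formula language.

```python
from datetime import date


def saved_about(records, tag, start, end):
    """Return records carrying `tag` whose date_added falls within [start, end]."""
    return [
        r for r in records
        if tag in r["tags"] and start <= r["date_added"] <= end
    ]


# Sample data, invented for the example.
records = [
    {"title": "Intro to gradient descent", "tags": ["machine-learning"], "date_added": date(2024, 4, 12)},
    {"title": "Quarterly planning notes", "tags": ["work"], "date_added": date(2024, 4, 20)},
    {"title": "Transformers explained", "tags": ["machine-learning"], "date_added": date(2024, 2, 3)},
]

hits = saved_about(records, "machine-learning", date(2024, 4, 1), date(2024, 4, 30))
titles = [r["title"] for r in hits]
```

Without a date field or tags on each record, neither half of that filter could be written.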

Don’t over-engineer the schema at first. Start with 6–8 fields and add more once you understand your own query patterns.

Step 2: Build the Ingestion Pipeline

Ingestion is how raw content gets into the processing queue. The right setup depends on your source.

Option A: Manual Submission via a Form

Build a simple form (Google Forms, Typeform, or a custom one) where you paste text or a URL. A form submission triggers the processing workflow. This is the fastest option to set up and works for any content you can paste into a text field.

Option B: Email-Triggered Ingestion

Forward interesting articles, newsletters, or notes to a dedicated email address. An email-triggered agent reads the message and kicks off processing. This is great if email is already part of your reading workflow.

Option C: Browser Extension or Save-to-Queue

Some automation platforms support browser extensions or bookmarklets. Highlight text on any page and send it directly to your queue with one click.

Option D: Scheduled Batch Ingestion

Pull from an RSS feed, a Pocket/Instapaper account, or a shared Google Drive folder on a schedule, for example every hour. Everything in the queue gets processed in one batch.

For most people, a combination works best: manual submission for ad hoc items, plus one scheduled source for ongoing reading.

Step 3: Write the Claude Processing Prompt

This is the core of the system. When a new item arrives, Claude reads the content and returns structured data you store in your database.

Sample Processing Prompt

You are a knowledge management assistant. Your job is to process documents and extract structured information for storage in a knowledge base.

Given the following content, return a JSON object with these fields:
- title: a clear, descriptive title (max 10 words)
- summary: 2-4 sentences capturing the main idea
- key_points: an array of 3-7 specific, actionable or informational bullets
- tags: an array of 3-6 topic tags (lowercase, no spaces)
- category: one of [Research, Work, Personal, Reference, Ideas]
- relevance_score: 1-10 rating of how useful this is likely to be

Content:
{{content}}

Return only valid JSON. No explanation.

Why This Prompt Works

The explicit JSON request makes output parseable. The field definitions keep Claude’s output consistent. The relevance_score gives you a filter — you might decide not to store anything below 4.

Adjust the categories to match your actual life. If you’re a software engineer, maybe categories are Engineering, Management, Learning, Reference.
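
If you handle the model's reply in code rather than in a no-code tool, the parsing and filtering step might look like the sketch below. The function name `parse_processed` and the hand-written sample reply are assumptions; the category list and the cutoff of 4 come from the prompt and discussion above.

```python
import json

ALLOWED_CATEGORIES = {"Research", "Work", "Personal", "Reference", "Ideas"}
REQUIRED_FIELDS = {"title", "summary", "key_points", "tags", "category", "relevance_score"}


def parse_processed(raw: str, min_relevance: int = 4):
    """Parse the model's JSON reply; return None if the item should be skipped."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed output
    missing = REQUIRED_FIELDS - set(data)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if data["category"] not in ALLOWED_CATEGORIES:
        raise ValueError(f"unexpected category: {data['category']!r}")
    if data["relevance_score"] < min_relevance:
        return None  # below the cutoff, do not store
    return data


# A hand-written reply standing in for real model output.
reply = json.dumps({
    "title": "Spaced repetition basics",
    "summary": "Explains why reviewing at increasing intervals improves recall.",
    "key_points": ["Review just before forgetting", "Intervals grow over time", "Short cards work best"],
    "tags": ["learning", "memory", "productivity"],
    "category": "Research",
    "relevance_score": 7,
})
record = parse_processed(reply)
```

Validating the reply before storing it catches the occasional malformed or off-schema response early, instead of letting it land in the database.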

Handling Long Documents

Claude has a large context window, but very long documents (like research papers or book chapters) can be expensive to process in full. For content over ~5,000 words, consider:

Splitting into chunks and processing each separately
Extracting the first 2,000 words plus any abstract or conclusion
Using a cheaper model for initial summarization, then Claude for final synthesis
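
The first two options can be sketched with simple word-based splitting. This is a deliberately naive approach that ignores paragraph and sentence boundaries, and the function names and the 2,000-word default are assumptions for illustration.

```python
def split_into_chunks(text: str, max_words: int = 2000) -> list[str]:
    """Split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def head_only(text: str, max_words: int = 2000) -> str:
    """Keep just the first max_words words of a document."""
    return " ".join(text.split()[:max_words])


document = "word " * 5500          # stand-in for a 5,500-word paper
chunks = split_into_chunks(document)
sizes = [len(c.split()) for c in chunks]   # word count of each chunk
```

Each chunk can then go through the processing prompt separately, with a final pass that merges the per-chunk summaries into one record.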

Step 4: Store Processed Records

Once Claude returns structured data, write it to your database. This step is mostly about mapping fields.

Writing to Airtable

If you’re using Airtable, your automation workflow (Zapier, Make, or MindStudio) should:

Parse the JSON Claude returns
Map each field to the corresponding Airtable column
Create a new record
Optionally: check for duplicates before writing (match on URL or title)
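
For readers who script this step instead of using a native integration, the mapping and duplicate check could look roughly like this. The column names in `FIELD_MAP` are hypothetical, and a plain Python list stands in for the real table so the logic is visible without any API calls.

```python
# Hypothetical mapping from JSON keys to column names in your table.
FIELD_MAP = {
    "title": "Title",
    "summary": "Summary",
    "key_points": "Key points",
    "tags": "Tags",
    "category": "Category",
    "relevance_score": "Relevance",
}


def to_row(processed: dict, source_url: str) -> dict:
    """Rename JSON keys to column names and attach the source URL."""
    row = {column: processed[key] for key, column in FIELD_MAP.items()}
    row["Source"] = source_url
    return row


def insert_if_new(table: list, row: dict) -> bool:
    """Append row unless a record with the same Source already exists."""
    if any(existing["Source"] == row["Source"] for existing in table):
        return False
    table.append(row)
    return True


table = []  # stands in for the real database table
processed = {
    "title": "Spaced repetition basics",
    "summary": "Why spaced review improves recall.",
    "key_points": ["Review before forgetting"],
    "tags": ["learning"],
    "category": "Research",
    "relevance_score": 7,
}
first = insert_if_new(table, to_row(processed, "https://example.com/srs"))
second = insert_if_new(table, to_row(processed, "https://example.com/srs"))
```

Matching on the source URL is usually more reliable than matching on title, since Claude may generate slightly different titles for the same content.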

Airtable’s API is well-documented and most automation platforms have native Airtable integrations, so no custom code is needed.

Writing to Notion

Notion’s API requires a bit more setup — you need to specify your database ID and match field types (text, multi-select for tags, number for relevance score). But the result is a knowledge base you can browse directly in Notion alongside your other work.

Adding Vector Embeddings (Optional)

If you want true semantic search — finding records by meaning rather than exact keywords — you’ll need to generate embeddings for each record and store them in a vector database like Pinecone, Weaviate, or Supabase’s pgvector extension.

This adds complexity. For most people, keyword search across well-tagged records is good enough to start. Add vector search later if you find yourself unable to surface what you need.

Step 5: Build the Memory Search Function

Storage is useless without retrieval. The memory search function lets you ask a question and get back relevant records from your knowledge base.

The Retrieval-Augmented Generation (RAG) Pattern

This is the standard approach for AI knowledge bases:

User asks a question
System searches the database for relevant records
Relevant records are injected into Claude’s context as “memory”
Claude answers the question using both its training and the retrieved records
Claude cites which records it drew from

This pattern is a common design for AI systems that need to answer from a private store of documents rather than from training data alone.

Building the Search Step

For keyword/tag search:

Extract keywords from the user’s query
Query your database for records where tags or summary contain those keywords
Return the top 5–10 most recent or highest-relevance matches
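
A minimal sketch of the keyword path, assuming records are dictionaries with `tags`, `summary`, and `relevance_score` keys. The stopword list and the ranking rule (keyword overlap first, then relevance score) are illustrative choices, not the only reasonable ones.

```python
import re

# Small, assumed stopword list; extend it for your own phrasing.
STOPWORDS = {"a", "an", "the", "what", "did", "i", "about", "on", "in", "of", "to", "do", "know", "save"}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_search(records, query, limit=5):
    """Rank records by keyword overlap with the query, then by relevance score."""
    keywords = tokenize(query) - STOPWORDS
    scored = []
    for r in records:
        haystack = set(r["tags"]) | tokenize(r["summary"])
        overlap = len(keywords & haystack)
        if overlap:
            scored.append((overlap, r["relevance_score"], r))
    scored.sort(key=lambda t: (t[0], t[1]), reverse=True)
    return [r for _, _, r in scored[:limit]]


records = [
    {"title": "Zettelkasten overview", "tags": ["notes", "productivity"],
     "summary": "How linked notes support long-term thinking.", "relevance_score": 8},
    {"title": "Sourdough starter", "tags": ["cooking"],
     "summary": "Feeding schedule for a starter.", "relevance_score": 5},
    {"title": "Weekly review checklist", "tags": ["productivity"],
     "summary": "A checklist for reviewing notes and tasks.", "relevance_score": 6},
]
results = keyword_search(records, "What did I save about productivity notes?")
titles = [r["title"] for r in results]
```

Both matching records share two keywords with the query, so the tie is broken by relevance score.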

For semantic search:

Generate an embedding from the user’s query
Find the nearest neighbor records in your vector store
Return the closest matches by cosine similarity
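
The ranking step for semantic search can be sketched in plain Python. The three-dimensional vectors below are toy values invented for the example; real embeddings come from an embedding model and have hundreds or thousands of dimensions, and a vector database would do this comparison for you at scale.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def nearest(query_vec, store, k=3):
    """store is a list of (title, embedding) pairs; return the top-k titles."""
    ranked = sorted(
        store,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [title for title, _ in ranked[:k]]


# Toy vectors standing in for real embeddings.
store = [
    ("Note on neural networks", [0.9, 0.1, 0.0]),
    ("Bread recipe", [0.0, 0.2, 0.9]),
    ("Note on optimizers", [0.8, 0.3, 0.1]),
]
top = nearest([1.0, 0.0, 0.0], store, k=2)
```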

The Retrieval Prompt

Once you have retrieved records, inject them into Claude with a prompt like:

You are a personal knowledge assistant. Answer the user's question using 
the relevant memories retrieved from their knowledge base. 

If the memories don't contain enough information to answer the question, 
say so clearly — don't make things up.

Retrieved memories:
{{retrieved_records}}

User question: {{question}}

When referencing specific memories, cite them by title.

This prompt keeps Claude grounded in what you’ve actually stored, which reduces the risk of hallucination but does not eliminate it.
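
One way to fill in that prompt from code is sketched below. The helper names and the record keys (`title`, `source`, `summary`) are assumptions. The placeholders are written as `{memories}` and `{question}` here because Python's `str.format` treats double braces as escapes, and the wording is lightly adapted from the prompt above.

```python
RETRIEVAL_TEMPLATE = (
    "You are a personal knowledge assistant. Answer the user's question using\n"
    "the relevant memories retrieved from their knowledge base.\n\n"
    "If the memories don't contain enough information to answer the question,\n"
    "say so clearly and don't make things up.\n\n"
    "Retrieved memories:\n{memories}\n\n"
    "User question: {question}\n\n"
    "When referencing specific memories, cite them by title."
)


def format_memories(records):
    """Render each record as a titled block so the model can cite it by title."""
    return "\n\n".join(
        f"[{r['title']}] ({r['source']})\n{r['summary']}" for r in records
    )


def build_retrieval_prompt(question, records):
    """Insert the retrieved records and the user's question into the template."""
    return RETRIEVAL_TEMPLATE.format(
        memories=format_memories(records), question=question
    )


records = [
    {"title": "Zettelkasten overview", "source": "https://example.com/zk",
     "summary": "How linked notes support long-term thinking."},
    {"title": "Weekly review checklist", "source": "notes/review.md",
     "summary": "A checklist for reviewing notes and tasks."},
]
prompt = build_retrieval_prompt("How should I link my notes?", records)
```

Putting each title in brackets gives Claude an unambiguous label to cite, which makes it easier to trace an answer back to a stored record.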

Step 6: Set Up Scheduled Hourly Updates

A static knowledge base gets stale. The goal is a system that ingests and processes new information continuously — ideally without you having to trigger it manually.

What Hourly Updates Look Like

A scheduled workflow runs every hour and:

Checks a queue (email inbox, RSS feed, Google Drive folder, etc.) for new items
Processes any new items through Claude
Writes processed records to the database
Optionally: sends you a digest of what was added

This means your knowledge base reflects everything you’ve read, saved, or received — up to an hour ago.

Setting Up a Scheduled Workflow

Most automation platforms let you trigger a workflow on a cron schedule. In MindStudio, you’d build an autonomous background agent that runs on your chosen interval. In Zapier or Make, you’d use a scheduled trigger.

The workflow steps are:

Check queue — query your inbox or feed source for items added in the last hour
Filter duplicates — skip items already in the database (match by URL or source ID)
Loop through new items — process each one through Claude
Write to database — store the structured output
Log results — record how many items were processed, flag any failures
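
Those five steps can be sketched as a single function that a scheduler calls once per run. Everything here is a stand-in: `queue` is a list of new items, `seen_sources` is the set of sources already in the database, `process` is whatever function calls Claude, and `store` is a list representing the database table.

```python
def run_once(queue, seen_sources, process, store):
    """One scheduled pass: dedupe, process, store, and return run statistics."""
    stats = {"processed": 0, "skipped": 0, "failed": 0}
    for item in queue:
        if item["source"] in seen_sources:      # filter duplicates
            stats["skipped"] += 1
            continue
        try:
            record = process(item["content"])   # the Claude call in a real system
        except Exception:
            stats["failed"] += 1                # counted here, logged in a real system
            continue
        store.append({**record, "source": item["source"]})
        seen_sources.add(item["source"])
        stats["processed"] += 1
    return stats


def fake_process(content):
    """Placeholder for the model call; fails on empty input."""
    if not content:
        raise ValueError("empty content")
    return {"summary": content[:40]}


queue = [
    {"source": "https://example.com/a", "content": "An article about note-taking."},
    {"source": "https://example.com/b", "content": "Already stored last week."},
    {"source": "https://example.com/c", "content": ""},
]
store = []
stats = run_once(queue, {"https://example.com/b"}, fake_process, store)
```

Returning the counts from each run gives you the raw numbers for a digest or a health check.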

Handling Failures Gracefully

Scheduled workflows fail silently if you’re not careful. Build in:

Error logging to a separate “failed queue” table
Retry logic for transient failures (rate limits, timeouts)
A weekly email digest of processing stats so you notice if the system stops working
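
A minimal sketch of retry logic with exponential backoff and a failed queue follows. `TransientError` is a hypothetical exception class standing in for whatever rate-limit or timeout error your client library raises, and the `sleep` parameter is injectable so the example runs without actually waiting.

```python
import time


class TransientError(Exception):
    """Hypothetical stand-in for a rate-limit or timeout error."""


def with_retries(func, item, failed_queue, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call func(item), retrying transient failures with exponential backoff."""
    for attempt in range(attempts):
        try:
            return func(item)
        except TransientError as exc:
            if attempt == attempts - 1:
                # Out of retries: record the failure instead of losing it.
                failed_queue.append({"item": item, "error": str(exc)})
                return None
            sleep(base_delay * 2 ** attempt)   # waits 1s, 2s, 4s, ...


calls = {"n": 0}


def flaky(item):
    """Fails twice, then succeeds, to exercise the retry path."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("rate limited")
    return f"processed {item}"


failed = []
delays = []   # captures the backoff delays instead of sleeping
result = with_retries(flaky, "article-1", failed, sleep=delays.append)
```

Only errors you know to be transient should be retried; a malformed document will fail the same way every time and belongs in the failed queue immediately.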

Step 7: Create a Query Interface

The knowledge base needs a front end — some way to ask it questions.

Option A: Chat Interface

A simple chat UI where you type a question and get a response. Most automation platforms let you build this without code. You can also connect directly to Slack or a messaging app, so you query your knowledge base via DM.

Option B: Daily Brief

Instead of on-demand queries, you get a daily email or Slack message summarizing what was added in the last 24 hours — with Claude’s own synthesis of themes and connections it noticed.

Option C: Contextual Suggestions

More advanced: whenever you’re working on a specific topic (drafting a document, in a meeting), the system proactively surfaces relevant knowledge base records. This requires more integration work but is genuinely useful.

Start with Option A. A working chat interface proves the system works and gives you real usage data to decide whether to invest in more.

How MindStudio Makes This Buildable in an Afternoon

Building everything described above from scratch — handling APIs, writing ingestion logic, setting up schedulers, parsing JSON, managing retries — is weeks of engineering work if you’re writing custom code.

MindStudio is a no-code platform built specifically for multi-step AI workflows like this one. You get Claude (and 200+ other models) built in, 1,000+ integrations with tools like Airtable, Notion, Gmail, and Google Drive, and native support for scheduled background agents.

The average MindStudio build takes 15 minutes to an hour. You’d build this second brain as a set of connected agents:

An ingestion agent that reads from your email or RSS feed on a schedule
A processing agent that runs each item through Claude and returns structured JSON
A storage agent that writes to Airtable or Notion
A retrieval agent that handles your queries via a chat UI

Each agent is built visually. You don’t write boilerplate — you describe the logic and connect the steps. The scheduling, retries, and authentication are handled by the platform.

You can try MindStudio free at mindstudio.ai. If you want to see how others have built similar knowledge management workflows, the MindStudio workflow templates page has working examples you can clone and modify.

Common Mistakes to Avoid

Skipping the Schema Step

Jumping straight to automation without defining your record structure is the most common mistake. You’ll end up with inconsistent data that’s hard to query. Spend 30 minutes on the schema first.

Processing Everything

Not every piece of content deserves to be in your second brain. A random Slack message or a low-quality article will just add noise. Use Claude’s relevance score as a filter, and consider only processing items you’ve explicitly saved rather than everything you consume.

Ignoring Maintenance

A knowledge base that grows without pruning becomes its own version of information overload. Set a monthly review: archive or delete records older than 6 months with low relevance scores. Keep the system focused on what’s actually useful to you now.
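
The monthly review can start from a simple filter like the one sketched here. Treating six months as 180 days and "low relevance" as a score below 4 are both assumptions to tune, and the function only selects candidates; it leaves the archive-or-delete decision to you.

```python
from datetime import date, timedelta


def prune_candidates(records, today, max_age_days=180, min_relevance=4):
    """Return records older than max_age_days with a relevance score below min_relevance."""
    cutoff = today - timedelta(days=max_age_days)
    return [
        r for r in records
        if r["date_added"] < cutoff and r["relevance_score"] < min_relevance
    ]


# Sample data, invented for the example.
records = [
    {"title": "Old and low", "date_added": date(2023, 1, 10), "relevance_score": 2},
    {"title": "Old but useful", "date_added": date(2023, 1, 10), "relevance_score": 9},
    {"title": "Recent and low", "date_added": date(2024, 5, 20), "relevance_score": 2},
]
stale = prune_candidates(records, today=date(2024, 6, 1))
stale_titles = [r["title"] for r in stale]
```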

Building Too Much Too Fast

The temptation is to build the full system — vector search, semantic clustering, daily digests, proactive suggestions — before validating the basics. Build the ingestion and simple search first. Use it for two weeks. Then add complexity based on what you actually find missing.

Trusting Claude Too Much

Claude is good at summarization and tagging but will occasionally miss nuance, misclassify a topic, or hallucinate a connection. Review a sample of processed records when you first set up the system. Adjust your prompts if you see consistent errors.

Frequently Asked Questions

What’s the difference between a second brain and a RAG system?

A second brain is the broader concept — an external store for your knowledge and thinking. RAG (retrieval-augmented generation) is the technical mechanism that makes an AI second brain work: retrieve relevant stored content, inject it into the AI’s context, generate a grounded response. Your second brain is the goal; RAG is how Claude accesses it.

Do I need coding skills to build this?

No. The architecture described here can be fully implemented using no-code tools like MindStudio, Zapier, or Make combined with a database like Airtable or Notion. The prompt engineering — writing good instructions for Claude — is the most important skill, and that’s just clear writing. For more advanced setups (vector search, custom embeddings), some technical knowledge helps but isn’t required to get started.

How much does it cost to run an AI knowledge base like this?

Costs break down into three parts: the AI model (Claude API calls), the automation platform, and the database. A typical personal setup — processing 10–30 items per day — might cost $5–$20 per month in Claude API costs depending on document length. Airtable’s free plan handles several thousand records. MindStudio starts free, with paid plans from $20/month that include more model calls. Total cost for a personal setup is often under $30/month.

How do I handle private or sensitive information?

This is a real concern. If your knowledge base includes confidential work documents, personal health information, or financial data, you need to think about where that data lives and who can access it. Use a private database (self-hosted Supabase or a private Notion workspace), review your AI provider’s data retention policies, and consider whether sensitive content should be excluded from the system entirely.

Can I connect Claude to an existing Notion or Obsidian vault?

Yes. Both have APIs or file-based access. For Notion, use the Notion API to read existing pages and process them through the ingestion pipeline. For Obsidian, a vault is already a folder of plain Markdown files, so you can process those files in batches without a separate export step. The processing pipeline described here works on any text regardless of source.
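
For the Obsidian case, a batch pass over a folder of Markdown files might look like the sketch below. The function name `load_vault` is an assumption, and the example builds a throwaway folder so it can run anywhere; point it at your own folder in practice.

```python
import tempfile
from pathlib import Path


def load_vault(vault_dir):
    """Yield (relative_path, text) for every Markdown file under vault_dir."""
    root = Path(vault_dir)
    for path in sorted(root.rglob("*.md")):
        yield path.relative_to(root).as_posix(), path.read_text(encoding="utf-8")


# Build a tiny throwaway vault to demonstrate the traversal.
with tempfile.TemporaryDirectory() as vault:
    (Path(vault) / "ideas").mkdir()
    (Path(vault) / "ideas" / "rag.md").write_text("# RAG\nRetrieve, then generate.", encoding="utf-8")
    (Path(vault) / "inbox.md").write_text("Read later: second brain article.", encoding="utf-8")
    notes = list(load_vault(vault))

sources = [source for source, _ in notes]
```

The relative path works well as the Source field, since it stays stable across runs and doubles as a key for duplicate checks.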

How is this different from just using Claude’s Projects feature?

Claude’s Projects feature lets you upload documents and have Claude reference them in a conversation. That’s useful but limited: you manage the documents manually, there’s no automated ingestion, and the knowledge only grows when you upload something yourself. A custom second brain system is more work to set up but gives you automated continuous ingestion, structured search across potentially thousands of records, scheduled updates, and integration with your existing tools.

Key Takeaways

A working AI second brain has four parts: ingestion, Claude-powered processing, structured storage, and retrieval.
The schema — what fields each record contains — is the most important design decision. Get it right before building anything else.
Hourly scheduled ingestion keeps your knowledge base current without manual effort.
The RAG pattern (retrieve relevant records, inject into Claude’s context, generate a grounded response) is the standard approach for AI knowledge retrieval.
Start simple: manual ingestion and keyword search first. Add complexity only after you’ve validated the basics work.
MindStudio lets you build this entire system without writing code — including the scheduled agents, Claude processing steps, and database integrations.

If you want to skip the setup work and start with a working template, MindStudio is worth exploring. The platform is built for exactly this kind of multi-step AI workflow, and you can get a working version running in an afternoon rather than weeks.