Mark Kashef's Claude Code Hive Mind: SQLite + Telegram Multi-Agent Council on Zero Cloud Cost
Mark Kashef’s entire hive mind — every agent conversation, every scheduled task, every memory, every node in the 3D graph view — runs on a local SQLite database that costs nothing. No Supabase subscription. No cloud vector store. No managed infrastructure. Just a free embedded database sitting on his laptop, and a Telegram interface stitched together with the Anthropic SDK.
That’s the thing worth paying attention to here. The /standup and /discuss slash commands, the Gemini Flash orchestration layer, the Obsidian-style graph view — all of it is built on top of what he calls “a data organization exercise with some layers on top.” The hive mind is the layers. The SQLite database is the foundation.
This post is about how that foundation works, what the architecture actually looks like, and what you’d need to replicate it.
The architecture is simpler than the demo suggests
When you watch Kashef’s demo, the 3D graph view is the thing that grabs attention. Nodes representing agent tasks, edges connecting related work, real-time updates as agents complete jobs. It looks like a purpose-built observability platform.
It isn’t. The 3D view is a frontend built on top of a list view. Kashef is explicit about this: “If the list view is actually working and the list view is operational and all the agents are logging in real time and all of the database is set up, everything else is just additive.” The graph is a visualization layer. The SQLite table is the actual system.
The database stores six things: agent identities, conversations, tasks, memories, scheduled jobs, and hive mind entries (the summaries that power the graph). Every agent writes to the same database. That shared write access is what makes the “hive mind” framing accurate — it’s not metaphorical, it’s a literal shared state.
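The six stores can be sketched as a handful of SQLite tables. This is a hypothetical schema, not Kashef's actual DDL — table and column names are illustrative guesses at what the post describes:

```python
import sqlite3

# Illustrative schema for the six stores; names are assumptions, not
# Kashef's real DDL. Every agent writes to this one database.
SCHEMA = """
CREATE TABLE agents         (id INTEGER PRIMARY KEY, name TEXT, role TEXT);
CREATE TABLE conversations  (id INTEGER PRIMARY KEY, agent_id INTEGER, message TEXT, ts TEXT);
CREATE TABLE tasks          (id INTEGER PRIMARY KEY, agent_id INTEGER, title TEXT, status TEXT);
CREATE TABLE memories       (id INTEGER PRIMARY KEY, agent_id INTEGER, content TEXT, pinned INTEGER DEFAULT 0);
CREATE TABLE scheduled_jobs (id INTEGER PRIMARY KEY, agent_id INTEGER, cron TEXT, command TEXT);
CREATE TABLE hive_entries   (id INTEGER PRIMARY KEY, agent_id INTEGER, summary TEXT, ts TEXT);
"""

conn = sqlite3.connect(":memory:")  # in production: a plain local file
conn.executescript(SCHEMA)
tables = {r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table'")}
print(sorted(tables))
```

The point of the sketch is how small the foundation is: one file on disk, six tables, shared write access.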
Each agent is defined by two files: a claude.md describing its role, skills, and rules, and a YAML configuration file with its metadata. That’s it. The sophistication comes from what those files reference, not from the files themselves.
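A minimal sketch of that two-file definition, assuming flat key-value YAML — the actual contents of Kashef's claude.md and YAML files aren't public, so both file bodies below are invented for illustration:

```python
from dataclasses import dataclass

# Hypothetical stand-ins for an agent's two files; the keys and prose
# are guesses at what a comms agent's definition might contain.
CLAUDE_MD = """\
# Comms Agent
Role: handles outbound email and Telegram messages.
Rules: never send without a recipient; log every action to the hive DB.
"""

AGENT_YAML = """\
name: comms
model: claude
telegram_bot_token_env: COMMS_BOT_TOKEN
"""

def parse_flat_yaml(text: str) -> dict:
    """Minimal 'key: value' parser -- enough for a flat config file."""
    out = {}
    for line in text.splitlines():
        if ":" in line:
            key, _, value = line.partition(":")
            out[key.strip()] = value.strip()
    return out

@dataclass
class Agent:
    meta: dict          # from the YAML file
    system_prompt: str  # from claude.md

agent = Agent(meta=parse_flat_yaml(AGENT_YAML), system_prompt=CLAUDE_MD)
print(agent.meta["name"])
```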
Why Gemini Flash handles orchestration instead of Claude
One of the more interesting engineering decisions in this system is where Kashef deliberately avoids using Claude tokens.
The auto-assign feature — which looks at a new task and decides which agent should handle it — runs on Gemini 3 Flash, not Claude. The memory insight generation, which scans conversation history to suggest new agents and identify overloaded ones, also runs on Gemini 3 Flash. The reasoning is straightforward: these are classification and summarization tasks with large inputs. Gemini’s context window is enormous, inference is cheap, and burning Claude tokens on “which agent should send this email” is wasteful.
The system prompt for auto-assign is dynamic. It reads the current agent roster and their descriptions, then asks Gemini to classify the task. When Kashef demos it with a trivial task (“send Mark a message saying hi”), Gemini correctly routes it to the comms agent in seconds. The classification result gets written to the task queue, the relevant agent gets pinged via Telegram, and execution starts.
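The prompt-building step can be sketched like this. The roster dictionary and the `call_gemini_flash` placeholder are assumptions — the point is the dynamic prompt shape, not any particular Gemini SDK:

```python
# Hypothetical roster; in the real system this would be read from the
# agents table at call time so new agents are routable immediately.
AGENTS = {
    "comms":   "Sends email and Telegram messages on Mark's behalf.",
    "ads":     "Monitors and edits Meta Ads campaigns.",
    "finance": "Tracks invoices and spend reports.",
}

def build_routing_prompt(task: str, roster: dict) -> str:
    """Dynamic auto-assign prompt: roster descriptions plus the task."""
    lines = [f"- {name}: {desc}" for name, desc in roster.items()]
    return (
        "You are a task router. Given the agent roster below, reply with "
        "the single agent name best suited to the task.\n\n"
        "Roster:\n" + "\n".join(lines) +
        f"\n\nTask: {task}\nAgent:"
    )

prompt = build_routing_prompt("send Mark a message saying hi", AGENTS)
# call_gemini_flash(prompt) would return e.g. "comms"; that result is
# written to the task queue before the agent is pinged on Telegram.
print(prompt)
```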
This is a pattern worth stealing: use your expensive frontier model for the work that requires it, and use cheap fast models for routing, classification, and summarization. The multi-agent workflow patterns that Claude Code supports generally assume you’ll use one model throughout — Kashef’s system is more deliberate about cost allocation.
The Telegram bridge and what it actually enables
The interface layer is a bridge built with the Anthropic SDK that connects Claude Code to Telegram. Kashef started with this because he wanted to use his existing Claude Code subscription from his phone without rebuilding anything. The bridge gives every agent its own Telegram bot token, which means each agent has its own chat thread.
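Per-agent delivery reduces to calling Telegram's plain HTTP Bot API with that agent's own token. This sketch uses the documented `sendMessage` method with the standard library rather than any particular SDK; the token value is obviously a placeholder:

```python
import json
import urllib.request

def build_send_request(bot_token: str, chat_id: int, text: str):
    """One bot token per agent means one chat thread per agent."""
    url = f"https://api.telegram.org/bot{bot_token}/sendMessage"
    payload = json.dumps({"chat_id": chat_id, "text": text}).encode()
    return urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"})

def send(bot_token: str, chat_id: int, text: str) -> None:
    # Telegram echoes the sent message back as JSON on success.
    with urllib.request.urlopen(build_send_request(bot_token, chat_id, text)) as r:
        r.read()

req = build_send_request("123:PLACEHOLDER", 42, "comms agent online")
print(req.full_url)
```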
The war room is a separate Telegram group where all agents are present simultaneously. The /standup command pings every agent in parallel, each one queries its own entries in the SQLite hive mind database, and each returns a status report for the last 24 hours. The meta agent reads all the other responses and synthesizes a cohesive summary.
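The standup mechanics can be sketched as concurrent per-agent queries over the shared table. Schema, helper names, and the sample rows are illustrative; the real meta agent synthesizes the reports rather than joining them:

```python
import asyncio
import sqlite3
from datetime import datetime, timedelta, timezone

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hive_entries (agent TEXT, summary TEXT, ts TEXT)")
now = datetime.now(timezone.utc)
conn.executemany("INSERT INTO hive_entries VALUES (?,?,?)", [
    ("comms", "sent 3 follow-up emails",    (now - timedelta(hours=2)).isoformat()),
    ("ads",   "paused two losing ad sets",  (now - timedelta(hours=5)).isoformat()),
    ("ads",   "old entry, outside window",  (now - timedelta(days=3)).isoformat()),
])

async def agent_report(agent: str) -> str:
    """Each agent queries only its own rows from the last 24 hours."""
    cutoff = (now - timedelta(hours=24)).isoformat()
    rows = conn.execute(
        "SELECT summary FROM hive_entries WHERE agent=? AND ts>=?",
        (agent, cutoff)).fetchall()
    return f"{agent}: " + ("; ".join(r[0] for r in rows) or "no activity")

async def standup(agents):
    reports = await asyncio.gather(*(agent_report(a) for a in agents))
    # The real system hands these to a meta agent to synthesize;
    # here we just concatenate.
    return "\n".join(reports)

digest = asyncio.run(standup(["comms", "ads"]))
print(digest)
```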
The /discuss command works similarly but is designed for deliberation rather than reporting. You can pin a specific agent to lead the discussion, tag individual agents by name (the same @agent syntax as regular Telegram), and the system ensures each agent has context on what the others have said before responding.
The hardest engineering problem here, per Kashef, is context threading: making sure each agent in the war room has visibility into the full conversation, not just its own messages. He solved it by asking Claude Code to figure out the implementation — “can you find a way that in this specific ecosystem of the war room that we can have this conversation and it all has context on all the other replies as well as their underlying claude.md files, YAML files, etc.” This is a reasonable approach for one-off architectural problems where you know what you want but not how to implement it.
If you want to understand the Telegram connection setup in more detail, the step-by-step guide to Claude Code Channels with Telegram covers the BotFather configuration and plugin setup that underlies this kind of bridge.
The Meta Ads integration as a concrete example
Kashef walks through one specific business workflow in detail: a daily Meta Ads performance report delivered at 7:30am via Telegram.
The stack: Meta Ads CLI (a command-line interface to the Meta Ads API), a custom meta ad skill that formats the output for mobile display, a cron job scheduled through the system’s schedule tab, and the Telegram delivery layer. The report includes spend, actions, blind spots, hyperlinks to individual ads, and a “quick take” on winners and losers.
The interesting part is what happens after the report. Because the agent has access to the Meta Ads CLI and also has a Gemini skill that can call Nano Banana for image generation, Kashef can ask it in the same conversation to generate new ad creative, add it to a campaign, and set a spend budget. The report and the action are in the same thread.
This is the practical argument for the hive mind architecture: the agent inherits all the integrations that Claude Code already has access to. You don’t wire up the Meta Ads CLI to each agent individually. You wire it up once at the system level, and every agent gets it.
The scheduled job itself is a cron expression under the hood. The frontend translates it to plain English (“every weekday at 7:30am”) because Kashef explicitly asked Claude Code to render cron syntax as human-readable text. The raw cron string is still there; it’s just not what you see.
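A toy version of that cron-to-English translation, covering only the common cases (a full renderer would have to handle every field combination, and this function is an invention for illustration, not Kashef's code):

```python
DOW_NAMES = {"1-5": "weekday", "0,6": "weekend day", "*": "day"}

def cron_to_english(expr: str) -> str:
    """Render simple 'minute hour * * dow' expressions as plain English."""
    minute, hour, dom, month, dow = expr.split()
    if dom == "*" and month == "*" and minute.isdigit() and hour.isdigit():
        h, m = int(hour), int(minute)
        ampm = "am" if h < 12 else "pm"
        h12 = h % 12 or 12
        day = DOW_NAMES.get(dow, f"day-of-week {dow}")
        return f"every {day} at {h12}:{m:02d}{ampm}"
    return expr  # fall back to the raw string when we can't translate

print(cron_to_english("30 7 * * 1-5"))  # the 7:30am weekday ads report
```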
Memory architecture: three categories, one database
Kashef’s memory system organizes everything into three categories: importance, salience, and recency. These map roughly to “should this be kept forever,” “how relevant is this right now,” and “how recent is this.”
The practical implementation is a searchable memory store in the same SQLite database. You can query it by keyword — searching “Gmail” returns every memory where Gmail was involved. You can generate insights by running Gemini Flash over your memory corpus to find patterns you haven’t noticed. You can pin high-importance memories so they never fade.
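A minimal sketch of that store, assuming one table with importance, pin, and timestamp columns (the column names and sample memories are invented). Pinned rows sort first, so they never fall out of view as recency fades:

```python
import sqlite3
from datetime import datetime, timedelta, timezone

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE memories (
    content TEXT, importance REAL, pinned INTEGER DEFAULT 0, ts TEXT)""")
now = datetime.now(timezone.utc)
conn.executemany("INSERT INTO memories VALUES (?,?,?,?)", [
    ("Gmail app password rotated",        0.9, 1, (now - timedelta(days=90)).isoformat()),
    ("Gmail digest preferred at 8am",     0.4, 0, (now - timedelta(days=2)).isoformat()),
    ("Meta Ads budget capped at $50/day", 0.7, 0, (now - timedelta(days=1)).isoformat()),
])

def search(keyword: str):
    """Keyword recall: pinned first, then newest -- a crude stand-in for
    the importance/salience/recency scoring described above."""
    return conn.execute(
        "SELECT content FROM memories WHERE content LIKE ? "
        "ORDER BY pinned DESC, ts DESC", (f"%{keyword}%",)).fetchall()

hits = [r[0] for r in search("Gmail")]
print(hits)
```

Note the 90-day-old pinned memory still outranks the two-day-old unpinned one — that is the "never fade" behavior expressed as an ORDER BY clause.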
This is a more opinionated approach than the six-layer taxonomy that other agentic OS frameworks use. The six-layer Claude Code memory architecture — from claude.md files up through cross-tool shared memory — is a useful framework for understanding the design space. Kashef’s system sits somewhere around levels 3-4 of that taxonomy (semantic search plus structured recall) but is implemented entirely in local SQLite rather than a managed vector database.
The key design decision is that memory maintenance is explicit. Kashef recommends having Claude Code interview you about how you want fresh memories handled, how fading memories should decay, and whether important memories should be pinned permanently. The system doesn’t make these decisions for you. You configure the policy, the system enforces it.
What the skill layer looks like
Each agent in the system can have global skills (available to all agents) or agent-specific skills (scoped to one). The decision about what graduates to global is a real architectural choice — Kashef frames it as “what deserves to be promoted to a global skill that every single agent should have access to versus what should I pigeonhole and make specific to one particular terminal agent or project.”
The skill creator skill (installed via /plugin install skill creator) is how new skills get built. You describe what you want in plain English, it drafts the skill, tests it, and packages it. For the Meta Ads workflow, Kashef built a custom meta ad skill this way. For the Obsidian-style graph view, he screen-recorded the Obsidian interface using Loom, fed the video to Gemini’s video understanding API, and described what he wanted to build. The skill that came out of that process generates the 3D visualization.
The GSD skill (Get Shit Done) handles complex multi-day projects by breaking them into plan → execute → verify phases, with fresh sub-agents spawned per task to avoid context rot. If you’re building something that spans multiple sessions, GSD is the planning layer that keeps the work coherent. The ClaudeMem skill handles cross-session memory persistence — it hooks into the session lifecycle, captures file edits and decisions, and stores semantic summaries in a local SQLite database with vector search, reporting roughly 10x token savings on retrieval compared to dumping all past context at session start.
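The plan → execute → verify shape with a fresh sub-agent per task can be sketched as below. The `SubAgent` class and hardcoded plan are stand-ins for spawning real Claude Code sub-agents, not GSD's actual implementation:

```python
class SubAgent:
    """Stand-in for a freshly spawned sub-agent: its context holds only
    the one task it was given, never the whole project history."""
    def __init__(self, task: str):
        self.context = [task]

    def run(self) -> str:
        return f"done: {self.context[0]}"

def gsd(project: str) -> list[str]:
    # Plan phase: break the project into discrete tasks (hardcoded here).
    plan = [f"{project}: step {i}" for i in (1, 2, 3)]
    # Execute phase: one fresh sub-agent per task, avoiding context rot.
    results = [SubAgent(task).run() for task in plan]
    # Verify phase: confirm every task reported completion.
    assert all(r.startswith("done:") for r in results)
    return results

print(gsd("ship graph view"))
```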
For teams building similar orchestration infrastructure without writing the plumbing from scratch, MindStudio offers a no-code path: 200+ models, 1,000+ pre-built integrations, and a visual builder for chaining agents and workflows — the same multi-agent coordination pattern, minus the SQLite schema design.
The honest engineering assessment
There are real tradeoffs in this architecture that Kashef doesn’t obscure.
The system runs on his laptop. He acknowledges this directly: “I’ll admit at this point we don’t actually do this just yet with our own agent OS. We just run it off our laptops.” The scheduled jobs only run when the machine is on. The Telegram bridge only responds when the process is alive. Moving to a VPS or cloud provider would solve this, but it adds operational complexity and cost.
The frontend-backend synchronization is genuinely hard. Kashef describes it as “making sure that the front end should always have a perfect symbiosis with the back end so that when you have a task on screen that propagates to something like your Telegram API, your different APIs are being used and making sure that this whole system is cohesive.” This is not a solved problem. It’s an ongoing engineering challenge.
And the system took hundreds of hours to build. Kashef is explicit: “It’s not meant to be a step-by-step walkthrough on exactly how you can go from absolutely zero to this. The reason is this takes hundreds of hours to start, iterate, and refine.” The demo is the result of months of iteration, not a weekend project.
That said, the underlying insight is sound. If you want to build a multi-agent system that persists state across sessions, routes tasks intelligently, and gives you a mobile interface without paying for cloud infrastructure, SQLite plus a Telegram bridge plus Gemini Flash for cheap orchestration is a legitimate architecture. The components are all free or nearly free. The cost is engineering time.
For the spec-driven side of this kind of system — where you’re defining agent behaviors, data schemas, and task routing rules as structured documents — tools like Remy take a complementary approach: you write an annotated markdown spec and it compiles a complete TypeScript backend with SQLite, auth, and deployment. The spec is the source of truth; the generated code is derived output. That’s a different abstraction than Kashef’s hand-built system, but the underlying philosophy (structured documents driving system behavior) is similar.
The actual starting point
If you want to build toward this, the order of operations matters.
Start with the list view. Get a SQLite database where agents can log task completions. Make sure that table is always populated with accurate, recent data. Everything else — the graph view, the war room, the insight generation — is a layer on top of a working log.
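In its most minimal form, that starting point is one table agents write to and one query the list view reads. Names here are illustrative:

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")  # swap in a file path once this works
conn.execute("CREATE TABLE task_log (agent TEXT, task TEXT, ts TEXT)")

def log_completion(agent: str, task: str) -> None:
    """What every agent calls when it finishes a task."""
    conn.execute("INSERT INTO task_log VALUES (?,?,?)",
                 (agent, task, datetime.now(timezone.utc).isoformat()))

def recent(limit: int = 20):
    """What the list view (and later the graph view) reads."""
    return conn.execute(
        "SELECT agent, task FROM task_log ORDER BY ts DESC LIMIT ?",
        (limit,)).fetchall()

log_completion("comms", "sent standup digest")
print(recent())
```

If this table stays accurate and current, everything else in the post is, as Kashef says, additive.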
Then add the Telegram bridge using the Anthropic SDK. One agent, one bot token, one chat thread. Get that working reliably before adding more agents.
Then add scheduling. One cron job, one task, verify it runs when expected. The Claude Code Dispatch remote control pattern covers the mechanics of how local Claude instances receive and execute remote commands — useful background for understanding how the Telegram-to-agent execution path actually works.
The 3D graph view, the war room slash commands, the agent suggestion feature — those come later, after the foundation is solid. Kashef’s core point is that this is a data engineering problem, not an AI problem. Get the data layer right and the rest follows.
The /standup command is only impressive because the database behind it is accurate. That’s the part that takes the work.