Why Your Next Codebase Should Be a Markdown File
Spec-driven development makes annotated prose the source language and code the compiled output. Here's what that means, why it works, and what it's best for.
What is spec-driven development?
You’ve felt the moment: forty messages into a chat with an AI coding tool, the app mostly works, and nobody—including the AI—can say exactly what it’s supposed to do anymore. The intent scrolled off the top of the thread ten prompts ago.
Spec-driven development fixes that by making the specification the program. A spec is just a planning document for your app, written in plain language—no code. It lays out what the app is, who uses it, what they can do, and the rules they follow: the brief you’d hand a developer before anyone writes a line. Think of it as a giant, organized plan for the whole product.
The difference is who you hand that brief to. Instead of a developer, you hand it to an AI compiler, and it builds the full stack: backend, database, frontend, auth, tests, deployment. The spec is the source of truth; the code is compiled output. When you want a change, you edit the plan and recompile—the same way a developer edits TypeScript and lets tsc produce new JavaScript.
That’s the whole idea. The rest of this is why it works, how it’s different from chatting with a coding agent, and what it’s best at.
TL;DR
- Spec-driven development means writing your app as a plain-language plan (a “spec”) and letting an AI build the complete, working app from it—instead of writing the code yourself.
- The plan is the source of truth: to change the app you edit the plan and rebuild, so you never get stuck hand-maintaining generated code or losing the thread in a long chat.
- From one plan you get a real, full app—a backend, database, user logins, a website, and a live link you can share—not just a mockup or a lone frontend.
- The plan is written in plain Markdown with a few light annotations (a format called MSFM), so it stays readable by people and portable to any tool.
- It’s a different category from coding assistants like Cursor or UI generators like Bolt—the kind of tool that does this is a product agent, and the most advanced one today is Remy (~$30–40 to build a typical app).
- It’s the next step in a long trend of programming getting easier—from punch cards to assembly to C to today’s languages, and now to a plain-language plan.
- It’s best for new full-stack apps like internal tools, SaaS, and CRMs—and isn’t the right tool for editing a giant existing codebase or extreme high-volume workloads.
Why is spec-driven development a step up the abstraction ladder?
An abstraction, in software, is a way to tell a computer what you want without spelling out how at the level beneath—every programming language is one. Stacked up over the decades, they form a ladder: each new rung hides more of the machine’s raw detail and lets you express more intent in less code, while a compiler translates your rung down to the one below. You trade a little low-level control for a lot of productivity.
Here’s the ladder so far. Punch cards encoded raw machine instructions. Assembly added human-readable mnemonics for those instructions. C added structured types and code that compiles across different machines. TypeScript added static types on top of JavaScript. Each step let people say more with less, and trust the layer underneath to handle the rest.
Annotated prose is the next rung. The source language isn’t code—it’s a paragraph that says “users submit a vendor approval request, and the finance lead has 48 hours to approve or escalate,” with annotations pinning down the details a machine can’t guess: data types, edge cases, roles, validation rules. An AI compiler reads it and emits real TypeScript, a typed database, a React frontend, auth, and a deployment.
Every one of those transitions got the same complaint—“that’s not real programming, you’re just hiding the work”—and every one won anyway, because the productivity of the higher abstraction overwhelmed the objection. This rung is no different, with one honest caveat: it’s the first abstraction where the compiler is an AI model rather than a deterministic transform. That changes the mechanics (more on that in the FAQ), not the direction of travel.
The reframe is simple. You don’t write code that an AI helps with. You write a spec that an AI compiles. The output is generated; the source is yours.
How is spec-driven development different from coding agents and app builders?
First, clear up a category error, because it’s the one that confuses everyone: spec-driven development is a method, not a tool. It doesn’t sit next to Cursor or Bolt on a shelf. The tool that uses the method is a product agent, and it lives one layer up—coding agents and app builders operate on your code or your UI, while a product agent builds the whole app from a plan.
Once you separate the method from the tool, the three kinds of app-building tools sort cleanly, and the real difference between them is what you keep when the project grows:
- Coding agents (Cursor, Claude Code, Codex, Copilot) treat the model as a fast pair programmer. The artifact you keep is the code in your repo; the chat that produced it is disposable. Perfect for editing a codebase you already have.
- Frontend-first app builders (Bolt, Lovable, v0, Replit Agent) generate a good-looking UI from prompts. The artifact you keep is a chat history plus a frontend—often with no real backend, database, or auth behind it.
- Product agents are the tools built on spec-driven development. The artifact you keep is the spec—a structured plan, not a transcript—and the agent compiles a full stack from it on demand. Remy is the most advanced one.
| | Coding agents | Frontend-first builders | Product agents (spec-driven) |
|---|---|---|---|
| Source of truth | Code in your repo | Chat history + UI | The spec (annotated markdown) |
| What the AI does | Edits code line by line | Generates a frontend | Compiles a full stack from a plan |
| What you edit | TypeScript, Python, etc. | The chat thread | The spec document |
| Backend included | You write it | Usually not | Yes—methods, DB, auth, jobs |
| Reproducibility | Code reproduces | Chat is fragile | Same spec, equivalent app |
| Best at | Editing existing code | UI demos | Shipping real apps end-to-end |
The throughline: a chat log scrolls away, so anything built on one eventually loses the plot. A spec doesn’t. That’s the difference between a tool that helps you write an app and one that remembers what the app is.
What does it mean for the spec to be the source of truth?
It means there’s exactly one hand-authored artifact in the project, and everything else traces back to it. The backend, the database schema, the API surface, the frontend, the bot commands, the cron schedule—all derived from paragraphs in your spec. Change the spec, recompile, and every downstream piece updates together. You never end up with documentation that says one thing and code that does another.
It also sets the working discipline, and it’s the same one every compiled language already taught us. You can hand-edit the generated code—it’s real TypeScript—just like you can edit the .js that TypeScript produces. But the reset point is the source. When generated code has a bug, the move is almost always to fix the spec and recompile, not to patch the output and hope it survives the next build. For the times you do need to hand-edit, a sync step promotes those changes back into the spec so the two never drift silently.
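One way to keep hand-edits from being silently overwritten is to fingerprint the generated files at compile time and check them again before the next recompile. The sketch below is only an illustration of that idea: the `GeneratedFile` and `ManifestEntry` shapes, the function names, and the file path are all hypothetical, and nothing here describes how Remy's sync step is actually implemented.

```typescript
// Hypothetical sketch: these shapes and names are assumptions for illustration,
// not the real sync step.
interface GeneratedFile {
  path: string;
  content: string;
}

interface ManifestEntry {
  path: string;
  fingerprint: string;
}

// FNV-1a-style 32-bit fingerprint over UTF-16 code units.
// Good enough to spot edits; not a cryptographic hash.
function fingerprint(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return (hash >>> 0).toString(16);
}

// Record what the compiler emitted, right after a compile.
function buildManifest(files: GeneratedFile[]): ManifestEntry[] {
  return files.map((f) => ({ path: f.path, fingerprint: fingerprint(f.content) }));
}

// Before recompiling, list generated files that no longer match the manifest.
function findHandEdits(manifest: ManifestEntry[], current: GeneratedFile[]): string[] {
  const edited: string[] = [];
  for (const entry of manifest) {
    const match = current.filter((f) => f.path === entry.path)[0];
    // A missing file or a changed fingerprint both count as drift.
    if (match === undefined || fingerprint(match.content) !== entry.fingerprint) {
      edited.push(entry.path);
    }
  }
  return edited;
}

const compiled: GeneratedFile[] = [
  { path: "src/approve.ts", content: "export const limit = 10000;" },
];
const manifest = buildManifest(compiled);

// Someone patched the output by hand instead of editing the spec.
const onDisk: GeneratedFile[] = [
  { path: "src/approve.ts", content: "export const limit = 20000;" },
];

console.log(findHandEdits(manifest, onDisk)); // lists src/approve.ts
```

A non-empty result is the cue to either promote the change into the spec or accept that the next recompile will discard it.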
What does a spec actually look like?
Mostly, it looks like a description of your app written in plain English—and here’s the part that surprises people: you usually don’t write it from scratch. You describe what you want (in a conversation, by voice, or by pasting in a document), Remy drafts the spec, and your job is to read it, approve it, and adjust the parts that aren’t quite right—all in plain language.
So a spec reads less like code and more like a product brief:
Users can submit a vendor approval request with a vendor name, contact email, and an amount in USD. Amounts over $10,000 need a finance lead’s sign-off. A request can be pending, approved, rejected, or escalated.
Underneath that prose, the spec carries a few light annotations—small notes that pin down the details a computer can’t guess, like exact data types or who counts as a “finance lead.” Remy adds most of them when it drafts the spec; you only touch them when you want to change something specific. It’s all written in MSFM (MindStudio-Flavored Markdown)—plain Markdown with a couple of extra conventions—so the spec stays readable in GitHub, VS Code, or even pasted into ChatGPT. (If you want to see the exact syntax, the MSFM walkthrough goes deep.)
The only real skill is judgment about ambiguity: spell out what a model couldn’t safely assume, and leave the obvious alone. “Users log in with email” needs nothing. “Finance leads verify by SMS, and sessions expire after 8 hours except for service accounts” is worth making explicit. You’re not learning a syntax—you’re being clear about what you want.
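To make the idea of "details a computer can't guess" concrete, here is a rough sketch of the kind of TypeScript the vendor approval paragraph above could plausibly compile to. It is not Remy's actual output. The type and function names are hypothetical, and the validation rules (non-empty vendor name, a loose email check, a positive amount) are assumptions that the spec text does not state; they are exactly the sort of detail an annotation would pin down.

```typescript
// Hypothetical sketch of compiled output; names and validation rules are assumptions.

// The four states named in the spec paragraph.
type RequestStatus = "pending" | "approved" | "rejected" | "escalated";

interface VendorApprovalInput {
  vendorName: string;
  contactEmail: string;
  amountUsd: number;
}

interface VendorApprovalRequest extends VendorApprovalInput {
  status: RequestStatus;
}

// From the spec: amounts over $10,000 need a finance lead's sign-off.
const FINANCE_SIGNOFF_THRESHOLD_USD = 10000;

function needsFinanceSignoff(request: VendorApprovalInput): boolean {
  return request.amountUsd > FINANCE_SIGNOFF_THRESHOLD_USD;
}

// Assumed validation rules; the prose alone leaves these open.
function validateRequest(input: VendorApprovalInput): string[] {
  const errors: string[] = [];
  if (input.vendorName.trim() === "") {
    errors.push("vendorName is required");
  }
  if (!/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(input.contactEmail)) {
    errors.push("contactEmail is not a valid email");
  }
  if (!Number.isFinite(input.amountUsd) || input.amountUsd <= 0) {
    errors.push("amountUsd must be a positive number");
  }
  return errors;
}

// New requests start as "pending"; invalid input is rejected up front.
function submitRequest(input: VendorApprovalInput): VendorApprovalRequest {
  const errors = validateRequest(input);
  if (errors.length > 0) {
    throw new Error(errors.join("; "));
  }
  return {
    vendorName: input.vendorName,
    contactEmail: input.contactEmail,
    amountUsd: input.amountUsd,
    status: "pending",
  };
}
```

Notice how much of this the prose already settles (the fields, the threshold, the four statuses) and how little it leaves open. The open parts are where an annotation earns its place.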
What is spec-driven development best for?
It’s at its best when you’re describing a new full-stack app and want the compiler to decide how: internal tools, vertical SaaS, approval workflows, dashboards, CRMs—anywhere the value is in the workflow, not in hand-tuned code. That’s the sweet spot, and inside it spec-driven development does the whole job.
Three workloads call for a different tool. That is a deliberate boundary, not a gap:
- An existing large codebase. A coding agent like Cursor or Claude Code meets a 200k-line repo where it lives. Migrating that into a spec rarely pays off—use the right tool for code that already exists.
- Extreme write-throughput. The serverless SQL database behind a compiled app is ideal where writes track human action. For bursty event ingestion or realtime multiplayer state, pair it with an external Postgres or an event-stream layer.
- Byte-stable builds. When the same input must produce byte-identical output—cryptographic signing, regulated builds—a deterministic compiler is the right call.
Every abstraction has a boundary like this. C isn’t for frontends; TypeScript isn’t for kernels. Knowing where spec-driven development fits is part of using it well—and for the apps most teams actually need to ship, it fits squarely.
Best Product Agents
In spec-driven development, the spec is the source language and a product agent is the compiler that turns it into a running app. Today, the most advanced product agent is Remy, the first to name spec-driven development as its model and MSFM as its format. You write the spec; Remy compiles the backend, database, auth, frontend, and deployment, tests it in a real browser, and ships it to a live URL.
Under the hood it works like a team, not a single chatbot: specialist sub-agents split the work—one drives design, one writes the backend code, one handles architecture and code sanity, one tests the flows in a real browser—all coordinated against the spec. Remy is open source (five repos under github.com/mindstudio-ai) and runs on the MindStudio platform, so every app it compiles inherits 200+ models, 1,000+ integrations, managed databases, auth, and deployment with no setup—the same infrastructure already running production apps for The New York Times, ServiceNow, and HMRC. A typical full-stack build runs about $30–40 in inference, with no platform fees during the alpha.
FAQ
What is spec-driven development in one sentence? A programming model where annotated markdown is the source language and an AI compiler turns it into a full-stack app—backend, database, frontend, auth, deployment.
How is it different from prompt engineering? Prompt engineering keeps code as the source of truth and treats the AI as a code-writing assistant. Spec-driven development keeps the spec as the source of truth and treats the AI as a compiler. You maintain the spec, not the chat log.
What’s MSFM? MindStudio-Flavored Markdown—a strict superset of standard Markdown with two annotation primitives (fenced block annotations and inline bracketed annotations). Specs render fine in any markdown viewer.
Can I edit the generated code? Yes—the generated TypeScript is editable. The convention is to fix the spec and recompile when possible, and use the sync step to promote hand-edits back into the spec when you need them.
What happens if I edit the code and then recompile? By default the recompile regenerates from the spec, so promote hand-edits back into the spec first—or treat them as throwaway debugging changes. The spec stays the reset point, which is what keeps code and intent from drifting apart.
Is the LLM really a compiler if its output isn’t deterministic? The structural output—tables, methods, signatures, routes—is deterministic, pinned by the spec and a parsed manifest. The implementation inside each method is statistical, and test scenarios act as integration tests that confirm behavior even when the internal code varies. For most app workloads, that’s plenty.
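The split between pinned structure and varying implementation can be sketched in a few lines. Everything below is an assumption made for illustration: the `StructuralManifest` shape, the function names, and the two "compiles" are hypothetical stand-ins, not Remy's manifest format or test runner. The point is only that two builds can declare the same structure and pass the same scenarios while their method bodies differ.

```typescript
// Hypothetical manifest shape for illustration; not the real manifest format.
interface StructuralManifest {
  tables: string[];
  methods: string[]; // e.g. "needsSignoff(amountUsd: number): boolean"
  routes: string[];
}

// Order-insensitive canonical form: two compiles compare equal whenever
// they declare the same tables, methods, and routes.
function canonicalize(m: StructuralManifest): string {
  return JSON.stringify({
    tables: m.tables.slice().sort(),
    methods: m.methods.slice().sort(),
    routes: m.routes.slice().sort(),
  });
}

function sameStructure(a: StructuralManifest, b: StructuralManifest): boolean {
  return canonicalize(a) === canonicalize(b);
}

// A scenario checks observable behavior, not how the method is written.
interface Scenario {
  name: string;
  amountUsd: number;
  expectSignoff: boolean;
}

type SignoffRule = (amountUsd: number) => boolean;

// Returns the names of scenarios whose outcome differs from the expectation.
function failedScenarios(rule: SignoffRule, scenarios: Scenario[]): string[] {
  return scenarios
    .filter((s) => rule(s.amountUsd) !== s.expectSignoff)
    .map((s) => s.name);
}

// Two stand-in implementations that differ internally but behave the same.
const compileA: SignoffRule = (amountUsd) => amountUsd > 10000;
const compileB: SignoffRule = (amountUsd) => !(amountUsd <= 10000);

const scenarios: Scenario[] = [
  { name: "small request skips sign-off", amountUsd: 500, expectSignoff: false },
  { name: "exactly $10,000 skips sign-off", amountUsd: 10000, expectSignoff: false },
  { name: "over $10,000 needs sign-off", amountUsd: 10000.01, expectSignoff: true },
];

console.log(failedScenarios(compileA, scenarios)); // []
console.log(failedScenarios(compileB, scenarios)); // []
```

A rule that drifted from the spec, such as `amountUsd >= 10000`, would fail the boundary scenario, which is how scenarios catch a bad build even though no one diffs the generated code by hand.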
Is spec-driven development the same as low-code or no-code? No. Low-code platforms hide the runtime behind a visual builder. Spec-driven development keeps the runtime open—real TypeScript, real SQL, real React—and raises the authoring layer. It’s a higher-level programming language, not a no-code tool.
Who owns the spec and the generated code? You do. The spec is yours, the generated code is yours, and the compiler (Remy) is open source.
The bottom line
Every abstraction step started with the same skepticism—“you’re just hiding the real work”—and every one won, because it shifted the human’s job from how to what. Spec-driven development is that shift, one rung up. You stop writing code and start writing the document that produces it. The code still runs on real TypeScript, a real database, and real auth. You can read it, edit it, audit it. You just don’t have to start there.
The fastest way to understand it is to write a spec. Describe the app you’d build with a free weekend, watch the compile produce it, then edit the spec and recompile. That’s the loop.
Remy is a product agent that compiles annotated markdown into a full-stack app—backend, database, frontend, auth, tests, and deployment—in a single step. See goremy.ai.
Keep reading
- What is a product agent? — the category Remy defined, one layer above coding agents
- Remy adds You.com web intelligence to full-stack apps — what a spec-compiled app looks like with live web data wired in
- Lovable vs Replit Agent — the comparison most people start with, and why product agents sit a layer above it

