GSD Framework for Claude Code: How to Plan and Build Full Applications

Why Most Claude Code Projects Fall Apart Before They Ship

You’ve probably seen it happen. You start a Claude Code session with a clear goal. The first few responses are sharp. Then an hour in, the agent starts contradicting itself, forgetting earlier decisions, or generating code that conflicts with what’s already there. By the time you’re halfway through the build, you’re spending more time correcting drift than making progress.

This isn’t a Claude problem. It’s a structure problem. When you treat a full application build as one long conversation, you’re asking a single context window to hold the entire history of your project. That context fills up, degrades, and eventually works against you.

The GSD framework for Claude Code exists to prevent exactly this. It breaks large application builds into three distinct phases — Plan, Execute, and Verify — each with clean context, a defined scope, and a specific output. The result is a more predictable build process that scales to multi-day, multi-feature projects without losing coherence.

This guide explains how the framework works, when to use it, and how to apply each phase in practice.

What the GSD Framework Actually Is

GSD stands for Get Stuff Done. The name is deliberately plain. The goal isn’t elegant architecture for its own sake — it’s shipping working software without the session degradation that kills most large Claude Code projects.

At its core, the framework separates three types of cognitive work that tend to collide in unstructured sessions:

Planning — figuring out what to build and how to structure it
Execution — writing the actual code
Verification — checking that what was built matches what was planned

When these three things happen in the same context, they interfere with each other. Planning generates a lot of exploratory text. Execution adds file reads, diffs, and code blocks. Verification loops back over earlier decisions. The context from each phase compounds, and the window fills fast.

The GSD framework treats each phase as its own context. You complete one before starting the next. This keeps each context lean and focused, which keeps output quality high across the entire project.

If you want a deeper look at the framework’s origins and structure, this overview of the GSD framework and its context phases covers the conceptual foundation well.

When to Use the GSD Framework

Not every Claude Code task needs this level of structure. For a quick bug fix, a single-file component, or a simple script, you don’t need three phases. Just ask and iterate.

The GSD framework earns its overhead when you’re building something that:

Has more than 5–7 distinct features or subsystems
Will take more than a single session to complete
Involves multiple files that need to stay in sync
Has explicit requirements that shouldn’t drift during implementation
Requires validation against a spec, not just a vibe check

In practice, this means most serious application builds. If you’re building a SaaS feature, a full CRUD API, an admin dashboard, or anything with auth and database interactions, the GSD framework is the right tool.
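The checklist above can be turned into a rough decision helper. This is a minimal sketch, not part of the framework: the `ProjectProfile` fields and the rule that at least two criteria should apply are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProjectProfile:
    """Rough facts about a build. All field names are hypothetical."""
    feature_count: int        # distinct features or subsystems
    expected_sessions: int    # Claude Code sessions you expect to need
    multi_file_sync: bool     # multiple files that must stay in sync
    explicit_spec: bool       # explicit requirements to validate against


def gsd_worth_it(p: ProjectProfile, feature_threshold: int = 5,
                 min_signals: int = 2) -> bool:
    """Count how many of the criteria apply.

    Requiring two or more signals is an assumption of this sketch;
    the criteria themselves do not come with a scoring rule.
    """
    signals = [
        p.feature_count > feature_threshold,
        p.expected_sessions > 1,
        p.multi_file_sync,
        p.explicit_spec,
    ]
    return sum(signals) >= min_signals
```

A one-file bug fix scores zero and stays a plain ask-and-iterate task, while a multi-session CRUD API with auth clears the bar easily.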

For comparison with other planning approaches, this breakdown of Claude Code planning frameworks — Plan Mode, PRD Generator, and GSD — is worth reading before you commit to any one method.

Phase 1: Plan

What Happens in This Phase

The Plan phase has one job: produce a written specification that defines what you’re building before any code is written.

This sounds obvious. In practice, most people skip it. They describe a feature in a prompt, Claude starts generating code, and the “plan” exists only as a vague intention in the developer’s head. When the build gets complex, that vague intention isn’t enough to keep things coherent.

In the GSD framework, the Plan phase is a separate Claude Code session with no code generation. You’re not building anything yet. You’re producing a document.

What the Plan Document Should Contain

A good GSD plan document covers:

Project overview — one paragraph describing what the app does and who uses it
Feature list — every feature, broken into discrete units
Data models — entity definitions, field types, relationships
API surface — endpoints or methods, their inputs and outputs
Auth requirements — who can do what, and how authentication works
Edge cases and constraints — things that should be explicitly out of scope, error states to handle
Implementation order — which features to build first and why

The format matters less than the completeness. Markdown works well because it’s readable and can be referenced easily in later phases.
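If the plan is kept in markdown, a quick completeness check is easy to script. The sketch below assumes each of the seven sections appears as its own markdown heading with exactly these titles; that layout is an assumption for illustration, not a requirement of the framework.

```python
import re

# Section titles taken from the list above; using them as literal
# markdown headings is an assumption of this sketch.
REQUIRED_SECTIONS = [
    "Project overview",
    "Feature list",
    "Data models",
    "API surface",
    "Auth requirements",
    "Edge cases and constraints",
    "Implementation order",
]


def missing_sections(plan_markdown: str) -> list[str]:
    """Return the required sections that have no matching heading."""
    headings = {
        match.group(1).strip().lower()
        for match in re.finditer(
            r"^#{1,6}[ \t]+(.+?)[ \t]*$", plan_markdown, re.MULTILINE
        )
    }
    return [name for name in REQUIRED_SECTIONS if name.lower() not in headings]
```

An empty result only says the headings exist. Whether each section is specific enough still needs a human read.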

How to Run the Plan Phase

Start a fresh Claude Code session. Paste in your high-level description of the application. Then instruct Claude to produce a structured plan document rather than code.

A prompt like this works well:

“I want to build [app description]. Don’t write any code yet. Instead, produce a structured plan document that covers the feature list, data models, API surface, auth requirements, and implementation order. Flag any ambiguities or missing information.”

Read the output carefully. Push back on anything vague. Clarify edge cases. The goal is to leave this phase with a document you trust, one you could hand off to another developer (or another Claude session) without losing information.

Once the plan is solid, save it as a markdown file in your project directory. This file becomes the source of truth for everything that follows.

Phase 2: Execute

Starting Fresh

The Execute phase opens a new Claude Code session. You do not continue from the Plan phase session.

This is the part that feels counterintuitive. You’ve just done all that planning — why start over? Because the planning session is full of exploratory reasoning, revisions, and clarifications. Carrying that into execution means the agent starts writing code with a cluttered context. Output quality drops almost immediately.

Starting fresh, with just the plan document loaded, gives the execution context a clean baseline.

How to Structure Execution

The Execute phase works best when you treat the plan document as an ordered task list. Don’t ask Claude to build the entire application in one go. Work through features sequentially, one at a time.

For each feature:

Reference the relevant section of the plan document
Ask Claude to implement that feature
Review the output before moving on
Update the plan document if anything changes

If a feature turns out to be more complex than the plan anticipated, pause and revise the plan section before continuing. Don’t let implementation decisions drift silently from the plan.
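The per-feature loop can be sketched as a small script that reads the implementation order from the plan and produces one focused request per feature. It assumes the plan has an "Implementation order" heading followed by a numbered list, and that the plan is saved as `PLAN.md`; both the file name and the prompt wording are hypothetical.

```python
import re


def implementation_order(plan_markdown: str) -> list[str]:
    """Extract numbered items under the 'Implementation order' heading."""
    features = []
    in_section = False
    for line in plan_markdown.splitlines():
        if line.startswith("#"):
            # Entering or leaving a section: track whether it is the one we want.
            in_section = line.lstrip("#").strip().lower() == "implementation order"
            continue
        match = re.match(r"\s*\d+\.\s+(.*\S)", line)
        if in_section and match:
            features.append(match.group(1))
    return features


def feature_prompt(feature: str, plan_path: str = "PLAN.md") -> str:
    """Build one focused request for a single feature."""
    return (
        f"Read {plan_path}. Implement only this feature: {feature}. "
        "Follow the plan's data models and API surface. "
        "If the plan is ambiguous or wrong, stop and tell me instead of guessing."
    )
```

Sending one prompt at a time, and reviewing the result before requesting the next, keeps each request scoped to a single section of the plan.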

Managing Context During Execution

Even with a fresh context, long execution sessions accumulate cruft. A few things help:

Use /compact periodically to summarize the session history and free up room in the window, then check that key decisions survived the summary
Break large features into sub-tasks and treat each as a focused request
Keep file reads targeted — don’t ask Claude to read the entire codebase if you only need one module

Context rot is real, and execution sessions are where it hits hardest. Watch for signs: repeated mistakes, contradictions with earlier decisions, suggestions that ignore files already created. When you see these, it’s time to compact or restart.

For multi-day builds, the GSD framework recommends closing the execution session at natural stopping points and starting fresh the next day with the plan document and a brief summary of where you left off. This is covered in depth in the guide to planning multi-day app builds without losing context.
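The "brief summary of where you left off" can be kept as structured data and rendered into an opening prompt for the next session. This is one possible shape under stated assumptions: the `Handoff` fields, the `PLAN.md` file name, and the prompt wording are all illustrative rather than prescribed by the framework.

```python
from dataclasses import dataclass, field


def bullets(items: list[str]) -> list[str]:
    """Render items as markdown bullets, with a placeholder when empty."""
    return [f"- {item}" for item in items] if items else ["- (none)"]


@dataclass
class Handoff:
    """End-of-session summary. Field names are hypothetical."""
    completed: list[str] = field(default_factory=list)
    in_progress: str = ""
    plan_changes: list[str] = field(default_factory=list)

    def to_prompt(self, plan_path: str = "PLAN.md") -> str:
        """Build the opening message for the next execution session."""
        lines = [f"Read {plan_path} before doing anything else.", ""]
        lines += ["Features already implemented:"] + bullets(self.completed)
        lines += ["", "Plan sections revised during execution:"]
        lines += bullets(self.plan_changes)
        resume = self.in_progress or "(next item in the implementation order)"
        lines += ["", f"Resume with: {resume}"]
        return "\n".join(lines)
```

Writing the handoff at the stopping point, while the details are fresh, means the next session starts from the plan plus a few lines instead of a degraded history.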

Effort Levels and Model Selection

During execution, you have choices about how much compute Claude applies to each task. For complex logic or architecture decisions, higher effort levels produce better results. For straightforward boilerplate, lower effort is faster and cheaper.

Understanding Claude Code’s effort levels and when to apply them can meaningfully extend the useful life of an execution session and reduce token spend.

Phase 3: Verify

What Verification Is For

The Verify phase answers one question: does the built application match the plan?

This is different from testing. Testing checks whether code works. Verification checks whether what was built is what was specified. You can have passing tests and still have drifted significantly from the original plan.

Verification is also where you catch accumulated debt from the execution phase. When you’re deep in implementation, it’s easy to make small compromises — slightly different data structures, skipped edge cases, auth shortcuts. Each one is small. Together they can amount to an app that works but doesn’t match what was intended.

Running the Verify Phase

Open a new session. Load the plan document and the key implementation files. Then ask Claude to compare the implementation against the plan, specifically looking for:

Features in the plan that weren’t implemented
Implementation decisions that contradict the plan
Missing error handling or edge cases
Data model discrepancies

A prompt structure that works:

“Here is the original plan document [paste plan]. Here are the key implementation files [reference or paste relevant files]. Review the implementation against the plan and identify: (1) anything in the plan that’s missing, (2) anything in the implementation that contradicts the plan, (3) edge cases the plan specified that weren’t handled.”

The output becomes a punch list. Work through it in an execution session before calling the build done.
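Assembling the verification prompt by hand gets tedious once several files are involved. The helper below is a sketch that stitches the plan text and file contents into the prompt structure quoted above; the function name and the `--- path ---` separators are assumptions made for illustration.

```python
def build_verify_prompt(plan_text: str, files: dict[str, str]) -> str:
    """Assemble a verification prompt from the plan and {path: contents}."""
    parts = ["Here is the original plan document:", "", plan_text.strip(), ""]
    parts.append("Here are the key implementation files:")
    for path, contents in files.items():
        # Label each file so findings can point at a specific path.
        parts += ["", f"--- {path} ---", contents.rstrip()]
    parts += [
        "",
        "Review the implementation against the plan and identify: "
        "(1) anything in the plan that's missing, "
        "(2) anything in the implementation that contradicts the plan, "
        "(3) edge cases the plan specified that weren't handled.",
    ]
    return "\n".join(parts)
```

In practice the `files` mapping could be filled with `pathlib.Path(p).read_text()` for each key file. Keeping that list short matters, since the verification session has a context budget too.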

The Builder-Validator Chain

The Verify phase is a manual implementation of what’s sometimes called the builder-validator chain — using a separate agent or session specifically to audit the work of another. The separation matters because the execution context has strong priors toward the code it just wrote. A fresh verification context doesn’t.

This pattern is also structurally similar to the planner-generator-evaluator architecture, which uses distinct agents for each role. If you’re building more complex systems, that pattern extends the GSD approach to full multi-agent workflows.

Common Mistakes When Using the GSD Framework

Merging Phases in Practice

The most common mistake is treating the phases as labels on a continuous session rather than separate contexts. If you plan and execute in the same window, you get the worst of both worlds: a cluttered context during execution, and planning outputs that weren’t clean enough to reference reliably.

Under-Specifying the Plan

Vague plans produce vague implementations. If your plan says “add user authentication,” the execution phase will make assumptions — about session handling, token expiry, password reset flows, email verification. Some assumptions will be fine. Others will cause problems you only discover during verification.

Good plans are specific. For example: “Authentication uses JWT tokens with a 24-hour expiry, stored in HTTP-only cookies. Password reset is handled via a one-time link sent to email. No OAuth in v1.”
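A useful test of specificity is whether a plan statement can be written down as concrete values. The sketch below encodes the example auth requirements as a small config object; the `AuthSpec` class and its field names are hypothetical, and no JWT library is involved.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class AuthSpec:
    """The example auth requirements as data. Names are illustrative."""
    token_ttl: timedelta = timedelta(hours=24)   # "24-hour expiry"
    cookie_http_only: bool = True                # "HTTP-only cookies"
    password_reset: str = "one-time email link"  # reset flow from the plan
    oauth_enabled: bool = False                  # "No OAuth in v1"

    def expires_at(self, issued_at: datetime) -> datetime:
        """When a token issued at `issued_at` should stop being accepted."""
        return issued_at + self.token_ttl
```

The vague version, "add user authentication", gives none of these fields a value, so the execution session would have to guess each one.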

Skipping Verification

Verification is easy to skip because the code works. Tests pass, the app loads, features are visible. But working and correct aren’t the same thing. A 20-minute verification session catches drift that would otherwise surface as bugs or rework later.

Not Persisting the Plan Document

The plan document is most valuable across sessions. If you don’t save it as a file, it exists only in the chat history — which is messy, harder to reference, and gets compacted or lost. Save it to your project directory from the start.

GSD vs Other Planning Approaches

The Claude Code ecosystem has several planning frameworks. GSD isn’t the only option, and it’s worth knowing where it fits relative to others.

Plan Mode is built into Claude Code and useful for quick architecture sketches. It’s a single-session tool, not a multi-phase framework. Good for smaller tasks, less suited to multi-day builds.

PRD Generators produce more formal product requirement documents. They’re heavier on process and work well in team settings where sign-off is needed before development starts. The output can feed into a GSD Execute phase.

WHISK is another context management framework for AI coding agents that takes a different approach to preventing degradation. It’s worth comparing if GSD’s phase-boundary model doesn’t fit your workflow.

GStack, Superpowers, and Hermes are more opinionated Claude Code frameworks with specific conventions and tooling. They bundle more structure than GSD but also carry more setup cost.

GSD sits in a practical middle ground. It’s light enough to adopt immediately, structured enough to handle serious projects, and the phase boundaries are explicit enough that teams can use it consistently.

How Remy Changes This Equation

The GSD framework is a workaround for a real constraint: large application builds exceed what a single context can handle well.

Remy addresses the underlying problem differently. Instead of a conversational session that degrades over time, Remy uses a spec document — annotated markdown that describes what the app does — as the permanent source of truth. The code is compiled from the spec, not built through a conversation.

This means you never lose the plan. The spec doesn’t drift because it’s not a chat log. When you add a feature, you update the spec. When the code needs to change, the spec changes first and the code follows. Context degradation isn’t a risk because the spec is the source, and the spec is always legible.

If the GSD framework is the right way to manage Claude Code’s context constraints, Remy is what happens when those constraints are designed away from the start. The full stack (backend, database, auth, deployment) is compiled from a structured document that both you and the agent can read and reason about.

You can see how this compares to prompt-driven approaches in this comparison of spec-driven development versus vibe coding. If the GSD framework resonates with you, the spec-driven approach takes the same instinct — structure before execution — and builds it into the tooling itself.

Try Remy at mindstudio.ai/remy.

Frequently Asked Questions

What does GSD stand for in Claude Code?

GSD stands for Get Stuff Done. It’s a three-phase framework for building full applications with Claude Code: Plan (write a spec before touching code), Execute (implement feature by feature using the spec), and Verify (check the implementation against the spec). The name reflects the focus on shipping, not framework sophistication.

Why do you need separate sessions for each GSD phase?

Each phase has a different job, and mixing them pollutes the context. The planning session generates exploratory reasoning and revisions. If that carries into execution, the agent starts with a cluttered baseline and degrades faster. Starting fresh for each phase keeps each context focused and improves output quality throughout the build.

How does GSD prevent context rot in Claude Code?

Context rot happens when accumulated conversation history starts degrading agent output — usually through contradictions, forgotten decisions, or drift from earlier requirements. GSD limits this by isolating each phase in its own session. The plan document acts as a persistent, clean reference that survives session boundaries. When you open a new execution or verification session, you load the document rather than inheriting a degraded history.

Can I use the GSD framework for small projects?

It works, but it’s probably overkill. For single-session tasks — a quick script, a simple component, a minor API change — the overhead of three separate phases isn’t worth it. GSD pays off when a project is complex enough that context degradation is likely. A rough heuristic: if you expect more than 2–3 hours of Claude Code work, use GSD.

What should a GSD plan document include?

At minimum: a project overview, a feature list, data models, API surface, auth requirements, edge cases and constraints, and an implementation order. The more specific each section is, the better the execution phase goes. Vague plans produce vague code. Treat the plan document as a spec you’d hand to another developer and expect them to implement without follow-up questions.

How does GSD compare to using sub-agents for large builds?

They’re complementary. GSD is a session management framework — it tells you how to structure phases and when to start fresh. Sub-agents are an execution strategy — they let you run parallel workstreams for codebase analysis or split large tasks across isolated contexts. You can use sub-agents within the GSD Execute phase for parallel feature implementation, then bring the results together during Verify.

Key Takeaways

The GSD framework breaks Claude Code application builds into three phases: Plan, Execute, and Verify — each in its own context.
The Plan phase produces a written spec document before any code is written. This document is the source of truth for everything that follows.
The Execute phase implements features sequentially from the spec, in a fresh session, one feature at a time.
The Verify phase opens another fresh session to compare the implementation against the spec and produce a punch list.
Separate sessions prevent context rot and context compounding from degrading output across a long build.
GSD is best suited for projects with more than 5–7 features, multi-day timelines, or strict requirements that need to survive session boundaries.
Remy takes the same core instinct — spec before code — and makes it the foundation of the tool itself, with a persistent spec document that compiles into a full-stack application.