How to Use Sub-Agents in Claude Code to Manage Context and Speed Up Research

Why Context Bloat Kills Complex AI Workflows

Anyone who has used Claude Code for a serious research or coding task has run into the same wall: the context window fills up. You’re halfway through a complex investigation, and Claude starts losing track of what it found three steps ago. Results get muddled, tasks slow down, and the quality of responses degrades.

The fix isn’t a bigger context window. It’s a smarter architecture — specifically, using sub-agents in Claude Code to isolate work, run tasks in parallel, and keep the main context clean.

This article covers how sub-agents work in Claude Code, when to use them, and how to structure multi-agent workflows that actually perform well.

What Sub-Agents Actually Are in Claude Code

Claude Code can spawn separate agent instances — called sub-agents — that each operate in their own isolated context window. The main Claude Code session acts as the orchestrator. It delegates specific tasks to sub-agents, waits for (or collects) their results, and uses those results without carrying the full working history of each sub-task.

This is made possible through Claude’s Task tool, which is available in agentic mode. When Claude calls Task, it launches a new agent with a fresh context, a defined objective, and access to the same tools (or a subset of them) as the parent.

The key distinction: sub-agents don’t share context with the parent. Each one runs independently. That’s what makes them useful.

The Difference Between a Sub-Agent and a Tool Call

A tool call (like a web search or file read) is synchronous — Claude calls it, gets a result, and the result goes directly into the main context window.

A sub-agent is different. It’s a self-contained process that can itself make multiple tool calls, reason across many steps, and return a consolidated answer. The sub-agent’s entire working history stays inside its own context. The parent only receives the final output.

This means the parent’s context stays lean no matter how much work the sub-agent does.
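To make that isolation concrete, here is a minimal sketch that models contexts as plain Python lists. The names run_sub_agent and fake_tool are hypothetical stand-ins, not Claude Code APIs, and nothing here calls a model. It only shows which data crosses the boundary back to the parent.

```python
from typing import Callable, List


def fake_tool(query: str) -> str:
    """Hypothetical stand-in for a real tool call (search, file read)."""
    return f"raw result for {query!r} " + "x" * 200  # deliberately bulky


def run_sub_agent(objective: str, steps: List[str],
                  tool: Callable[[str], str]) -> str:
    """Work through several tool calls in a private context, return a summary."""
    private_context: List[str] = [objective]  # lives only inside this call
    for step in steps:
        private_context.append(tool(step))    # bulky intermediate results
    # Only this short string ever leaves the sub-agent.
    return f"{objective}: reviewed {len(steps)} sources"


parent_context: List[str] = ["Goal: compare pricing pages"]
summary = run_sub_agent("Asana pricing", ["plans", "add-ons", "discounts"], fake_tool)
parent_context.append(summary)

print(len(parent_context))  # 2 entries: the goal plus one summary
```

The three bulky tool results never reach parent_context; only the one-line summary does.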

Why Context Management Matters for Research Tasks

Context limits in large language models aren’t just a technical footnote. They directly affect output quality.

When a context window fills up:

Earlier findings get “forgotten” or weighted less heavily
The model may repeat work it already did
Reasoning becomes less precise as the model has to compress more information
Long tasks become unreliable

For research-heavy workflows — competitive analysis, codebase audits, documentation synthesis, literature reviews — this is a real problem. You might ask Claude to research five different topics and then synthesize them. If all five research threads run in the same context, by the time you hit topic four, you’re burning through tokens that could have gone toward actual reasoning.

Sub-agents solve this by keeping each research thread in its own container.
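A quick back-of-the-envelope calculation shows the size of the effect. The token counts below are made-up illustrative numbers, not measurements: assume each research thread generates about 30,000 tokens of working history and boils down to an 800-token summary.

```python
def peak_context_tokens(threads: int, work_tokens: int, summary_tokens: int) -> dict:
    """Compare peak context size with and without sub-agents."""
    shared = threads * work_tokens       # every thread's history piles up in one window
    isolated = threads * summary_tokens  # the parent holds only the summaries
    return {
        "shared": shared,
        "isolated_parent": isolated,
        "each_sub_agent": work_tokens,
    }


print(peak_context_tokens(threads=5, work_tokens=30_000, summary_tokens=800))
# {'shared': 150000, 'isolated_parent': 4000, 'each_sub_agent': 30000}
```

Under those assumptions the single shared context ends up carrying 150,000 tokens, while the orchestrator in the sub-agent setup carries 4,000 and no individual sub-agent exceeds 30,000.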

How to Use Sub-Agents in Claude Code

Prerequisites

Before you set up sub-agent workflows, make sure you’re working in the right environment:

You need Claude Code running in agentic mode (not just interactive chat)
The Task tool must be available — this is enabled by default in Claude Code’s agentic configurations
You should have a clear breakdown of what work can be parallelized versus what needs sequential ordering

If you’re using Claude via the API, you’ll want to enable tool use and structure your prompts so the orchestrator knows when to delegate.

Step 1: Identify Which Tasks to Delegate

Not every task benefits from sub-agents. The best candidates are:

Independent research threads — topics that don’t depend on each other’s results
Repetitive structured tasks — analyzing multiple files, summarizing multiple documents, scraping multiple pages
Deep dives that would bloat context — tasks where the working steps don’t need to be visible to the parent, only the conclusion

Tasks that should stay in the main context:

Anything requiring reasoning across multiple sub-results at the same time
Tasks with tight sequential dependencies
Short tasks where sub-agent overhead exceeds the benefit
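If you want to make that triage explicit, the two lists above can be written as a small rule of thumb. This is only a sketch: TaskProfile, should_delegate, and the three-step threshold are hypothetical choices for illustration, not anything Claude Code defines.

```python
from dataclasses import dataclass


@dataclass
class TaskProfile:
    name: str
    estimated_steps: int                # rough count of tool calls the task needs
    depends_on_other_tasks: bool        # needs another task's output first
    needs_cross_result_reasoning: bool  # must see several results at once


def should_delegate(task: TaskProfile, min_steps: int = 3) -> bool:
    """Return True when a task looks like a good sub-agent candidate."""
    if task.needs_cross_result_reasoning:
        return False  # synthesis belongs in the orchestrator
    if task.depends_on_other_tasks:
        return False  # tight sequential dependency
    return task.estimated_steps >= min_steps  # skip trivially short tasks


deep_dive = TaskProfile("audit auth module", 12, False, False)
lookup = TaskProfile("fetch one URL", 1, False, False)
synthesis = TaskProfile("compare all findings", 5, True, True)

print([should_delegate(t) for t in (deep_dive, lookup, synthesis)])
# [True, False, False]
```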

Step 2: Structure Your Orchestrator Prompt

The orchestrator — your main Claude Code session — needs a clear prompt that defines the overall goal and explicitly tells it to delegate sub-tasks. A good orchestrator prompt does three things:

States the end goal
Breaks down what sub-tasks need to happen
Tells Claude to use Task to parallelize independent work

Here’s an example structure:

You are researching the competitive landscape for B2B project management tools.

Your job is to:
1. Use Task to research each of the following tools independently: Asana, Monday.com, Linear, Notion
2. For each tool, ask the sub-agent to identify: pricing, key differentiators, customer reviews, and recent feature changes
3. Once all sub-agents complete, synthesize the findings into a comparison table

Run sub-tasks for each tool in parallel.

This prompt tells the orchestrator to spawn four sub-agents (one per tool), run them simultaneously, and then synthesize the results.
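If you run this kind of research often, you may want to generate the prompt rather than retype it. The helper below is a hypothetical convenience function that renders the same three-part structure (end goal, sub-tasks, parallel delegation) from a few inputs. It only builds a string; you would still paste or pipe the result into your Claude Code session yourself.

```python
from typing import List


def orchestrator_prompt(goal: str, subjects: List[str], fields: List[str]) -> str:
    """Render the three-part structure: end goal, sub-tasks, parallel delegation."""
    return "\n".join([
        f"You are {goal}.",
        "",
        "Your job is to:",
        f"1. Use Task to research each of the following independently: {', '.join(subjects)}",
        f"2. For each one, ask the sub-agent to identify: {', '.join(fields)}",
        "3. Once all sub-agents complete, synthesize the findings into a comparison table",
        "",
        "Run the sub-tasks in parallel.",
    ])


prompt = orchestrator_prompt(
    goal="researching the competitive landscape for B2B project management tools",
    subjects=["Asana", "Monday.com", "Linear", "Notion"],
    fields=["pricing", "key differentiators", "customer reviews", "recent feature changes"],
)
print(prompt)
```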

Step 3: Define Sub-Agent Objectives Clearly

Each sub-agent gets its own task description. Be specific. Vague sub-agent instructions lead to vague results, and you don’t get a chance to course-correct mid-task without killing the sub-agent and restarting.

Good sub-agent instructions:

State exactly what to research or produce
Define the format of the output (bullet list, JSON, paragraph summary)
Set a scope limit — “focus only on pricing page data, not general marketing copy”
Indicate what tools the sub-agent can use

Avoid giving sub-agents instructions that require them to wait on each other or reference the parent context. Each sub-agent should be self-sufficient.
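One way to keep sub-agent instructions consistently specific is to fill in a small template that forces you to answer all four questions. SubAgentBrief below is a hypothetical helper, not part of Claude Code, and the tool names in the example are placeholders.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SubAgentBrief:
    objective: str
    output_format: str
    scope_limit: str
    allowed_tools: List[str] = field(default_factory=list)

    def to_prompt(self) -> str:
        """Render a self-contained task description for one sub-agent."""
        tools = ", ".join(self.allowed_tools) or "none"
        return "\n".join([
            f"Objective: {self.objective}",
            f"Output format: {self.output_format}",
            f"Scope: {self.scope_limit}",
            f"Tools you may use: {tools}",
            "Do not assume any context beyond this message.",
        ])


brief = SubAgentBrief(
    objective="Summarize Linear's current pricing tiers",
    output_format="JSON with keys: tool, tiers, notes",
    scope_limit="Focus only on pricing page data, not general marketing copy",
    allowed_tools=["web_search", "web_fetch"],  # placeholder tool names
)
print(brief.to_prompt())
```

The last line of the rendered prompt is there on purpose: it reminds the sub-agent that it cannot see the parent's context.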

Step 4: Handle Sub-Agent Results in the Orchestrator

When sub-agents return results, the orchestrator receives only their output — not their reasoning process or tool call history. This is by design.

To make good use of this:

Ask sub-agents to return structured output (tables, JSON, numbered lists) so the orchestrator can parse and compare easily
Have sub-agents include a brief confidence note if the quality of source material was variable
Keep orchestrator-level synthesis focused — don’t re-research what sub-agents already covered

If a sub-agent returns incomplete or unclear output, the orchestrator can spawn a follow-up sub-agent with a more specific instruction rather than expanding the main context.
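Here is a minimal sketch of that triage step, assuming you asked each sub-agent to return a JSON object. The required keys and the triage_results function are hypothetical; adapt them to whatever schema you specified.

```python
import json
from typing import Dict, List, Tuple

# Hypothetical schema: the keys every sub-agent was asked to return.
REQUIRED_KEYS = {"tool", "pricing", "differentiators", "confidence"}


def triage_results(raw_outputs: Dict[str, str]) -> Tuple[List[dict], List[str]]:
    """Split sub-agent outputs into usable records and names needing a follow-up."""
    usable: List[dict] = []
    needs_follow_up: List[str] = []
    for name, raw in raw_outputs.items():
        try:
            record = json.loads(raw)
        except json.JSONDecodeError:
            needs_follow_up.append(name)  # free text instead of JSON
            continue
        if isinstance(record, dict) and REQUIRED_KEYS.issubset(record):
            usable.append(record)
        else:
            needs_follow_up.append(name)  # valid JSON, but fields are missing
    return usable, needs_follow_up


outputs = {
    "asana": '{"tool": "Asana", "pricing": "tiered", '
             '"differentiators": ["timeline"], "confidence": "high"}',
    "linear": '{"tool": "Linear", "pricing": "per seat"}',
    "notion": "I could not find the pricing page.",
}
usable, follow_up = triage_results(outputs)
print([record["tool"] for record in usable])  # ['Asana']
print(follow_up)                              # ['linear', 'notion']
```

Anything that lands in the follow-up list is a candidate for a second, more narrowly scoped sub-agent.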

Step 5: Run Tasks in Parallel Where Possible

This is where the speed gains come from. When Claude Code uses Task for multiple independent sub-agents, they can run concurrently. Instead of waiting for each research thread to finish before starting the next, they all run at the same time.

For a four-topic research task:

Sequential approach: each topic runs one after another, compounding context usage
Parallel sub-agent approach: all four topics run simultaneously, each in a clean context, results return roughly at the same time

The time savings grow as you add more parallel threads, up to the point where rate limits or orchestration overhead become the bottleneck.

Under the hood, this is an async fan-out/fan-in pattern: the orchestrator spawns several Task calls without waiting for each one to finish, then waits for all of them and synthesizes the combined results. One caveat — firing many sub-agents at once means many simultaneous API requests, which can trip rate limits on lower API tiers. For high-volume workflows, cap concurrency, stagger launches slightly, and build exponential backoff into the orchestrator’s retry logic.
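Claude Code handles the scheduling for you, but if you are building your own orchestrator the pattern looks roughly like the sketch below. It uses Python's asyncio with a semaphore to cap concurrency and a retry loop with exponential backoff. The RateLimited exception and fake_sub_agent are hypothetical stand-ins for whatever launches a real sub-agent, and the delays are kept tiny so the example runs quickly.

```python
import asyncio
import random
from typing import Awaitable, Callable, Dict, List


class RateLimited(Exception):
    """Hypothetical error a sub-agent launcher raises when throttled."""


async def with_backoff(job: Callable[[], Awaitable[str]],
                       retries: int = 4, base_delay: float = 0.01) -> str:
    """Retry a job, doubling the wait (plus jitter) after each rate-limit error."""
    for attempt in range(retries):
        try:
            return await job()
        except RateLimited:
            if attempt == retries - 1:
                raise  # out of retries, surface the error
            jitter = random.uniform(0, base_delay)
            await asyncio.sleep(base_delay * 2 ** attempt + jitter)
    raise RuntimeError("retries must be at least 1")


async def fan_out(topics: List[str],
                  run_sub_agent: Callable[[str], Awaitable[str]],
                  max_concurrent: int = 2) -> List[str]:
    """Launch one sub-agent per topic, never more than max_concurrent at once."""
    gate = asyncio.Semaphore(max_concurrent)

    async def guarded(topic: str) -> str:
        async with gate:
            return await with_backoff(lambda: run_sub_agent(topic))

    # gather preserves input order, so results line up with topics (fan-in).
    return await asyncio.gather(*(guarded(topic) for topic in topics))


attempts: Dict[str, int] = {}


async def fake_sub_agent(topic: str) -> str:
    """Stand-in sub-agent that is throttled once for a single topic."""
    attempts[topic] = attempts.get(topic, 0) + 1
    if topic == "Linear" and attempts[topic] == 1:
        raise RateLimited()
    await asyncio.sleep(0.01)  # stands in for real research work
    return f"summary of {topic}"


results = asyncio.run(fan_out(["Asana", "Monday.com", "Linear", "Notion"], fake_sub_agent))
print(results)
```

The semaphore is the concurrency cap; lowering max_concurrent trades speed for fewer simultaneous requests.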

Cost Management: Use Cheaper Models for Sub-Agents

Context isolation is only half the payoff. Sub-agents also give you a natural place to route simpler work to cheaper models — one of the most underused levers in Claude Code.

Think of your system in tiers. The orchestrator handles reasoning, planning, and synthesis, so it’s where you want a strong model like Claude Opus or Sonnet. Sub-agents handling mechanical work (extracting fields from structured data, reformatting text, checking whether something is present, generating boilerplate from a template) can often run on Claude Haiku or another lighter model. Exact prices change between model generations, so check Anthropic’s current pricing, but the shape holds: when the orchestrator’s model costs $15 per million output tokens and a lighter model costs a small fraction of that, the gap adds up fast on any workflow you run often.

When you spawn a sub-agent via the Task tool, you can specify which model it uses. A practical pattern: have the orchestrator classify each incoming task by complexity, then route simple tasks to Haiku and reserve Sonnet or Opus for work that needs nuanced judgment or long reasoning chains. Keep the lighter models away from any task where a wrong interpretation would cascade into bigger errors downstream.
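The routing rule itself can be very small. The sketch below is one hypothetical way to write it: the tier names, the list of "mechanical" task kinds, and the per-million-token prices are all placeholders for illustration, not real model identifiers or current list prices.

```python
# Task kinds treated as mechanical; a hypothetical list for illustration.
MECHANICAL_KINDS = {"extract_fields", "reformat", "presence_check", "boilerplate"}

# Placeholder output prices in USD per million tokens, not current list prices.
PRICE_PER_MILLION = {"strong": 15.00, "light": 1.00}


def pick_tier(task_kind: str, error_cascades: bool) -> str:
    """Route mechanical, low-risk work to the light tier; everything else to strong."""
    if error_cascades:
        return "strong"  # a misread here would propagate downstream
    return "light" if task_kind in MECHANICAL_KINDS else "strong"


def output_cost_usd(output_tokens: int, tier: str) -> float:
    """Cost of generating output_tokens on the given tier."""
    return output_tokens / 1_000_000 * PRICE_PER_MILLION[tier]


print(pick_tier("reformat", error_cascades=False))             # light
print(pick_tier("reformat", error_cascades=True))              # strong
print(pick_tier("architecture_review", error_cascades=False))  # strong
print(output_cost_usd(2_000_000, "strong"))                    # 30.0
print(output_cost_usd(2_000_000, "light"))                     # 2.0
```

With these placeholder prices, moving two million output tokens of mechanical work from the strong tier to the light tier cuts that slice of the bill from $30 to $2.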

Practical Use Cases for Sub-Agents

Parallel Codebase Analysis

If you’re auditing a large codebase, you can assign sub-agents to different modules or directories. Each sub-agent reviews its portion independently and returns a structured summary. The orchestrator synthesizes findings across the whole codebase without carrying every file’s content in a single context.

Multi-Source Research Synthesis

Researching a topic across multiple sources — academic papers, news articles, product docs, forum discussions — is a perfect fit for sub-agents. Assign one sub-agent per source type or per source. Each returns a clean summary. The orchestrator builds the synthesis.

Document Review at Scale

If you have ten contracts to review for specific clauses, you don’t want all ten in a single context. Spawn a sub-agent per document. Each returns a structured extraction. The orchestrator flags anything that needs attention.

Iterative Hypothesis Testing

In research or data analysis workflows, sub-agents can test different hypotheses or analytical approaches in parallel. The orchestrator picks the most promising results without getting bogged down in each dead-end analysis.

A Practical Example: Parallel Code Review

Here’s the fan-out/fan-in pattern on a concrete task: reviewing a pull request with 12 changed files for security issues and style violations, then summarizing the changes.

Done sequentially, a single agent reads all 12 files, runs every check on each, and writes the summary — context growing with each file until, by file 10, it’s carrying a lot of accumulated, half-relevant history. Slow, large context, elevated cost.

With sub-agents, the orchestrator:

Splits the 12 files into three groups of four
Spawns three sub-agents in parallel, one per group
Has each run the security and style checks on its four files with minimal context
Collects structured results (JSON: file, issue type, description)
Synthesizes everything into one report

With three groups running at once, the review takes roughly a third of the sequential time in the best case, plus whatever the orchestrator spends on spawning and synthesis. Each sub-agent stays small and focused, and because the per-file checks are mechanical, they can run on a lighter model. Fan out, process in parallel, fan in and synthesize: that loop is the core of effective sub-agent design.
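The same flow can be sketched in a few lines of Python. Everything model-related here is faked: review_group is a hypothetical stand-in for a sub-agent and simply flags one file so there is something to merge. The file names are invented too.

```python
import asyncio
import json
from typing import Dict, List


def split_into_groups(files: List[str], group_count: int) -> List[List[str]]:
    """Split files into at most group_count contiguous groups of near-equal size."""
    size = max(1, -(-len(files) // group_count))  # ceiling division
    return [files[i:i + size] for i in range(0, len(files), size)]


async def review_group(files: List[str]) -> str:
    """Hypothetical sub-agent: returns JSON findings for its files."""
    await asyncio.sleep(0.01)  # stands in for the real review work
    findings = [
        {"file": path, "issue_type": "style", "description": "line too long"}
        for path in files if path.endswith("_3.py")
    ]
    return json.dumps(findings)


async def review_pull_request(files: List[str]) -> List[Dict[str, str]]:
    groups = split_into_groups(files, 3)
    raw = await asyncio.gather(*(review_group(g) for g in groups))  # fan out
    return [item for chunk in raw for item in json.loads(chunk)]    # fan in


changed = [f"src/module_{n}.py" for n in range(1, 13)]
report = asyncio.run(review_pull_request(changed))
print([len(g) for g in split_into_groups(changed, 3)])  # [4, 4, 4]
print(report)
```

The orchestrator only ever sees the merged list of findings, never the contents of the twelve files.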

Common Mistakes When Using Sub-Agents

Over-Delegating Simple Tasks

Spawning a sub-agent to fetch a single URL and return one sentence is wasteful. Sub-agents have overhead — they’re a full agent instantiation. Use them for tasks that would otherwise take multiple steps or accumulate significant context.

Under-Specifying Output Format

If you don’t tell sub-agents how to format their output, you’ll get inconsistent results that are hard to synthesize. Always specify: bullet points, JSON, markdown table, or a specific schema.

Creating Sub-Agent Chains That Defeat the Purpose

If sub-agent A’s output is required input for sub-agent B, and sub-agent B’s output is required for sub-agent C, you’ve created a sequential chain. You don’t get parallel speed gains, and you’ve added complexity. Sequential task dependencies are fine — just be aware they don’t benefit from parallelization.
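A quick way to see how much parallelism a plan really has is to group its tasks into "waves" by dependency. The sketch below uses graphlib.TopologicalSorter from the Python standard library (3.9+); the task names are made up. Tasks in the same wave can run concurrently, and a plan that yields one task per wave is a pure chain.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set


def parallel_waves(depends_on: Dict[str, Set[str]]) -> List[List[str]]:
    """Group tasks into waves; tasks in the same wave can run concurrently."""
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()
    waves: List[List[str]] = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything whose inputs are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


chain = {"A": set(), "B": {"A"}, "C": {"B"}}
fan_out_plan = {
    "asana": set(),
    "linear": set(),
    "notion": set(),
    "synthesis": {"asana", "linear", "notion"},
}

print(parallel_waves(chain))         # [['A'], ['B'], ['C']]
print(parallel_waves(fan_out_plan))  # [['asana', 'linear', 'notion'], ['synthesis']]
```

The chain needs three waves for three tasks, so sub-agents buy it context isolation but no speed. The fan-out plan finishes four tasks in two waves.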

Losing Track of Sub-Agent State

Sub-agents don’t have memory between runs. If you spawn a new sub-agent to “continue where the last one left off,” it starts completely fresh. Pass all necessary context in the sub-agent’s initial prompt.
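In practice that means a "continue" prompt has to restate what is already known. The helper below is a hypothetical sketch of how an orchestrator might assemble one; the findings in the example are invented.

```python
from typing import List


def continuation_prompt(objective: str, prior_findings: List[str],
                        remaining: List[str]) -> str:
    """Build a prompt for a fresh sub-agent that restates everything it needs."""
    lines = [f"Objective: {objective}", "Already established (do not redo):"]
    lines += [f"- {finding}" for finding in prior_findings]
    lines.append("Still to do:")
    lines += [f"- {item}" for item in remaining]
    return "\n".join(lines)


follow_up_prompt = continuation_prompt(
    objective="Audit the payments module for unvalidated input",
    prior_findings=["checkout.py validates amounts", "refund.py skips currency checks"],
    remaining=["review webhooks.py", "review invoices.py"],
)
print(follow_up_prompt)
```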

Where MindStudio Fits Into Multi-Agent Workflows

If you’re building multi-agent systems but want to avoid managing the infrastructure yourself, MindStudio offers a different path.

MindStudio’s Agent Skills Plugin — an npm SDK (@mindstudio-ai/agent) — lets any AI agent, including Claude Code, call pre-built capabilities as simple method calls. Things like agent.searchGoogle(), agent.runWorkflow(), or agent.sendEmail() abstract away the plumbing so agents can focus on reasoning.

The practical implication: instead of building sub-agent infrastructure from scratch, you can offload specific capabilities to MindStudio workflows that your Claude Code orchestrator calls on demand. Your Claude Code agent handles the reasoning and orchestration; MindStudio handles the execution layer for specific tasks like web search, document processing, or external integrations.

MindStudio also lets you build full multi-step agent workflows visually — no code required — with access to 200+ AI models and 1,000+ integrations out of the box. If your team needs to expose sub-agent capabilities to non-technical stakeholders or integrate them with tools like Slack, Notion, or HubSpot, that’s considerably easier to do in MindStudio than in raw Claude Code.

You can start for free at mindstudio.ai.

FAQ

What is the Task tool in Claude Code?

The Task tool is a built-in capability in Claude Code’s agentic mode that allows the main agent to spawn sub-agents. Each sub-agent gets its own context window, a defined objective, and access to tools. The parent agent receives only the sub-agent’s output, not its internal reasoning history. This is how Claude Code implements multi-agent and parallel processing patterns.

How do sub-agents help with context window limits?

Each sub-agent operates in an isolated context. All the work it does — tool calls, intermediate reasoning, partial results — stays within its own window. The parent agent only receives the final output. This means you can do substantial research or analysis work without the parent context growing to match.

Can Claude Code sub-agents run in parallel?

Yes. When the orchestrator spawns multiple Task calls for independent sub-tasks, they can execute concurrently. This is one of the main performance benefits of the sub-agent pattern. Tasks that don’t depend on each other’s outputs are the best candidates for parallelization.

Are sub-agents in Claude Code the same as in Claude’s API?

Not quite. The models are the same, but the Task tool is part of Claude Code; a plain API call does not spawn sub-agents for you. If you’re building custom systems via the API, you can implement the same pattern yourself by giving each sub-task its own conversation with its own message history, running those conversations independently, and passing only their final outputs back to the orchestrating conversation.

What’s the difference between sub-agents and just making multiple API calls?

A raw API call with no agent framing just gets a one-shot response. Sub-agents, by contrast, can use tools, reason across multiple steps, and produce complex outputs — all inside their own context. They’re full agent instances, not single inference calls. The orchestrator coordinates them and uses their results strategically.

When should you NOT use sub-agents?

Avoid sub-agents when tasks are short and simple (a single tool call or lookup), when tasks depend heavily on each other’s intermediate steps, or when the output of one task is needed to frame the next. In those cases, sequential execution in the main context is simpler and just as effective. Sub-agents add value when tasks are substantive, independent, and would otherwise bloat the main context.

Key Takeaways

Sub-agents in Claude Code run in isolated context windows, so the parent agent stays clean no matter how much work the sub-agent does.
The Task tool is what makes sub-agent delegation possible in agentic mode.
Parallel sub-agents can substantially cut wall-clock time on independent multi-topic research tasks.
Clear, structured output formats from sub-agents make synthesis far easier for the orchestrator.
The biggest mistakes are over-delegating simple tasks, under-specifying outputs, and creating sequential chains where parallel execution isn’t actually possible.
If you want to extend sub-agent workflows with pre-built capabilities and integrations, MindStudio’s Agent Skills Plugin and visual workflow builder are worth exploring.

For teams building serious research or analysis workflows, the sub-agent pattern is one of the most practical ways to get more done without hitting context limits. Start small — pick one multi-topic research task you currently run sequentially, restructure it with parallel sub-agents, and see the difference.