
Claude Code Effort Levels Explained: When to Use Low, Medium, High, and Max

Claude Code's effort level setting controls how much reasoning the model applies. Learn when to use each level to balance quality and token cost.

MindStudio Team

What “Effort” Actually Means in Claude Code

Claude Code gives you direct control over how deeply the model reasons before it responds. That control comes through a single setting: the effort level.

The four levels — low, medium, high, and max — determine how many tokens get allocated for extended thinking before Claude produces any output. Higher effort means more internal reasoning; lower effort means the model responds more directly from pattern recognition.

This isn’t just a speed toggle. Each level represents a different cost-quality tradeoff, and using the wrong one in the wrong context will either burn through your token budget unnecessarily or produce mediocre results on problems that actually needed careful thought.

Understanding when to use each Claude Code effort level is one of the most practical skills you can develop for working efficiently with the tool.


How Effort Levels Work Under the Hood

Claude’s extended thinking capability lets the model reason through a problem step by step before producing a final response. That reasoning happens in a separate “thinking” phase and is tracked separately from regular output tokens — but it still counts toward your total token usage and cost.

When you set an effort level in Claude Code, you’re setting the thinking budget: the upper limit on how many tokens the model can spend reasoning internally before answering.

The four levels and their behavior

  • Low — The smallest thinking budget. Claude responds quickly, drawing mostly on its trained knowledge and pattern recognition. Think of it as answering from memory rather than working through the problem.

  • Medium — A moderate thinking budget. The model does meaningful reasoning, but stops well short of exhausting its capacity. This is the default in most configurations and handles the majority of everyday coding tasks well.

  • High — A substantially larger budget. Claude can trace through complex logic, consider multiple approaches, and backtrack when needed. Response time increases noticeably, and so does token cost.

  • Max — The full available thinking budget. The model can reason as extensively as the current context allows. This is the most expensive and slowest option, but produces the highest-quality output on genuinely hard problems.
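The four levels above can be thought of as presets over a thinking-token budget. Here is a minimal sketch in Python, assuming illustrative budget numbers (Claude Code does not publish exact per-level values); the `{"type": "enabled", "budget_tokens": N}` shape follows Anthropic's extended-thinking API parameter:

```python
# Hypothetical mapping from effort levels to extended-thinking budgets.
# The numbers are illustrative assumptions, not documented values.
EFFORT_BUDGETS = {
    "low": 1024,      # minimal thinking; answer mostly from pattern recognition
    "medium": 8192,   # moderate reasoning headroom (assumed default)
    "high": 32768,    # deep reasoning, multiple approaches, backtracking
    "max": 65536,     # illustrative cap: as much as the context allows
}

def thinking_param(effort: str) -> dict:
    """Build an extended-thinking request parameter for a given effort level."""
    budget = EFFORT_BUDGETS[effort.lower()]
    return {"type": "enabled", "budget_tokens": budget}

print(thinking_param("high"))  # → {'type': 'enabled', 'budget_tokens': 32768}
```

The important point is not the specific numbers but the shape: each level is an upper bound on reasoning tokens, not a guarantee that the model will use them all.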

Why this matters for cost

Extended thinking tokens are billed at the same rate as output tokens on the Claude models that support extended thinking (Claude 3.7 Sonnet and later). At max effort, a single prompt can consume dramatically more tokens than the same prompt at low effort — sometimes 10x or more, depending on complexity.

That asymmetry makes effort level selection a real cost lever, not just a quality preference. Anthropic’s documentation on extended thinking outlines how thinking tokens are metered and what to expect from each model.
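To make the asymmetry concrete, here is a rough cost model for a single request, with placeholder per-token prices (not current Anthropic list prices) and the one rule that matters: thinking tokens are metered like output tokens.

```python
# Illustrative prices in $/million tokens -- placeholders, not list prices.
INPUT_PRICE_PER_MTOK = 3.00
OUTPUT_PRICE_PER_MTOK = 15.00  # thinking tokens bill at this rate too

def request_cost(input_tokens: int, thinking_tokens: int, output_tokens: int) -> float:
    """Estimate USD cost for one request; thinking is billed as output."""
    billed_output = thinking_tokens + output_tokens
    return (input_tokens * INPUT_PRICE_PER_MTOK
            + billed_output * OUTPUT_PRICE_PER_MTOK) / 1_000_000

# Same prompt, same visible output -- only the thinking volume differs.
low = request_cost(2_000, 500, 800)      # low effort: light thinking
mx = request_cost(2_000, 60_000, 800)    # max effort: heavy thinking
print(f"{mx / low:.1f}x")                # → 36.0x at these illustrative numbers
```

The visible output is identical in both cases; the entire cost difference comes from reasoning you never see in the response.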


When to Use Low Effort

Low effort is the right call when the task is well-defined, the answer is relatively obvious, and there’s no meaningful ambiguity to work through.

Good use cases for low effort

  • Syntax fixes — Correcting a typo, fixing indentation, or resolving a linting error that the model can pattern-match immediately.

  • Simple renames — Renaming a variable, function, or file across a small codebase.

  • Code formatting — Rewriting a block of code to match a style convention.

  • Quick lookups — “What’s the signature for this function?” or “How do I access an environment variable in Node?”

  • Boilerplate generation — Scaffolding a standard component, CRUD endpoint, or configuration file from a clear spec.

What to watch for

Low effort is not appropriate for anything that requires reasoning about tradeoffs, diagnosing root causes, or generating novel logic. If the problem has more than one plausible interpretation, or if the correct answer depends on understanding context that spans multiple files, low effort will often produce code that looks right but isn’t.

The speed advantage is real — low effort responses are noticeably faster. For tasks in a tight feedback loop (like making small iterative changes and previewing results), that speed adds up. Just don’t let it become a habit that bleeds into more complex requests.


When to Use Medium Effort

Medium effort is the most versatile setting and the appropriate default for the majority of development work. It gives Claude enough reasoning capacity to handle context-dependent tasks, multi-step logic, and moderately complex bugs.

Good use cases for medium effort

  • Standard bug fixes — Diagnosing why a function returns an unexpected value, tracking down an off-by-one error, or fixing broken conditional logic.

  • Refactoring familiar patterns — Extracting a function, splitting a module, converting class components to hooks, or updating an API call to match a new schema.

  • Writing tests — Generating unit tests for a function where the expected behavior is clear but requires some interpretation.

  • Explaining code — “What does this block do?” or “Why is this query slow?” — questions that need more than surface-level pattern matching but don’t require deep inference chains.

  • Adding features with clear requirements — Implementing a new endpoint, adding a form validation rule, or connecting a new data source when the spec is explicit.

The medium effort sweet spot

Medium effort handles most of the work a developer does during a normal coding session. It’s fast enough that you won’t feel like you’re waiting, and powerful enough that the output is usually correct on the first pass for well-scoped tasks.

If you find yourself re-prompting frequently at medium effort on a specific type of task, that’s a signal to bump up to high — not to try rephrasing the prompt.


When to Use High Effort

High effort is appropriate when the problem involves genuine complexity: multiple interacting systems, non-obvious root causes, performance tradeoffs, or logic that needs to be reasoned through carefully rather than pattern-matched.

Good use cases for high effort

  • Hard debugging sessions — When you’ve already ruled out the obvious causes and the bug involves subtle interaction between components, async timing, or state that mutates in non-obvious ways.

  • Architectural decisions — “Should I use a message queue or event emitter here?” or “What’s the right way to structure this data model?” — questions where the tradeoffs actually matter and getting it wrong is expensive.

  • Complex refactors across multiple files — When changes need to be consistent across a large surface area and a small error anywhere breaks the build.

  • Performance optimization — Profiling bottlenecks, optimizing queries, or reducing render cycles in a way that requires understanding what the current code is actually doing.

  • Security review — Identifying injection vectors, access control gaps, or data leakage paths, all of which require careful, non-obvious reasoning.

Knowing when medium isn’t enough

A practical signal: if you’ve tried medium effort on a problem and the response is technically plausible but misses something important — it solves the wrong version of the problem, ignores a relevant constraint, or produces a solution that breaks something else — that’s a good indicator that more reasoning headroom would help.

High effort is not cheap. But on tasks where it genuinely applies, the cost of the extra tokens is almost always less than the cost of debugging a solution that was confidently wrong.


When to Use Max Effort

Max effort is the setting for the hardest problems. It gives the model maximum reasoning capacity and should be reserved for situations where getting the answer right matters enough to justify the cost and latency.

Good use cases for max effort

  • Algorithmic complexity — Designing or reviewing an algorithm where correctness is non-trivial, edge cases matter, and the problem space has real depth (graph traversal, dynamic programming, constraint satisfaction).

  • System design — When you need Claude to reason about a full architecture, not just a component. How should services communicate? Where does caching belong? What happens at failure points?

  • Mysterious, hard-to-reproduce bugs — Race conditions, memory leaks, non-deterministic test failures. These require following long chains of causality, and max effort gives the model the space to do that.

  • Large-scale refactoring with risk — When you’re making changes that touch core infrastructure, alter data models, or affect how multiple teams’ code interacts — and the cost of getting it wrong is high.

  • Writing critical code from scratch — A parser, a state machine, a cryptographic implementation, a billing calculation. Anything where subtle logic errors have real downstream consequences.

What max effort doesn’t fix

Max effort improves reasoning depth, but it doesn’t override model capability limits. If the problem requires information the model doesn’t have (proprietary internal APIs, recent updates past its training data, runtime context it can’t see), more thinking won’t fill that gap.

It’s also worth remembering that max effort increases response latency meaningfully. For interactive use, this is sometimes frustrating. For tasks where you’d run the model and come back later, it’s less of an issue.


Practical Decision Framework

Choosing an effort level gets easier with practice, but having a mental checklist speeds up the intuition-building phase.

Ask these questions before setting the level

  1. Is the answer obvious given the input? If yes, low effort is probably enough.
  2. Does the task require understanding context beyond what’s in the current file or prompt? If yes, medium or higher.
  3. Have I already tried a lower effort level and gotten a wrong or incomplete answer? Step up one level.
  4. Would getting this wrong have real consequences? High or max.
  5. Is this a genuinely novel problem — something the model hasn’t likely seen in a common form? High or max.
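The five questions above can be encoded as a small decision function. This is a sketch of the article's heuristic, not an official algorithm; the question ordering and escalation rules are assumptions.

```python
def choose_effort(obvious: bool, needs_wider_context: bool,
                  lower_level_failed: bool, high_stakes: bool,
                  novel_problem: bool) -> str:
    """Map the five checklist questions to a starting effort level."""
    # Questions 4 and 5: real consequences or genuine novelty push to high/max.
    if high_stakes and novel_problem:
        return "max"
    if high_stakes or novel_problem:
        return "high"
    # Questions 1 and 2: obvious, self-contained tasks start at low.
    level = "low" if obvious and not needs_wider_context else "medium"
    # Question 3: a wrong or incomplete answer at a lower level means step up one.
    if lower_level_failed:
        ladder = ["low", "medium", "high", "max"]
        level = ladder[min(ladder.index(level) + 1, len(ladder) - 1)]
    return level
```

In practice you would not run code like this; the value is that writing the checklist down as logic forces the escalation rules to be explicit.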

A rough heuristic by task type

Task Type                                   Recommended Effort
Typo fix, rename, formatting                Low
Simple function, standard boilerplate       Low to Medium
Feature addition with clear spec            Medium
Bug fix with known root cause               Medium
Multi-file refactor                         Medium to High
Debugging with unclear root cause           High
Architecture or design decision             High
Hard algorithmic problem                    High to Max
Mission-critical or complex system design   Max

This isn’t a rigid rule — context matters. A “simple” bug in a complex concurrent system might warrant max effort. A “complex” refactor that follows a completely standard pattern might need only medium.
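For teams that standardize defaults, the heuristic table translates directly into a lookup. A sketch, with task-type keys paraphrased from the table (not an official taxonomy); `recommend` returns the lower bound of each range as a starting point, on the assumption that you escalate only when a cheaper level falls short:

```python
# Each entry keeps the range from the table above as a (min, max) tuple.
RECOMMENDED_EFFORT = {
    "typo fix, rename, formatting": ("low", "low"),
    "simple function, standard boilerplate": ("low", "medium"),
    "feature addition with clear spec": ("medium", "medium"),
    "bug fix with known root cause": ("medium", "medium"),
    "multi-file refactor": ("medium", "high"),
    "debugging with unclear root cause": ("high", "high"),
    "architecture or design decision": ("high", "high"),
    "hard algorithmic problem": ("high", "max"),
    "mission-critical or complex system design": ("max", "max"),
}

def recommend(task_type: str) -> str:
    """Return the low end of the recommended range as a starting level."""
    return RECOMMENDED_EFFORT[task_type][0]
```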


Effort Levels and Token Cost: What to Expect

Token costs for extended thinking are not always obvious from the outside, because the thinking tokens aren’t visible in the final response. You’re paying for reasoning you don’t directly see.

Rough cost scaling

As a general pattern, the step from medium to high is a larger jump than low to medium. And max effort can cost substantially more than high — depending on how much reasoning the model actually uses, which varies by prompt complexity.

Some teams set a default effort level per task type and reserve max for a formal review pass rather than interactive use. This keeps costs predictable without sacrificing quality where it matters.

Using effort levels strategically in agentic workflows

If you’re running Claude Code in an automated pipeline — generating code, reviewing it, running tests — you can set different effort levels for different stages. A first-pass code generation step might run at medium, while the final review or edge-case check runs at high or max. This captures most of the quality benefit while keeping the overall token spend reasonable.
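A staged pipeline like that might be wired up as follows. This sketch only builds the command strings; the stage names are illustrative assumptions, and the invocation assumes Claude Code's non-interactive `-p` print mode together with the `--effort` flag described in this article.

```python
import shlex

# Illustrative stages and their effort levels -- assumptions, not Claude Code concepts.
STAGES = [
    ("generate", "medium"),  # first-pass code generation
    ("test-fix", "medium"),  # iterate on failing tests
    ("review", "high"),      # final review / edge-case check
]

def stage_command(effort: str, prompt: str) -> str:
    """Build a non-interactive CLI invocation for one pipeline stage."""
    return f"claude -p {shlex.quote(prompt)} --effort {effort}"

commands = [stage_command(effort, f"{name} step") for name, effort in STAGES]
print(commands[-1])  # → claude -p 'review step' --effort high
```

A real pipeline would run each command with `subprocess.run` and gate the next stage on the previous one's exit status; the point here is that the effort level is just another per-stage parameter.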


Building Claude-Powered Workflows Beyond the CLI

Claude Code is built for the terminal and for direct developer use. But much of the work that benefits from careful AI reasoning happens in broader business contexts — content pipelines, data processing, customer-facing tools, or internal automation.

If you’re building systems that need Claude’s reasoning applied to tasks outside a code editor, MindStudio gives you access to Claude and 200+ other models through a no-code workflow builder. You can build agents that apply different model configurations to different steps — heavy reasoning for complex analysis, lighter inference for quick classification — without managing API keys or writing infrastructure code.

The platform handles rate limiting, retries, and auth, so teams can focus on what the agent should actually do rather than how to wire it together. You can try it free at mindstudio.ai.

For developers who are already using Claude Code but want to extend their agents with additional capabilities — sending emails, searching the web, running sub-workflows — MindStudio’s Agent Skills Plugin lets Claude Code call 120+ typed actions as simple method calls, without building that infrastructure from scratch.


Frequently Asked Questions

What is the default effort level in Claude Code?

Claude Code uses medium effort as the default in most configurations. This is a reasonable starting point for the majority of coding tasks. You can change the default in your settings or override it per request using the --effort flag.

Does higher effort always produce better results?

No. For tasks that are genuinely straightforward, higher effort doesn’t improve output quality — it just increases cost and latency. The reasoning budget only adds value when there’s actual complexity to reason through. Applying max effort to a simple rename or a boilerplate generation task will produce the same result as low effort, just slower and more expensively.

How do effort levels affect token usage and billing?

Effort levels control the thinking budget, and thinking tokens are billed at the same rate as output tokens on the models that support extended thinking (Claude 3.7 Sonnet and later). Higher effort levels can significantly increase total token usage per request. Max effort in particular can use far more tokens than the visible output suggests, because the reasoning happens internally. Track your usage across effort levels to understand the cost impact for your specific use patterns.

Can I set effort levels in automated pipelines and scripts?

Yes. The --effort flag works in non-interactive mode, so you can set different effort levels for different steps in a script or CI/CD pipeline. This is useful for building cost-optimized agentic workflows where some steps need heavy reasoning and others don’t.

When should I use max effort instead of just reprompting at lower effort?

If you’ve tried medium or high effort and the model is consistently missing something important about the problem — not because of a poorly written prompt, but because the problem itself is genuinely complex — that’s when max effort is worth it. Reprompting at lower effort levels can help with ambiguous prompts but won’t help with reasoning depth.

How do effort levels interact with context window size?

Effort levels and context window are separate dimensions. A larger context window gives the model more information to reason about; a higher effort level gives it more budget to reason with. For very hard problems, you often want both: a full, relevant context and a high thinking budget. Neither substitutes for the other.


Key Takeaways

  • Effort levels control Claude Code’s extended thinking budget — how many tokens the model uses for internal reasoning before responding.
  • Low is appropriate for simple, well-defined tasks where the answer is obvious from the input.
  • Medium handles the majority of everyday development work — bugs, features, refactoring, and tests with clear specs.
  • High is the right call for complex debugging, multi-file refactors, architectural decisions, and performance work.
  • Max is reserved for genuinely hard problems: algorithmic complexity, mysterious bugs, critical system design, and anything where being wrong has real consequences.
  • Higher effort levels cost more because thinking tokens are billed like output tokens — use them where they add value, not by default.
  • Building a habit of matching effort level to task complexity saves money and produces better results than setting a single level for everything.

The effort level setting is a small configuration choice with real practical impact. Used deliberately, it gives you direct control over the cost-quality tradeoff on every request — which is exactly the kind of control that makes AI coding tools worth using seriously.
