Claude 4.6 Sonnet
Anthropic's most capable Sonnet model yet, delivering near-Opus-level intelligence for coding, agents, and computer use with a 1M token context window.
Frontier coding and agents with 1M context
Claude Sonnet 4.6 is a text generation model developed by Anthropic, released in February 2026 as an upgrade to the Sonnet line of mid-tier models. It features a 1 million token context window in beta, allowing it to process entire codebases, lengthy legal documents, or large collections of research papers within a single request. The model is designed for coding, agentic workflows, computer use, and professional knowledge work at scale.
Sonnet 4.6 is particularly suited for developers and enterprises running high-volume workloads that require consistent instruction following, accurate tool selection, and reliable error correction across long sessions. It includes improved computer use capabilities, enabling it to navigate browsers, fill multi-step web forms, and automate desktop workflows. Anthropic's safety evaluations found it to be as safe as or safer than other recent Claude models, with noted resistance to prompt injection attacks.
What Claude 4.6 Sonnet supports
1M Token Context
Accepts up to 1 million tokens in a single request (beta), enabling reasoning across entire codebases, lengthy contracts, or dozens of documents at once.
Advanced Coding
Supports the full software development lifecycle including planning, implementation, debugging, and large-scale refactors across multiple files.
Agentic Workflows
Handles long-running, multi-step autonomous tasks with improved instruction following, tool selection, and error correction over extended sessions.
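The error-correction pattern described above can be sketched as a tool loop that reports failures back to the model instead of aborting, letting it choose a different action. This is a minimal illustration, not Anthropic's implementation; `run_tool` and the `read_file` tool are hypothetical stand-ins.

```python
# Sketch of the retry/error-correction pattern used in long agent sessions:
# a failed tool call is packaged as an error result and returned to the
# model rather than crashing the loop. `run_tool` is a hypothetical stand-in.

def run_tool(name: str, args: dict) -> str:
    if name == "read_file" and args["path"] == "missing.txt":
        raise FileNotFoundError(args["path"])
    return "file contents"

def agent_step(tool_call: dict) -> dict:
    try:
        output = run_tool(tool_call["name"], tool_call["input"])
        return {"is_error": False, "content": output}
    except Exception as exc:
        # Surface the failure to the model so it can self-correct on the
        # next turn instead of ending the session.
        return {"is_error": True, "content": f"{type(exc).__name__}: {exc}"}

ok = agent_step({"name": "read_file", "input": {"path": "notes.txt"}})
bad = agent_step({"name": "read_file", "input": {"path": "missing.txt"}})
print(ok["is_error"], bad["is_error"])
```

In practice the error result would be appended to the conversation as a tool result with an error flag, prompting the model to retry or change approach.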
Computer Use
Controls browsers and desktop software to navigate complex spreadsheets, fill multi-step web forms, and automate workflows that previously required human intervention.
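On the client side, computer use means executing the actions the model emits against a real display. The sketch below assumes the action format of Anthropic's computer tool (an action name plus parameters such as coordinates); the `screen` dict is a stand-in for a real display driver.

```python
# Sketch of dispatching computer-use actions emitted by the model.
# Action names ("screenshot", "left_click", "type") follow Anthropic's
# computer tool; the screen handling here is illustrative only.

def handle_action(action: dict, screen: dict) -> str:
    kind = action.get("action")
    if kind == "screenshot":
        # Return the current frame so the model can see the screen state.
        return screen["image"]
    if kind == "left_click":
        x, y = action["coordinate"]
        return f"clicked ({x}, {y})"
    if kind == "type":
        return f"typed {action['text']!r}"
    raise ValueError(f"unsupported action: {kind}")

screen = {"image": "<base64 png>"}
print(handle_action({"action": "left_click", "coordinate": [640, 360]}, screen))
```

A full agent loop alternates these actions with screenshots so the model can verify the effect of each step before continuing.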
Tool Use
Supports structured tool calling, allowing the model to invoke external functions and APIs as part of a reasoning or task-completion workflow.
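A minimal sketch of what structured tool calling looks like from the caller's side, assuming the Anthropic Messages API shapes (a JSON Schema tool definition, a `tool_use` content block from the model, a `tool_result` block back); the `get_weather` tool and the stubbed model response are hypothetical.

```python
# Hypothetical tool the model may invoke.
def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # stand-in for a real weather API call

# JSON Schema tool definition, as passed in the `tools` parameter.
weather_tool = {
    "name": "get_weather",
    "description": "Get current weather for a city.",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}

# Stubbed `tool_use` content block of the kind the model returns when it
# decides to call the tool (a live request would produce this).
tool_use_block = {
    "type": "tool_use",
    "id": "toolu_01",
    "name": "get_weather",
    "input": {"city": "Paris"},
}

# Dispatch the call and package the output as a `tool_result` block for
# the follow-up request.
handlers = {"get_weather": get_weather}
result = handlers[tool_use_block["name"]](**tool_use_block["input"])
tool_result = {
    "type": "tool_result",
    "tool_use_id": tool_use_block["id"],
    "content": result,
}
print(tool_result["content"])
```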
MCP Integration
Compatible with Model Context Protocol (MCP) servers, enabling connection to external data sources and services through a standardized interface.
Reasoning
Applies multi-step reasoning to complex professional tasks including financial analysis, research synthesis, and frontend code generation.
Safety Guardrails
Includes Anthropic's safety evaluations with documented resistance to prompt injection attacks, rated as safe as or safer than other recent Claude models.
Benchmark scores
Scores represent accuracy — the percentage of questions answered correctly on each test.
| Benchmark | What it tests | Standard | Extended Thinking |
|---|---|---|---|
| GPQA Diamond | PhD-level science questions (biology, physics, chemistry) | 79.9% | 87.5% |
| HLE | Humanity's Last Exam: expert-written questions across many domains that challenge frontier models | 13.2% | 30.0% |
| SciCode | Scientific research coding and numerical methods | 46.9% | 46.8% |
| IFBench | Instruction following accuracy | 41.2% | 56.6% |
| Long Context Reasoning | Reasoning across long documents and contexts | 57.7% | 70.7% |
| TerminalBench Hard | Agentic coding and terminal command tasks | 46.2% | 53.0% |
| τ²-Bench | Agentic tool use in realistic scenarios | 79.5% | 75.7% |
| SWE-bench Verified | Real GitHub issues requiring multi-file code fixes | 79.6% | — |
| OSWorld-Verified | Autonomous computer use and desktop tasks | 72.5% | — |
| Terminal-Bench 2.0 | Agentic coding and terminal command tasks | 59.1% | — |
| ARC-AGI-2 | Novel abstract reasoning and pattern recognition | 58.3% | — |
| MMLU-Pro | Expert knowledge across 14 academic disciplines | 79.1% | — |
| MATH-500 | Undergraduate and competition-level math problems | 97.8% | — |
| MMMB | Multilingual and multimodal understanding | 76.1% | — |
| Finance Agent | Financial analysis and decision-making tasks | 63.3% | — |
| τ²-bench Retail | Agentic tool use in retail scenarios | 91.7% | — |
| τ²-bench Telecom | Agentic tool use in telecom scenarios | 97.9% | — |
| MCP-Atlas Tool Use | Structured tool use via Model Context Protocol | 61.3% | — |
Common questions about Claude 4.6 Sonnet
What is the context window for Claude Sonnet 4.6?
Claude Sonnet 4.6 supports a 1 million token context window, currently available in beta. This allows it to process large inputs such as entire codebases or lengthy document collections in a single request.
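Before sending a very large request, it helps to estimate whether the input fits the window. The sketch below uses a rough 4-characters-per-token heuristic, which is not the model's actual tokenizer; a production system should use a real token counter such as the API's token-counting endpoint.

```python
# Rough pre-flight check against the 1M-token context window.
CONTEXT_LIMIT = 1_000_000

def estimate_tokens(text: str) -> int:
    # Heuristic: English text averages roughly 4 characters per token.
    return max(1, len(text) // 4)

def fits_in_context(documents: list[str], reserve_for_output: int = 8_000) -> bool:
    # Leave headroom for the model's response tokens.
    used = sum(estimate_tokens(d) for d in documents)
    return used + reserve_for_output <= CONTEXT_LIMIT

docs = ["word " * 50_000, "word " * 50_000]  # ~125k estimated tokens total
print(fits_in_context(docs))
```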
What is the training data cutoff for Claude Sonnet 4.6?
The metadata for Claude Sonnet 4.6 lists February 2026, which corresponds to its release date rather than a confirmed training data cutoff. Consult Anthropic's official model documentation for the exact cutoff.
What types of tasks is Claude Sonnet 4.6 best suited for?
Claude Sonnet 4.6 is designed for coding, agentic workflows, computer use, and enterprise knowledge work. It is particularly well-suited for high-volume deployments requiring consistent instruction following and long-session reliability.
Does Claude Sonnet 4.6 support tool use and MCP servers?
Yes. Claude Sonnet 4.6 supports structured tool calling and is compatible with Model Context Protocol (MCP) servers, making it suitable for integration with external APIs and data sources.
How does Claude Sonnet 4.6 handle safety and security?
Anthropic's safety evaluations found Claude Sonnet 4.6 to be as safe as or safer than other recent Claude models. It has documented resistance to prompt injection attacks, which is relevant for agentic and computer use deployments.
What people think about Claude 4.6 Sonnet
Community discussions mentioning Claude Sonnet 4.6 appear in the context of broader model comparison threads, where users are evaluating coding performance across multiple AI models. Sentiment in coding-focused threads suggests interest in how Sonnet 4.6 performs on real-world software tasks relative to other available models.
Some threads note regressions in general benchmarks for competing models even when agentic coding scores improve, reflecting a common concern about uneven capability trade-offs across model updates. The LocalLLaMA coding comparison thread is the most directly relevant, with users sharing results from testing models on TypeScript projects in practical development scenarios.
Referenced community threads:
- Gemini 3.1 livebench results
- Livebench just dropped their run of codex 5.3. New SOTA for agentic coding, but regression overall
- I compared 8 AI coding models on the same real-world feature in an open-source TypeScript project. Here are the results
Documentation & links
Parameters & options
Extended Thinking: when enabled, the model works through its reasoning step by step before producing a final answer. This can help users understand how it reached its conclusions, though responses may be longer. The model dynamically decides when and how much to think.
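A minimal sketch of a Messages API request body with extended thinking enabled, following the `thinking` parameter shape in Anthropic's API documentation; the model id string and token budget used here are illustrative assumptions.

```python
# Request body with extended thinking enabled. The thinking budget caps how
# many tokens the model may spend reasoning before it answers.
request_body = {
    "model": "claude-sonnet-4-6",  # assumed model id
    "max_tokens": 16_000,
    "thinking": {
        "type": "enabled",
        "budget_tokens": 10_000,
    },
    "messages": [
        {"role": "user", "content": "Prove that the square root of 2 is irrational."}
    ],
}

# The thinking budget must stay below max_tokens, since thinking tokens
# count toward the overall output limit.
assert request_body["thinking"]["budget_tokens"] < request_body["max_tokens"]
print(request_body["thinking"]["type"])
```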
Explore similar models
Start building with Claude 4.6 Sonnet
No API keys required. Create AI-powered workflows with Claude 4.6 Sonnet in minutes — free.