Text Generation Model

DeepSeek V3.2

DeepSeek-V3.2 is a reasoning-first open large language model that combines efficient sparse attention with advanced agentic capabilities, delivering performance that DeepSeek reports as GPT-5-level.

Publisher DeepSeek
Type Text
Context Window 160,000 tokens
Training Data December 2025
Input $0.26/MTok
Output $0.38/MTok
Provider DeepInfra

Open-weight reasoning model with agentic tool use

DeepSeek-V3.2 is an open-weight large language model developed by DeepSeek and released on December 1, 2025. It uses a Mixture-of-Experts architecture combined with a novel sparse attention mechanism called DeepSeek Sparse Attention (DSA), which reduces attention complexity from quadratic O(L²) to near-linear O(kL), where L is the sequence length and k is the number of keys each query attends to. The model supports a 160,000-token context window and is available under the MIT License on Hugging Face.
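
To make the complexity claim concrete, here is a minimal, illustrative sketch of top-k sparse attention in the spirit of DSA. The scoring step is computed densely below purely for clarity; in the real mechanism a lightweight learned indexer performs the selection, so only the attention over the k chosen keys is expensive. All function and variable names are our own illustration, not DeepSeek's implementation.

```python
import numpy as np

def topk_sparse_attention(q, k, v, top_k):
    """Each query attends only to its top_k highest-scoring keys,
    so the attention itself costs O(top_k * L) rather than O(L^2)."""
    L, d = q.shape
    scores = q @ k.T / np.sqrt(d)                     # (L, L) selection scores
    # NOTE: computed densely here for clarity; DSA's learned "lightning
    # indexer" makes this selection step cheap in the real model.
    idx = np.argpartition(-scores, top_k - 1, axis=-1)[:, :top_k]
    mask = np.full_like(scores, -np.inf)
    np.put_along_axis(mask, idx, 0.0, axis=-1)        # unmask selected keys
    masked = scores + mask
    w = np.exp(masked - masked.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)                # softmax over top_k keys
    return w @ v

rng = np.random.default_rng(0)
L, d = 8, 16
q, k, v = (rng.normal(size=(L, d)) for _ in range(3))
out = topk_sparse_attention(q, k, v, top_k=4)
print(out.shape)  # (8, 16)
```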

DeepSeek-V3.2 introduces three notable technical advances: a scalable reinforcement learning training framework, a large-scale agentic task synthesis pipeline covering over 1,800 environments and 85,000+ complex instructions, and native support for Thinking in Tool-Use, the ability to reason while invoking external tools in both thinking and non-thinking modes. It is best suited for complex multi-step reasoning, agentic workflows involving search and code execution, long-context document processing, and AI applications that require integrated reasoning and tool use.
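
As a concrete illustration of tool use, the sketch below sends a request with a tool definition through an OpenAI-compatible client. The base URL, model identifier, and the get_weather tool are illustrative assumptions, not details taken from this page; consult your provider's documentation for the actual values.

```python
# Hedged sketch: calling DeepSeek-V3.2 with a tool definition through an
# OpenAI-compatible client. The base URL, model id, and the get_weather
# tool are illustrative assumptions.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.deepseek.com",  # assumed OpenAI-compatible endpoint
    api_key="YOUR_API_KEY",
)

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",  # hypothetical tool, for illustration only
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

response = client.chat.completions.create(
    model="deepseek-chat",  # assumed model id; check your provider's docs
    messages=[{"role": "user", "content": "Do I need an umbrella in Tokyo today?"}],
    tools=tools,
)

# With Thinking in Tool-Use, the model can reason before deciding to call
# the tool; either way the call arrives as structured tool_calls output.
print(response.choices[0].message.tool_calls)
```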

What DeepSeek V3.2 supports

Long-Context Processing

Handles inputs up to 160,000 tokens, enabling analysis of lengthy documents, codebases, or multi-turn conversations in a single context window.

Advanced Reasoning

Trained with a scalable reinforcement learning framework that extends post-training compute, supporting multi-step logical and mathematical reasoning tasks.

Thinking in Tool Use

Supports integrated reasoning during tool invocation, allowing the model to think through problems while calling external tools in both thinking and non-thinking modes.

Agentic Task Execution

Trained on a synthesis pipeline covering 1,800+ environments and 85,000+ complex instructions, enabling reliable performance on search, code, and general agent workflows.

Code Generation

Generates, explains, and debugs code across multiple programming languages, with demonstrated performance on competitive programming benchmarks including IOI and ICPC.

Mathematical Problem Solving

Achieves gold-medal-level results on the 2025 IMO, CMO, and ICPC World Finals benchmarks, reflecting strong symbolic and numerical reasoning capabilities.

Sparse Attention Efficiency

Uses DeepSeek Sparse Attention (DSA) to reduce attention computation to near-linear complexity (O(kL)), lowering resource requirements for long-context inference.

Open Weights Access

Released under the MIT License with full model weights available on Hugging Face, allowing local deployment and fine-tuning without usage restrictions.


Benchmark scores

Scores represent accuracy: the percentage of problems solved correctly on each test.

Benchmark What it tests Score
MMLU-Pro Expert knowledge across 14 academic disciplines 83.7%
GPQA Diamond PhD-level science questions (biology, physics, chemistry) 75.1%
LiveCodeBench Real-world coding tasks from recent competitions 59.3%
HLE Questions that challenge frontier models across many domains 10.5%
SciCode Scientific research coding and numerical methods 38.7%
AIME 2025 American math olympiad problems (2025) 96.0%
SWE-bench Verified Real GitHub issues requiring multi-file code fixes 77.2%

Common questions about DeepSeek V3.2

What is the context window size for DeepSeek-V3.2?

DeepSeek-V3.2 supports a context window of 160,000 tokens, making it suitable for long-document processing, extended conversations, and large codebase analysis.

Is DeepSeek-V3.2 open source?

Yes. DeepSeek-V3.2 is released as an open-weight model under the MIT License. The model weights are publicly available on Hugging Face at huggingface.co/deepseek-ai/DeepSeek-V3.2.
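
For example, the weights can be fetched with the huggingface_hub client. This is a minimal sketch, assuming the repository id listed above.

```python
# Minimal sketch: downloading the open weights, assuming the repository id
# listed above. The full checkpoint is very large; ensure ample disk space.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(repo_id="deepseek-ai/DeepSeek-V3.2")
print("weights saved under:", local_dir)
```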

What is the training data cutoff for DeepSeek-V3.2?

The model metadata lists a training data date of December 2025. Specific knowledge cutoff details are documented in the official technical report.

What makes DeepSeek-V3.2 different from earlier DeepSeek models?

DeepSeek-V3.2 introduces three new capabilities not present in earlier versions: DeepSeek Sparse Attention (DSA) for near-linear attention complexity, a scalable reinforcement learning post-training framework, and a large-scale agentic task synthesis pipeline covering 1,800+ environments. It is also the first DeepSeek model to support Thinking in Tool-Use.

Can DeepSeek-V3.2 be run locally?

Yes. Because the model weights are openly available under the MIT License on Hugging Face, developers can download and run DeepSeek-V3.2 locally. Community users have demonstrated running it on hardware configurations such as 16x AMD MI50 32GB GPUs using vLLM.
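
A minimal local-inference sketch with vLLM follows, assuming the Hugging Face repository id listed on this page; tensor_parallel_size must match your hardware (the community setup cited above sharded across 16 GPUs).

```python
# Hedged sketch of local inference with vLLM. The repo id is the one listed
# on this page; adjust tensor_parallel_size for your deployment.
from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-V3.2",
    tensor_parallel_size=16,   # e.g. the 16x AMD MI50 configuration
    trust_remote_code=True,    # DeepSeek checkpoints ship custom model code
)
params = SamplingParams(temperature=0.6, max_tokens=256)
outputs = llm.generate(["Summarize DeepSeek Sparse Attention in one sentence."], params)
print(outputs[0].outputs[0].text)
```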

What types of tasks is DeepSeek-V3.2 best suited for?

DeepSeek-V3.2 is designed for complex reasoning tasks, agentic workflows (including search and code agents), long-context retrieval, mathematical problem solving, and applications that require the model to reason while using external tools.

What people think about DeepSeek V3.2

Community reception on r/LocalLLaMA was largely positive at launch, with the Hugging Face release announcement thread receiving over 1,000 upvotes and 210 comments, reflecting strong interest in the open-weight release. Users highlighted the model's agentic capabilities and its MIT License as notable attributes.

Some users raised questions about real-world quality shortly after release, as reflected in a thread titled "is the new Deepseek v3.2 that bad?" with 68 comments discussing early impressions. A separate thread documented community experimentation with local hardware, including a 16x AMD MI50 setup achieving 10 tokens per second for text generation using vLLM.


Parameters & options

Max Temperature 1
Max Response Size 8,000 tokens

Start building with DeepSeek V3.2

No API keys required. Create AI-powered workflows with DeepSeek V3.2 in minutes — free.