DeepSeek-R1
A reasoning LLM from the Chinese AI company DeepSeek, built around chain-of-thought generation.
Chain-of-thought reasoning for complex problems
DeepSeek-R1 is a text generation model developed by DeepSeek, a Chinese AI company. It is a reasoning-focused model that generates a Chain of Thought (CoT) before producing a final answer, a technique designed to improve accuracy on multi-step problems. The model was trained through late 2024 and supports a context window of 64,000 tokens. DeepSeek released the model weights publicly, making it available for local deployment and research use.
DeepSeek-R1 is well suited for tasks that benefit from structured reasoning, such as mathematics, logic puzzles, coding challenges, and scientific problem-solving. Because the model externalizes its reasoning steps before answering, users can inspect the thought process that led to a given response. DeepSeek also released a series of distilled versions of R1 based on smaller base models, broadening its accessibility across different hardware configurations.
What DeepSeek-R1 supports
Chain-of-Thought Reasoning
Generates an explicit reasoning trace before producing a final answer, allowing multi-step problems to be broken down systematically. This CoT process is visible in the model's output.
Math & Logic
Applies step-by-step reasoning to solve mathematical and logical problems, including proofs, equations, and structured inference tasks.
Code Generation
Produces and debugs code across common programming languages, using its reasoning process to work through algorithmic problems before outputting a solution.
Long-Context Processing
Handles input and output sequences within a 64,000-token context window, supporting analysis of lengthy documents or extended multi-turn conversations.
Open Weights Access
Model weights are publicly released by DeepSeek, enabling local deployment and fine-tuning without relying solely on the hosted API.
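DeepSeek-R1 and its distilled variants emit the chain of thought inside `<think>…</think>` tags before the final answer, so the visible reasoning trace can be separated from the response programmatically. A minimal sketch (the sample response text below is illustrative, not actual model output):

```python
import re

def split_reasoning(response: str) -> tuple[str, str]:
    """Split an R1-style response into (reasoning_trace, final_answer).

    Assumes the chain of thought is wrapped in <think>...</think> tags,
    as in DeepSeek-R1's chat output; returns an empty trace if no tags
    are present.
    """
    match = re.search(r"<think>(.*?)</think>", response, flags=re.DOTALL)
    if match is None:
        return "", response.strip()
    trace = match.group(1).strip()
    answer = response[match.end():].strip()
    return trace, answer

# Illustrative response text, not real model output.
sample = "<think>17 has no divisors between 2 and 4.</think>17 is prime."
trace, answer = split_reasoning(sample)
print(answer)  # -> 17 is prime.
```

Separating the trace this way is useful when only the final answer should be shown to end users, while the reasoning is logged for inspection.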
Benchmark scores
Scores represent accuracy — the percentage of questions answered correctly on each test.
| Benchmark | What it tests | Score |
|---|---|---|
| MMLU-Pro | Expert knowledge across 14 academic disciplines | 84.9% |
| GPQA Diamond | PhD-level science questions (biology, physics, chemistry) | 81.3% |
| MATH-500 | Undergraduate and competition-level math problems | 98.3% |
| AIME 2024 | American Invitational Mathematics Examination problems | 89.3% |
| LiveCodeBench | Real-world coding tasks from recent competitions | 77.0% |
| HLE (Humanity's Last Exam) | Questions that challenge frontier models across many domains | 14.9% |
| SciCode | Scientific research coding and numerical methods | 40.3% |
Common questions about DeepSeek-R1
What is the context window for DeepSeek-R1?
DeepSeek-R1 supports a context window of 64,000 tokens, which covers both input and output combined.
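Because the 64,000-token limit covers input and output combined, long prompts must leave headroom for the reasoning trace and the final answer. A rough budgeting sketch using the common ~4-characters-per-token heuristic (the exact count depends on DeepSeek's tokenizer, so treat this strictly as an estimate):

```python
CONTEXT_WINDOW = 64_000  # tokens, input + output combined
CHARS_PER_TOKEN = 4      # rough heuristic; real tokenization varies by text

def estimate_tokens(text: str) -> int:
    """Very rough token estimate; DeepSeek's tokenizer may count differently."""
    return max(1, len(text) // CHARS_PER_TOKEN)

def fits_in_window(prompt: str, reserved_output_tokens: int = 8_000) -> bool:
    """Check whether a prompt leaves `reserved_output_tokens` of headroom
    for the model's reasoning trace and final answer."""
    return estimate_tokens(prompt) + reserved_output_tokens <= CONTEXT_WINDOW

print(fits_in_window("Prove that 17 is prime."))  # -> True
```

For production use, counting tokens with the model's actual tokenizer is more reliable than a character heuristic.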
What makes DeepSeek-R1 different from a standard text generation model?
DeepSeek-R1 generates a Chain of Thought (CoT) before delivering its final answer. This means the model works through reasoning steps explicitly, which is intended to improve accuracy on complex or multi-step tasks.
What is the training data cutoff for DeepSeek-R1?
Based on the available metadata, DeepSeek-R1 was trained through late 2024. It does not have knowledge of events after that period.
Is DeepSeek-R1 available as open weights?
Yes. DeepSeek released the model weights for DeepSeek-R1 publicly on Hugging Face, allowing users to run the model locally or fine-tune it independently of the hosted API.
What types of tasks is DeepSeek-R1 best suited for?
DeepSeek-R1 is designed for tasks that benefit from structured reasoning, including mathematics, logic, coding, and scientific problem-solving. Its CoT approach makes it particularly useful when intermediate reasoning steps matter.
What people think about DeepSeek-R1
Community discussion around DeepSeek-R1 on r/LocalLLaMA has been largely positive, with users praising the model's reasoning capabilities and the quality of its updated releases. The May 2025 update (R1-0528) generated significant engagement, with multiple high-upvote threads highlighting strong performance across a range of tasks.
Some threads reflect enthusiasm about running the model locally given its open weights, while others note that the model's reasoning trace can be verbose, which may affect latency in production use cases. The R1-0528 update in particular drew attention for improvements over the original January 2025 release.
Representative thread titles include:

- "It's been one year since the release of Deepseek-R1"
- "deepseek-ai/DeepSeek-R1-0528"
- "DeepSeek R1 05 28 Tested. It finally happened. The ONLY model to score 100% on everything I threw at it."
- "DeepSeek: R1 0528 is lethal"