MindStudio
Text Generation Model

Kimi K2.5

Moonshot AI's most powerful open-source model — a 1T-parameter native multimodal agent capable of visual understanding, advanced coding, and coordinated multi-agent task execution.

Publisher Moonshot AI
Type Text
Context Window 262,144 tokens
Released January 2026
Input $0.45/MTok
Output $2.80/MTok
Provider DeepInfra

Multimodal MoE model with native agent swarm

Kimi K2.5 is an open-source multimodal model developed by Moonshot AI and released in January 2026. It uses a Mixture-of-Experts architecture with 1 trillion total parameters and approximately 32 billion active at inference time, trained on roughly 15 trillion mixed visual and text tokens. Unlike models that add vision as a secondary capability, Kimi K2.5 was trained natively on both image and text data, enabling integrated understanding of charts, documents, video, and code.

The model supports two operating modes, Instant Mode for direct responses and Thinking Mode for step-by-step reasoning on complex problems, within a 262,144-token (256K) context window. It introduces an Agent Swarm paradigm that can coordinate up to 100 parallel sub-agents, reducing execution time by 4.5x on parallelizable tasks. Kimi K2.5 is released under a modified MIT license, permitting local deployment, fine-tuning, and commercial use, and it is particularly well suited to visual programming, document analysis, automated research, and multi-step agentic workflows.
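As a rough sketch of what the 262,144-token window allows, the check below estimates whether a document fits in context alongside a full-length response. The 4-characters-per-token ratio is a common heuristic for English text, not a figure published by Moonshot AI.

```python
# Rough token budgeting for Kimi K2.5's 262,144-token context window.
# Assumes ~4 characters per token, a common heuristic for English text
# (not an official Moonshot AI figure).

CONTEXT_WINDOW = 262_144   # tokens, from the model listing
MAX_RESPONSE = 16_384      # tokens, the listed max response size

def fits_in_context(text: str, chars_per_token: float = 4.0) -> bool:
    """Estimate whether `text` fits in the window with room for a full response."""
    estimated_tokens = len(text) / chars_per_token
    return estimated_tokens <= CONTEXT_WINDOW - MAX_RESPONSE

# A ~900k-character document (~225k tokens) fits; ~1.2M characters do not.
print(fits_in_context("x" * 900_000))    # True
print(fits_in_context("x" * 1_200_000))  # False
```

In practice you would count tokens with the model's actual tokenizer rather than a character heuristic; the point here is only that the budget for input shrinks by whatever you reserve for the response.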

What Kimi K2.5 supports

Visual Understanding

Processes images, charts, documents, and video natively, achieving scores of 90.1 on MathVista, 92.3 on OCRBench, and 87.4 on VideoMME.

Advanced Coding

Handles real-world software engineering tasks, scoring 76.8% on SWE-Bench Verified and 85.0% on LiveCodeBench v6.

Mathematical Reasoning

Applies step-by-step reasoning to math and science problems, scoring 96.1% on AIME 2025 and 87.9% on GPQA Diamond.

Agent Swarm Execution

Coordinates up to 100 parallel sub-agents for complex workflows, reducing execution time by 4.5x on parallelizable tasks and scoring 60.6% on BrowseComp.

Long Context Processing

Supports a 262,144-token (256K) context window, enabling analysis of long documents, extended codebases, and lengthy video content in a single pass.

Dual Inference Modes

Offers Instant Mode for fast, direct responses and Thinking Mode for deep, iterative reasoning on complex problems.

MoE Architecture

Uses a 1 trillion parameter Mixture-of-Experts design with ~32 billion parameters active per forward pass, balancing capacity with inference efficiency.
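The parameter counts and per-token prices quoted in this listing support some quick arithmetic. The snippet below is an illustration of those published numbers only, not code from any official SDK.

```python
# Quick arithmetic on the figures quoted in this listing:
# 1T total parameters, ~32B active, $0.45/MTok input, $2.80/MTok output.

TOTAL_PARAMS = 1_000_000_000_000
ACTIVE_PARAMS = 32_000_000_000
INPUT_PRICE = 0.45    # USD per million input tokens
OUTPUT_PRICE = 2.80   # USD per million output tokens

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost in USD for one request at the listed rates."""
    return input_tokens / 1e6 * INPUT_PRICE + output_tokens / 1e6 * OUTPUT_PRICE

# Only ~3.2% of the parameters are active on any forward pass.
print(f"{ACTIVE_PARAMS / TOTAL_PARAMS:.1%}")      # 3.2%

# A maxed-out request: full 262,144-token context, 16,384-token response.
print(f"${request_cost(262_144, 16_384):.3f}")    # $0.164
```

This is the appeal of the MoE design: inference cost scales with the ~32B active parameters per token, while the full 1T-parameter capacity remains available for routing.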

Ready to build with Kimi K2.5?

Get Started Free

Benchmark scores

Scores represent accuracy — the percentage of questions answered correctly on each test.

Benchmark What it tests Score
GPQA Diamond PhD-level science questions (biology, physics, chemistry) 87.9%
HLE Questions that challenge frontier models across many domains 29.4%
SciCode Scientific research coding and numerical methods 49.0%
AIME 2025 American math olympiad problems (2025) 96.1%
MMLU-Pro Expert knowledge across 14 academic disciplines 87.1%
SWE-bench Verified Real GitHub issues requiring multi-file code fixes 76.8%
SWE-bench Pro Challenging real-world software engineering tasks 50.7%
Terminal-Bench 2.0 Agentic coding and terminal command tasks 50.8%
LiveCodeBench Real-world coding tasks from recent competitions 85.0%
BrowseComp Complex web browsing and information retrieval 60.6%
OSWorld-Verified Autonomous computer use and desktop tasks 63.3%

Common questions about Kimi K2.5

What is the context window for Kimi K2.5?

Kimi K2.5 supports a context window of 262,144 tokens (256K), allowing it to process long documents, extended codebases, and lengthy video content in a single session.

Is Kimi K2.5 open-source and can it be used commercially?

Yes. Kimi K2.5 is released under a modified MIT license, which permits local deployment, fine-tuning, and integration into commercial applications.

What is the training data cutoff for Kimi K2.5?

Based on the available metadata, Kimi K2.5 was released in January 2026. A specific training data cutoff date is not stated in the provided metadata.

How does the Agent Swarm feature work?

Kimi K2.5 introduces an Agent Swarm paradigm that can coordinate up to 100 parallel sub-agents to execute complex, multi-step tasks. On parallelizable workloads, this reduces execution time by approximately 4.5x compared to sequential execution.
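Conceptually, a swarm coordinator follows a fan-out/fan-in pattern: split a task into independent sub-tasks, run them in parallel, and merge the results. The sketch below illustrates that pattern with a placeholder sub-agent; the function names are illustrative and do not reflect Moonshot AI's actual implementation.

```python
# Fan-out/fan-in sketch of the coordination pattern behind an agent swarm.
# `sub_agent` is a placeholder; in a real swarm it would be an LLM call.
from concurrent.futures import ThreadPoolExecutor

def sub_agent(subtask: str) -> str:
    return f"result[{subtask}]"

def swarm_execute(subtasks: list[str], max_agents: int = 100) -> list[str]:
    """Run subtasks on up to `max_agents` parallel workers, preserving order."""
    with ThreadPoolExecutor(max_workers=min(max_agents, len(subtasks))) as pool:
        return list(pool.map(sub_agent, subtasks))

print(swarm_execute([f"query-{i}" for i in range(3)]))
```

The 4.5x figure applies only when sub-tasks are genuinely independent; work with sequential dependencies between steps gains little from fanning out.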

What are the two inference modes available in Kimi K2.5?

Kimi K2.5 supports Instant Mode, which provides fast and direct responses suited for everyday tasks, and Thinking Mode, which performs deep step-by-step reasoning for complex problems such as advanced math or multi-stage coding challenges.

How many parameters does Kimi K2.5 have, and how many are active at inference?

Kimi K2.5 has 1 trillion total parameters in a Mixture-of-Experts architecture, with approximately 32 billion parameters active at any given inference step.

What people think about Kimi K2.5

Community discussions reflect positive sentiment toward Kimi K2.5's benchmark performance, with one thread noting that it scores in territory previously occupied only by proprietary models as of February 2026. Practical evaluations, including an informal multi-model comparison involving real-world business scenarios, placed it among a small group of models that performed reliably.

Some community interest centers on how models like Kimi K2.5 handle ambiguous or nonsensical prompts, as a dedicated benchmark thread explored whether models push back appropriately rather than generating confident but incorrect answers. No major reliability or safety concerns specific to Kimi K2.5 were surfaced in the threads reviewed.

View more discussions →

Parameters & options

Max Temperature 1
Max Response Size 16,384 tokens

Start building with Kimi K2.5

No API keys required. Create AI-powered workflows with Kimi K2.5 in minutes — free.