Mistral Small 3.1 (25.03)
Single-node inference model with a 128k context window, supporting dozens of natural languages and more than 80 programming languages.
Multilingual small model with 128k context
Mistral Small 3.1 (25.03) is a text generation model developed by Mistral AI, released in March 2025. It features a 128,000-token context window, multimodal understanding, and support for dozens of spoken languages alongside more than 80 programming languages. The model is designed to run on a single node, making it practical for deployment without distributed infrastructure.
This version introduces improved text performance and expanded context handling compared to earlier Mistral Small releases. At an inference speed of approximately 150 tokens per second, it is suited for tasks that require both throughput and long-context processing, such as document analysis, multilingual applications, and code generation. Its combination of broad language coverage and single-node efficiency makes it a practical choice for developers building production applications with constrained compute budgets.
What Mistral Small 3.1 (25.03) supports
Long Context Window
Processes up to 128,000 tokens in a single request, enabling analysis of long documents, codebases, or extended conversations without truncation.
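To gauge whether a document fits in the 128,000-token window before sending it, a rough character-based estimate is often enough. The sketch below uses the common ~4 characters per token heuristic, which is an assumption for illustration; exact counts depend on the model's actual tokenizer.

```python
# Rough token-count estimate using the ~4 characters/token heuristic.
# This is an assumption for illustration; real counts depend on the tokenizer.
CONTEXT_WINDOW = 128_000  # tokens supported by Mistral Small 3.1 (25.03)
CHARS_PER_TOKEN = 4       # rough heuristic, not the real tokenizer


def fits_in_context(text: str, reserved_for_output: int = 4_000) -> bool:
    """Return True if the text likely fits, leaving room for the reply."""
    estimated_tokens = len(text) / CHARS_PER_TOKEN
    return estimated_tokens + reserved_for_output <= CONTEXT_WINDOW


# A ~200-page document (~400,000 characters) estimates to ~100,000 tokens.
print(fits_in_context("x" * 400_000))  # → True
print(fits_in_context("x" * 600_000))  # → False
```

Reserving a few thousand tokens for the model's reply avoids requests that fill the entire window with input and leave no room for output.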
Multilingual Text
Supports dozens of spoken languages for generation and comprehension tasks, making it suitable for international and localized applications.
Code Generation
Handles code tasks across 80+ programming languages, including generation, completion, and explanation.
Multimodal Understanding
Accepts image inputs alongside text, allowing the model to reason about visual content within a single prompt.
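A multimodal request combines text and image parts in a single user message. The sketch below builds such a message using the widely adopted OpenAI-style content-parts convention; the exact field names are assumptions here, so verify them against the provider's API reference before use.

```python
# Sketch of a multimodal chat message mixing text and an image reference.
# Field names follow the common OpenAI-style convention; treat them as
# assumptions and confirm against the provider's API documentation.
def build_multimodal_message(question: str, image_url: str) -> dict:
    """Build a single user message containing text plus an image reference."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


msg = build_multimodal_message(
    "What chart type is shown here?",
    "https://example.com/chart.png",  # placeholder URL for illustration
)
print(msg["content"][0]["type"])  # → text
```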
Fast Inference
Delivers approximately 150 tokens per second, supporting latency-sensitive production workloads on a single node.
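Because generation time scales roughly linearly with output length, the ~150 tokens/second figure quoted above supports a quick back-of-envelope latency estimate. Real throughput varies with hardware, batching, and prompt length, so treat this as a planning aid, not a guarantee.

```python
TOKENS_PER_SECOND = 150  # approximate rate quoted for this model


def estimated_generation_seconds(output_tokens: int) -> float:
    """Back-of-envelope time to generate output_tokens at ~150 tok/s."""
    return output_tokens / TOKENS_PER_SECOND


# A 1,500-token answer (~1,000 words) takes roughly ten seconds.
print(estimated_generation_seconds(1_500))  # → 10.0
```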
Function Calling
Supports structured tool use and function calling, enabling integration with external APIs and agentic workflows.
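Function calling requires describing each tool to the model as a JSON schema. The sketch below constructs one such definition in the widely used "function" format; the field layout and the `get_weather` function are illustrative assumptions, so check the provider's function-calling documentation for the exact shape it expects.

```python
# Sketch of a tool definition for function calling, using the widely
# adopted JSON-schema "function" format. Field names are assumptions;
# verify against the provider's function-calling documentation.
def make_weather_tool() -> dict:
    """Describe a hypothetical get_weather function the model may call."""
    return {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical function name
            "description": "Get current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        },
    }


tool = make_weather_tool()
print(tool["function"]["name"])  # → get_weather
```

When the model decides to call the tool, it returns the function name and JSON arguments; the application executes the real function and feeds the result back as a follow-up message.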
Benchmark scores
Scores represent accuracy — the percentage of questions answered correctly on each test.
| Benchmark | What it tests | Score |
|---|---|---|
| MMLU-Pro | Expert knowledge across 14 academic disciplines | 52.9% |
| GPQA Diamond | PhD-level science questions (biology, physics, chemistry) | 38.1% |
| MATH-500 | Undergraduate and competition-level math problems | 56.3% |
| AIME 2024 | American Invitational Mathematics Examination (competition math) problems | 6.3% |
| LiveCodeBench | Real-world coding tasks from recent competitions | 14.1% |
| HLE | Humanity's Last Exam: frontier-difficulty questions across many domains | 4.3% |
| SciCode | Scientific research coding and numerical methods | 15.6% |
Common questions about Mistral Small 3.1 (25.03)
What is the context window size for Mistral Small 3.1 (25.03)?
The model supports a context window of 128,000 tokens, allowing it to process long documents or extended conversations in a single request.
Does Mistral Small 3.1 (25.03) support image inputs?
Yes. This version includes multimodal understanding, meaning it can accept and reason about image inputs in addition to text.
How many programming languages does this model support?
The model supports over 80 programming languages, making it broadly applicable for code generation, completion, and explanation tasks.
What is the knowledge cutoff date for this model?
A specific training data cutoff date is not listed in the available metadata for this model version.
Can this model run on a single machine?
Yes. Mistral Small 3.1 (25.03) is designed for single-node inference, meaning it does not require distributed compute infrastructure to run.
Start building with Mistral Small 3.1 (25.03)
No API keys required. Create AI-powered workflows with Mistral Small 3.1 (25.03) in minutes — free.