Mistral Large 3
Mistral Large 3 is Mistral’s first mixture-of-experts model since the seminal Mixtral series, and represents a substantial step forward in the company’s pretraining efforts.
Open-weight MoE model with multilingual and vision support
Mistral Large 3 is a 675-billion-parameter mixture-of-experts (MoE) text generation model developed by Mistral. It is the first MoE model Mistral has released since the Mixtral series, and was trained from scratch on 3,000 NVIDIA H200 GPUs. The model is released under a permissive open-weight license, making the weights publicly available for download and self-hosting.
Mistral Large 3 supports a 256,000-token context window and includes image understanding alongside text generation. It is particularly noted for multilingual conversation handling, with Mistral highlighting non-English and non-Chinese language performance as a focus area. The model is well-suited for tasks requiring long-context reasoning, multilingual text processing, and instruction following across general-purpose prompts.
What Mistral Large 3 supports
Long Context Window
Processes up to 256,000 tokens in a single context, enabling analysis of long documents, codebases, or extended conversations.
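To get a feel for what a 256,000-token window holds, a rough feasibility check is shown below. The 4-characters-per-token ratio is a common heuristic for English text, not Mistral's actual tokenizer, and the reserved output budget is an illustrative assumption.

```python
# Rough check of whether a document fits in a 256K context window.
# CHARS_PER_TOKEN = 4 is a common English-text heuristic, not Mistral's tokenizer.
CONTEXT_WINDOW = 256_000
CHARS_PER_TOKEN = 4

def fits_in_context(text: str, reserved_for_output: int = 4_000) -> bool:
    """Estimate token count from character length and compare to the window."""
    est_tokens = len(text) // CHARS_PER_TOKEN
    return est_tokens + reserved_for_output <= CONTEXT_WINDOW

print(fits_in_context("word " * 100_000))  # ~125K estimated tokens -> True
```

By this estimate, roughly a million characters of English prose (several hundred pages) fit in a single prompt with room left for the response.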
Mixture-of-Experts Architecture
Uses a sparse MoE design across 675 billion total parameters, activating only a subset of experts per token during inference.
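The idea behind sparse MoE can be sketched in a few lines: a router scores every expert for each token, but only the top-k experts actually run, so per-token compute scales with k rather than with the total expert count. This is a generic illustration of top-k expert routing, not Mistral's actual architecture; the expert count, dimensions, and gating details here are arbitrary.

```python
import numpy as np

def moe_forward(x, gate_w, experts, top_k=2):
    """Illustrative sparse MoE layer: route one token to its top-k experts.

    x: (dim,) token hidden state; gate_w: (num_experts, dim) router weights;
    experts: list of callables, each mapping (dim,) -> (dim,).
    """
    logits = gate_w @ x                  # one router score per expert
    top = np.argsort(logits)[-top_k:]    # indices of the top-k experts
    weights = np.exp(logits[top])
    weights /= weights.sum()             # softmax over the selected experts only
    # Only the chosen experts run, so compute scales with top_k, not num_experts.
    return sum(w * experts[i](x) for w, i in zip(weights, top))

rng = np.random.default_rng(0)
dim, num_experts = 8, 4
gate_w = rng.normal(size=(num_experts, dim))
# Each "expert" is a random linear map, standing in for a feed-forward block.
experts = [lambda x, W=rng.normal(size=(dim, dim)): W @ x for _ in range(num_experts)]
y = moe_forward(rng.normal(size=dim), gate_w, experts)
print(y.shape)  # (8,)
```

In a model like Mistral Large 3, this routing happens per token per MoE layer, which is why a 675B-parameter model can run with only a fraction of those parameters active on any given token.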
Multilingual Text Generation
Handles conversations and instructions in a wide range of languages, with Mistral specifically highlighting performance on non-English and non-Chinese languages.
Image Understanding
Accepts image inputs alongside text, enabling tasks such as visual question answering and image-based reasoning.
Open-Weight Access
Model weights are publicly available on Hugging Face under a permissive license, supporting local deployment and fine-tuning.
Instruction Following
Post-training aligns the model to follow general-purpose instructions, with Mistral reporting parity with leading instruction-tuned open-weight models on general prompts.
Benchmark scores
Scores represent accuracy — the percentage of questions answered correctly on each test.
| Benchmark | What it tests | Score |
|---|---|---|
| MMLU-Pro | Expert knowledge across 14 academic disciplines | 80.7% |
| GPQA Diamond | PhD-level science questions (biology, physics, chemistry) | 68.0% |
| LiveCodeBench | Real-world coding tasks from recent competitions | 46.5% |
| HLE (Humanity's Last Exam) | Extremely difficult questions across many domains, designed to challenge frontier models | 4.1% |
| SciCode | Scientific research coding and numerical methods | 36.2% |
Common questions about Mistral Large 3
What is the context window for Mistral Large 3?
Mistral Large 3 supports a context window of 256,000 tokens.
How many parameters does Mistral Large 3 have?
Mistral Large 3 has 675 billion total parameters and uses a mixture-of-experts architecture, meaning only a subset of parameters are active for any given token.
Is Mistral Large 3 open source?
Yes. Mistral Large 3 is released as an open-weight model, meaning the weights are publicly available. The model can be downloaded from Hugging Face and run locally or fine-tuned.
What input types does Mistral Large 3 support?
Mistral Large 3 supports text input and also includes image understanding capabilities, allowing it to process image inputs alongside text prompts.
What hardware was used to train Mistral Large 3?
According to Mistral, the model was trained from scratch on 3,000 NVIDIA H200 GPUs.
Is there a knowledge cutoff date for Mistral Large 3?
A specific training data cutoff date has not been published in the available metadata for Mistral Large 3.
What people think about Mistral Large 3
Community discussion on r/LocalLLaMA has been active around Mistral Large 3, with threads covering its availability on Hugging Face and interest in GGUF quantizations for local deployment. Users have shown enthusiasm for the model's open-weight release and its large 675B parameter count.
Some users have expressed mixed impressions about real-world performance relative to expectations, as reflected in a thread titled "Unimpressed with Mistral Large 3 675B." Discussions have also touched on inference framework support, including upcoming vLLM compatibility and EQ-Bench evaluation results.
Related threads:
- Unimpressed with Mistral Large 3 675B
- Mistral 3 Large 675B up on huggingface
- Who’s got them Q_001_X_S_REAP Mistral Large 3 GGUFs?
- EQ-Bench updates: Gpt-5.2, Opus 4.5, Mistral Large 3 and Nanbeige4-3B
- Upcoming vllm Mistral Large 3 support
Start building with Mistral Large 3
No API keys required. Create AI-powered workflows with Mistral Large 3 in minutes — free.