Llama 4 Scout
Llama 4 Scout is a powerful multimodal model with 17 billion active parameters, offering state-of-the-art performance in its class.
Multimodal MoE model with 17B active parameters
Llama 4 Scout is a multimodal AI model developed by Meta, released in early 2025 as part of the Llama 4 model family. It uses a Mixture of Experts (MoE) architecture with 17 billion active parameters, 16 experts, and 109 billion total parameters, meaning only a subset of parameters is activated per token during inference. The model processes both text and image inputs within a unified backbone and supports a 130,000-token context window.
Llama 4 Scout is designed for developers and enterprises building applications that require combined text and vision understanding. Its MoE design makes it more compute-efficient during training and inference compared to dense models of similar total parameter counts. On MindStudio, it is served via Groq, which provides low-latency inference for the instruct-tuned variant.
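Since the model is served through Groq's OpenAI-compatible API, a request is an ordinary chat-completion payload. The sketch below is a minimal illustration, not official MindStudio integration code: the model identifier comes from this page, but the endpoint URL, auth header, and exact ID should be checked against Groq's current documentation.

```python
import json

# Hypothetical endpoint and model ID; verify against Groq's API docs.
GROQ_URL = "https://api.groq.com/openai/v1/chat/completions"
MODEL_ID = "llama-4-scout-17b-16e-instruct"

def build_request(prompt: str, max_tokens: int = 256) -> dict:
    """Assemble the JSON body for an instruct-style chat completion."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

payload = build_request("Summarize the MoE architecture in two sentences.")
print(json.dumps(payload, indent=2))
# To actually send it (requires an API key):
# requests.post(GROQ_URL, json=payload,
#               headers={"Authorization": f"Bearer {API_KEY}"})
```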
What Llama 4 Scout supports
Multimodal Input
Processes both text and image inputs within a single unified model backbone, enabling tasks that combine visual and language understanding.
Long Context Window
Supports up to 130,000 tokens of context, allowing it to handle long documents, extended conversations, or large code files in a single request.
Mixture of Experts
Uses a 16-expert MoE architecture with 109 billion total parameters, activating only 17 billion per token to reduce compute cost while maintaining output quality.
Instruction Following
Fine-tuned as an instruct model, enabling it to follow natural language instructions for tasks like summarization, Q&A, and structured generation.
Fast Inference via Groq
Served on Groq's LPU infrastructure, which is designed to deliver low-latency token generation for real-time applications.
Code Generation
Capable of generating, explaining, and debugging code across common programming languages as part of its general instruction-following training.
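The compute saving from the MoE design follows directly from the parameter counts quoted above: only 17 billion of the 109 billion total parameters are active for any given token. A quick back-of-envelope check:

```python
# Parameter figures from this page: 109B total, 17B active per token.
total_params = 109e9
active_params = 17e9

# Fraction of the model actually exercised on each forward pass.
active_fraction = active_params / total_params
print(f"Active per token: {active_fraction:.1%}")  # roughly 15.6%
```

In other words, each token touches about a sixth of the model's weights, which is what makes inference cheaper than a dense 109B model of comparable quality.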
Benchmark scores
Scores represent accuracy — the percentage of questions answered correctly on each test.
| Benchmark | What it tests | Score |
|---|---|---|
| MMLU-Pro | Expert knowledge across 14 academic disciplines | 75.2% |
| GPQA Diamond | PhD-level science questions (biology, physics, chemistry) | 58.7% |
| MATH-500 | Undergraduate and competition-level math problems | 84.4% |
| AIME 2024 | Problems from the American Invitational Mathematics Examination | 28.3% |
| LiveCodeBench | Real-world coding tasks from recent competitions | 29.9% |
| HLE | Humanity's Last Exam: questions that challenge frontier models across many domains | 4.3% |
| SciCode | Scientific research coding and numerical methods | 17.0% |
Common questions about Llama 4 Scout
What is the context window for Llama 4 Scout?
Llama 4 Scout supports a context window of 130,000 tokens, which allows for long documents, extended conversations, or large inputs to be processed in a single request.
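When deciding whether a long input will fit, a rough pre-check can help before calling the API. The sketch below uses the common four-characters-per-token heuristic; it is an estimate only, since exact counts require the model's actual tokenizer.

```python
# Context window figure from this page.
CONTEXT_WINDOW = 130_000

def estimate_tokens(text: str) -> int:
    """Crude token estimate using the ~4 chars/token rule of thumb."""
    return max(1, len(text) // 4)

doc = "word " * 50_000          # ~250,000 characters of filler text
est = estimate_tokens(doc)
print(est, est < CONTEXT_WINDOW)
```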
How many parameters does Llama 4 Scout have?
Llama 4 Scout has 109 billion total parameters, but uses a Mixture of Experts architecture that activates only 17 billion parameters per token during inference.
Does Llama 4 Scout support image inputs?
Yes. Llama 4 Scout is a multimodal model that can process both text and image inputs within a unified model backbone.
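In the OpenAI-compatible request format that Groq exposes, a mixed text-and-image turn is expressed as a content array. This is an illustrative sketch; the image URL is a placeholder, and the exact vision message schema should be confirmed in Groq's documentation.

```python
def vision_message(question: str, image_url: str) -> dict:
    """Build a single user turn combining a text question and an image."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

# Placeholder URL for illustration only.
msg = vision_message("What is shown in this chart?",
                     "https://example.com/chart.png")
print(msg["content"][1]["type"])
```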
When was Llama 4 Scout trained?
According to the model metadata, Llama 4 Scout's training data has a cutoff in early 2025.
Who publishes Llama 4 Scout and where is it hosted on MindStudio?
Llama 4 Scout is developed and published by Meta. On MindStudio, it is served via Groq using the llama-4-scout-17b-16e-instruct model variant.
What people think about Llama 4 Scout
Community reception of Llama 4 Scout on Reddit has been mixed, with some users acknowledging the model's architectural novelty and its availability on platforms like Hugging Face shortly after release. However, the most upvoted threads reflect significant disappointment, with many users feeling the model did not meet expectations set by Meta's pre-release benchmarks.
Common concerns include perceived gaps between benchmark performance and real-world usability, as well as comparisons to what users hoped the Llama 4 generation would deliver. The threads with the highest engagement (2,179 and 541 upvotes respectively) both center on unmet expectations rather than specific use case successes.
- "Meta's Llama 4 Fell Short"
- "I'm incredibly disappointed with Llama-4"
- meta-llama/Llama-4-Scout-17B-16E · Hugging Face
Start building with Llama 4 Scout
No API keys required. Create AI-powered workflows with Llama 4 Scout in minutes — free.