Text Generation Model

Llama 4 Scout

Llama 4 Scout is a powerful multimodal model with 17 billion active parameters, offering state-of-the-art performance in its class.

Publisher Meta
Type Text
Context Window 130,000 tokens
Training Data Early 2025
Input $0.11/MTok
Output $0.34/MTok
Provider Groq

Multimodal MoE model with 17B active parameters

Llama 4 Scout is a multimodal AI model developed by Meta, released in early 2025 as part of the Llama 4 model family. It uses a Mixture of Experts (MoE) architecture with 17 billion active parameters, 16 experts, and 109 billion total parameters, meaning only a subset of parameters is activated per token during inference. The model processes both text and image inputs within a unified backbone and supports a 130,000-token context window.

Llama 4 Scout is designed for developers and enterprises building applications that require combined text and vision understanding. Its MoE design makes it more compute-efficient during training and inference compared to dense models of similar total parameter counts. On MindStudio, it is served via Groq, which provides low-latency inference for the instruct-tuned variant.
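As a concrete illustration, here is a minimal sketch of building a chat-completions request for the instruct-tuned variant. The payload shape follows the widely used OpenAI-compatible format; the function name and defaults are illustrative assumptions (MindStudio's workflow builder does not require direct API calls or keys), and the model id is taken from this page's metadata.

```python
import json

# Hypothetical request payload for an OpenAI-compatible chat-completions
# endpoint serving Llama 4 Scout. The model id comes from this page's
# metadata; the helper and its defaults are illustrative, not an official API.
def build_request(prompt: str, max_tokens: int = 1024, temperature: float = 0.7) -> dict:
    return {
        "model": "llama-4-scout-17b-16e-instruct",  # instruct-tuned variant
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,       # response size is capped at 8,192 tokens here
        "temperature": temperature,     # max temperature on this deployment is 1
    }

payload = build_request("Summarize the Llama 4 Scout model card in two sentences.")
print(json.dumps(payload, indent=2))
```

The same payload shape works for multimodal requests by adding image content parts to the `messages` array, per whichever OpenAI-compatible serving stack is in use.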

What Llama 4 Scout supports

Multimodal Input

Processes both text and image inputs within a single unified model backbone, enabling tasks that combine visual and language understanding.

Long Context Window

Supports up to 130,000 tokens of context, allowing it to handle long documents, extended conversations, or large code files in a single request.
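A quick way to sanity-check whether an input fits in that window is a character-count heuristic. The 4-characters-per-token ratio below is a rough rule of thumb for English text, not the model's actual tokenizer, so treat the result as an estimate only:

```python
# Rough check that a prompt fits in Llama 4 Scout's 130,000-token context
# window. CHARS_PER_TOKEN is a common heuristic average for English text,
# not an exact tokenizer; precise counts require the model's own tokenizer.
CONTEXT_WINDOW = 130_000
CHARS_PER_TOKEN = 4

def fits_in_context(text: str, reserved_for_output: int = 8_192) -> bool:
    estimated_tokens = len(text) / CHARS_PER_TOKEN
    return estimated_tokens <= CONTEXT_WINDOW - reserved_for_output

# ~15,000 estimated tokens: comfortably inside the window
print(fits_in_context("hello " * 10_000))
```

Reserving room for the output (8,192 tokens, matching this deployment's max response size) keeps the prompt from crowding out the reply.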

Mixture of Experts

Uses a 16-expert MoE architecture with 109 billion total parameters, activating only 17 billion per token to reduce compute cost while maintaining output quality.
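The routing idea can be sketched in a few lines. This toy example mirrors the 16-expert layout but uses a seeded random scorer as a stand-in for the learned router; it is illustrative only, not Meta's actual implementation:

```python
# Toy sketch of Mixture-of-Experts routing: a router scores every expert,
# but only the top-scoring subset runs for each token, so most of the
# 109B total parameters stay idle on any given forward pass.
import random

NUM_EXPERTS = 16  # matches Llama 4 Scout's expert count
TOP_K = 1         # only a fraction of experts activates per token

def route_token(token_id: int, top_k: int = TOP_K) -> list[int]:
    rng = random.Random(token_id)  # deterministic stand-in for a learned router
    scores = [rng.random() for _ in range(NUM_EXPERTS)]
    ranked = sorted(range(NUM_EXPERTS), key=lambda e: scores[e], reverse=True)
    return ranked[:top_k]

active = route_token(42)
print(f"token 42 routes to expert(s) {active} of {NUM_EXPERTS}")
print(f"active fraction of parameters: ~{17 / 109:.0%}")  # 17B of 109B
```

The last line shows why this matters for cost: only about 16% of the total parameters (17B of 109B) do work per token, which is what makes the model cheaper to run than a dense model of the same total size.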

Instruction Following

Fine-tuned as an instruct model, enabling it to follow natural language instructions for tasks like summarization, Q&A, and structured generation.

Fast Inference via Groq

Served on Groq's LPU infrastructure, which is designed to deliver low-latency token generation for real-time applications.

Code Generation

Capable of generating, explaining, and debugging code across common programming languages as part of its general instruction-following training.


Benchmark scores

Scores represent accuracy — the percentage of questions answered correctly on each test.

Benchmark What it tests Score
MMLU-Pro Expert knowledge across 14 academic disciplines 75.2%
GPQA Diamond PhD-level science questions (biology, physics, chemistry) 58.7%
MATH-500 Undergraduate and competition-level math problems 84.4%
AIME 2024 American Invitational Mathematics Examination problems (2024) 28.3%
LiveCodeBench Real-world coding tasks from recent competitions 29.9%
HLE Humanity's Last Exam: expert-written questions designed to challenge frontier models across many domains 4.3%
SciCode Scientific research coding and numerical methods 17.0%

Common questions about Llama 4 Scout

What is the context window for Llama 4 Scout?

Llama 4 Scout supports a context window of 130,000 tokens, which allows for long documents, extended conversations, or large inputs to be processed in a single request.

How many parameters does Llama 4 Scout have?

Llama 4 Scout has 109 billion total parameters, but uses a Mixture of Experts architecture that activates only 17 billion parameters per token during inference.

Does Llama 4 Scout support image inputs?

Yes. Llama 4 Scout is a multimodal model that can process both text and image inputs within a unified model backbone.

When was Llama 4 Scout trained?

According to the model metadata, Llama 4 Scout's training data has a cutoff in early 2025.

Who publishes Llama 4 Scout and where is it hosted on MindStudio?

Llama 4 Scout is developed and published by Meta. On MindStudio, it is served via Groq using the llama-4-scout-17b-16e-instruct model variant.

What people think about Llama 4 Scout

Community reception of Llama 4 Scout on Reddit has been mixed, with some users acknowledging the model's architectural novelty and its availability on platforms like Hugging Face shortly after release. However, the most upvoted threads reflect significant disappointment, with many users feeling the model did not meet expectations set by Meta's pre-release benchmarks.

Common concerns include perceived gaps between benchmark performance and real-world usability, as well as comparisons to what users hoped the Llama 4 generation would deliver. The highest-engagement threads (2,179 and 541 upvotes, respectively) both center on unmet expectations rather than successful use cases.


Parameters & options

Max Temperature 1
Max Response Size 8,192 tokens

Start building with Llama 4 Scout

No API keys required. Create AI-powered workflows with Llama 4 Scout in minutes — free.