MindStudio
Text Generation Model

Llama 4 Scout

Llama 4 Scout is a powerful multimodal model with 17 billion active parameters, offering state-of-the-art performance in its class.

Publisher Meta
Type Text
Context Window 130,000 tokens
Training Data Early 2025
Input $0.10/MTok
Output $0.30/MTok
Provider DeepInfra
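The listed rates make per-request cost easy to estimate. A minimal sketch, using the input and output prices shown above (the helper function and example token counts are illustrative, not part of any official SDK):

```python
def request_cost_usd(input_tokens: int, output_tokens: int,
                     input_rate: float = 0.10, output_rate: float = 0.30) -> float:
    """Estimate request cost in USD from per-million-token (MTok) rates."""
    return input_tokens / 1e6 * input_rate + output_tokens / 1e6 * output_rate

# e.g. summarizing a 120,000-token document into a 2,000-token answer:
print(round(request_cost_usd(120_000, 2_000), 4))  # → 0.0126
```

At these rates, even a request that fills most of the context window costs on the order of a cent.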

Multimodal MoE model with 17B active parameters

Llama 4 Scout is a multimodal AI model developed by Meta, released in early 2025 as part of the Llama 4 model family. It uses a Mixture of Experts (MoE) architecture with 17 billion active parameters, 16 experts, and 109 billion total parameters, processing both text and image inputs through a unified model backbone. The model supports a 130,000-token context window and is available under Meta's Llama 4 Community License.

Llama 4 Scout is designed for developers and enterprises building applications that require multimodal understanding across text and vision. Its MoE design activates only a subset of parameters per token, making inference more compute-efficient relative to dense models of comparable total parameter count. It is well-suited for tasks such as document analysis, image-grounded question answering, and long-context text generation.
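Providers that serve Llama 4 Scout, including DeepInfra, typically expose it through an OpenAI-compatible chat endpoint, where text and image inputs are mixed as content parts in a single user message. A minimal sketch of such a request payload; the model identifier is the one published on Hugging Face, and the question and image URL are placeholders:

```python
def build_multimodal_request(model: str, question: str, image_url: str) -> dict:
    """Build an OpenAI-style chat payload mixing text and image content parts."""
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
        "max_tokens": 1024,
    }

payload = build_multimodal_request(
    "meta-llama/Llama-4-Scout-17B-16E-Instruct",  # model id as published on Hugging Face
    "What trend does this chart show?",
    "https://example.com/chart.png",
)
```

The payload would then be POSTed to the provider's chat-completions endpoint with an API key; exact endpoint URLs and authentication vary by provider.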

What Llama 4 Scout supports

Multimodal Input

Processes both text and image inputs within a single unified model backbone, enabling vision-language tasks like image-grounded Q&A.

Long Context Window

Supports up to 130,000 tokens of context, allowing analysis of lengthy documents or multi-turn conversations without truncation.

Mixture of Experts

Uses a 16-expert MoE architecture that activates only 17B of 109B total parameters per token, reducing compute requirements during inference.
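The published figures are mutually consistent under a simple back-of-envelope model. Assuming each token activates a set of shared (always-on) parameters plus exactly one of the 16 routed experts (an assumption for illustration, not a statement of Meta's exact layer layout), the per-expert and shared sizes fall out directly:

```python
# Back-of-envelope check of the published figures, assuming each token
# activates the shared parameters plus exactly one of 16 routed experts.
TOTAL_B, ACTIVE_B, NUM_EXPERTS = 109.0, 17.0, 16

# shared + 16 * per_expert = 109   and   shared + 1 * per_expert = 17
per_expert_b = (TOTAL_B - ACTIVE_B) / (NUM_EXPERTS - 1)  # billions of params per expert
shared_b = ACTIVE_B - per_expert_b                       # billions of shared params

print(round(per_expert_b, 2), round(shared_b, 2))  # → 6.13 10.87
```

So roughly 6B parameters per expert and 11B shared, which is why inference compute tracks the 17B active figure rather than the 109B total.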

Text Generation

Generates coherent, instruction-following text responses across a wide range of tasks including summarization, drafting, and question answering.

Instruction Following

Fine-tuned as an instruct model to follow user and system prompts, making it suitable for chat and task-oriented applications.

Ready to build with Llama 4 Scout?

Get Started Free

Benchmark scores

Scores represent accuracy: the percentage of questions answered correctly on each test.

Benchmark What it tests Score
MMLU-Pro Expert knowledge across 14 academic disciplines 75.2%
GPQA Diamond PhD-level science questions (biology, physics, chemistry) 58.7%
MATH-500 Undergraduate and competition-level math problems 84.4%
AIME 2024 Problems from the American Invitational Mathematics Examination 28.3%
LiveCodeBench Real-world coding tasks from recent competitions 29.9%
HLE Questions that challenge frontier models across many domains 4.3%
SciCode Scientific research coding and numerical methods 17.0%

Common questions about Llama 4 Scout

What is the context window size for Llama 4 Scout?

Llama 4 Scout supports a context window of 130,000 tokens, which allows it to handle long documents and extended conversations in a single pass.

How many parameters does Llama 4 Scout have?

Llama 4 Scout has 109 billion total parameters, but uses a Mixture of Experts architecture that activates only 17 billion parameters per token during inference.

What input types does Llama 4 Scout support?

Llama 4 Scout is a multimodal model that accepts both text and image inputs, processing them through a unified model backbone.

Who developed Llama 4 Scout and when was it released?

Llama 4 Scout was developed by Meta and released in early 2025 as part of the Llama 4 model family.

What is the knowledge cutoff for Llama 4 Scout?

The model's training data has a cutoff in early 2025, based on the training date provided in the model metadata.

What people think about Llama 4 Scout

Community reception of Llama 4 Scout on Reddit has been mixed, with some users acknowledging its multimodal architecture and efficient MoE design as technically interesting. However, the most upvoted threads reflect significant disappointment, with many users feeling the model did not meet expectations set by Meta's announcements.

Common criticisms include concerns about real-world performance not matching benchmark claims, and frustration with the gap between marketing and practical results. Threads discussing the Hugging Face release received comparatively little engagement, suggesting the broader community response was dominated by negative sentiment at launch.


Parameters & options

Max Temperature 1
Max Response Size 60,000 tokens

Start building with Llama 4 Scout

No API keys required. Create AI-powered workflows with Llama 4 Scout in minutes — free.