MindStudio
Text Generation Model

Llama 4 Maverick

Llama 4 Maverick is a high-performance multimodal model with 17 billion active parameters, supporting 12 languages and offering industry-leading image and text understanding.

Publisher Meta
Type Text
Context Window 130,000 tokens
Training Data Early 2025
Input $0.20/MTok
Output $0.60/MTok
Provider DeepInfra
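
At the listed rates ($0.20 per million input tokens, $0.60 per million output tokens), per-request cost is simple to estimate. A minimal sketch in Python; the function name and default rates are taken from the table above and would need updating if the provider's pricing changes:

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate: float = 0.20, output_rate: float = 0.60) -> float:
    """Estimate request cost in USD, given per-million-token rates."""
    return (input_tokens * input_rate + output_tokens * output_rate) / 1_000_000

# A 10,000-token prompt with a 2,000-token reply:
# 10,000 * 0.20 / 1e6 + 2,000 * 0.60 / 1e6 = 0.002 + 0.0012 = 0.0032 USD
cost = estimate_cost(10_000, 2_000)
```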

Multimodal MoE model with 128 experts

Llama 4 Maverick is a multimodal mixture-of-experts model developed by Meta, released in early 2025. It has 17 billion active parameters drawn from a pool of 400 billion total parameters across 128 experts, and supports both text and image inputs. The model handles 12 languages and offers a 130,000-token context window, making it suited for long-document and multilingual tasks.

Maverick is designed for general assistant and chat use cases, with particular strengths in image understanding and creative writing. It uses a sparse MoE architecture, meaning only a subset of parameters is activated per inference pass, which lets the model deliver broad capability at a lower compute cost. The primary audience is developers building applications that need cross-language support, visual reasoning, or extended context handling.
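
Since the model is served through DeepInfra, it can typically be called via an OpenAI-compatible chat completions endpoint. A hedged sketch using only the standard library; the endpoint URL and model identifier are assumptions based on DeepInfra's conventions, not values confirmed by this page:

```python
import json
import urllib.request

# Assumed OpenAI-compatible endpoint and model id; verify against the provider's docs.
API_URL = "https://api.deepinfra.com/v1/openai/chat/completions"
MODEL_ID = "meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8"

def build_chat_request(prompt: str, max_tokens: int = 1024,
                       temperature: float = 0.7) -> dict:
    """Build an OpenAI-style chat completion payload for Llama 4 Maverick."""
    return {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }

def send(payload: dict, api_key: str) -> str:
    """POST the payload and return the assistant's reply text."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}",
                 "Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

In practice, `send(build_chat_request("Summarize this contract..."), api_key)` would return the model's reply as a string.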

What Llama 4 Maverick supports

Multimodal Input

Accepts both text and image inputs in a single prompt, enabling tasks like visual question answering and image-based reasoning.

Long Context Window

Supports up to 130,000 tokens of context, allowing processing of long documents, extended conversations, or large code files in a single request.

Multilingual Support

Handles 12 languages natively, enabling chat and assistant tasks across a range of international languages without translation preprocessing.

Mixture-of-Experts Architecture

Uses 128 experts with 17 billion active parameters per forward pass out of 400 billion total, enabling broad capability with selective parameter activation.

Creative Writing

Generates structured and open-ended written content, with Meta noting response quality and tone as explicit design focuses.

Instruction Following

Tuned as an instruct model with built-in refusal mechanisms, designed to follow user instructions accurately while maintaining safety guardrails.
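
The sparse routing behind the MoE design above can be illustrated with a toy sketch: a small router scores all 128 experts for each token and only the top-scoring expert(s) actually run. The logits here are random placeholders, and top-1 routing is an assumption about how Maverick dispatches tokens; the point is only the selection mechanism:

```python
import math
import random

NUM_EXPERTS = 128  # routed experts, matching the count listed above
TOP_K = 1          # illustrative: route each token to a single expert

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def route(router_logits):
    """Return the indices and weights of the top-k experts for one token."""
    probs = softmax(router_logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    chosen = ranked[:TOP_K]
    return chosen, [probs[i] for i in chosen]

# One token's (random, placeholder) router scores:
random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in range(NUM_EXPERTS)]
experts, weights = route(logits)  # only these experts' parameters would run
```

This selectivity is why only 17B of the 400B total parameters are active per forward pass.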


Benchmark scores

Scores represent accuracy — the percentage of questions answered correctly on each test.

Benchmark What it tests Score
MMLU-Pro Expert knowledge across 14 academic disciplines 80.9%
GPQA Diamond PhD-level science questions (biology, physics, chemistry) 67.1%
MATH-500 Undergraduate and competition-level math problems 88.9%
AIME 2024 American Invitational Mathematics Examination problems 39.0%
LiveCodeBench Real-world coding tasks from recent competitions 39.7%
HLE (Humanity's Last Exam) Questions that challenge frontier models across many domains 4.8%
SciCode Scientific research coding and numerical methods 33.1%

Common questions about Llama 4 Maverick

What is the context window for Llama 4 Maverick?

Llama 4 Maverick supports a context window of 130,000 tokens, which allows it to process long documents, extended conversations, or large inputs in a single request.
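
One practical use of the 130,000-token figure is a pre-flight check that a request will fit. A sketch using the common rough heuristic of ~4 characters per English token; this is an estimate, not a tokenizer, and real counts vary by language and content:

```python
CONTEXT_WINDOW = 130_000  # tokens, as listed for this model

def rough_token_count(text: str) -> int:
    """Very rough token estimate: about 4 characters per token for English."""
    return max(1, len(text) // 4)

def fits_context(prompt: str, max_response_tokens: int) -> bool:
    """Check that prompt plus reserved response budget fits the window."""
    return rough_token_count(prompt) + max_response_tokens <= CONTEXT_WINDOW
```

For anything near the limit, a real tokenizer for the model should replace the heuristic.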

How many parameters does Llama 4 Maverick have?

The model has 400 billion total parameters across 128 experts, but only 17 billion parameters are active during any single inference pass due to its mixture-of-experts architecture.

What languages does Llama 4 Maverick support?

Llama 4 Maverick supports 12 languages, making it suitable for multilingual assistant and chat applications.

What types of inputs does Llama 4 Maverick accept?

The model is multimodal and accepts both text and image inputs, enabling use cases such as visual question answering and image-based reasoning alongside standard text tasks.
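
In OpenAI-compatible chat formats, image input is usually supplied as an `image_url` content part alongside text in a single message. A hedged sketch of such a payload; the field names follow the OpenAI convention, which providers hosting Maverick commonly mirror, but the exact schema should be checked against the provider's documentation:

```python
def build_vision_message(question: str, image_url: str) -> dict:
    """One user message combining a text question with an image reference."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }

# Example: ask a question about a remote image.
msg = build_vision_message("What is shown in this photo?",
                           "https://example.com/photo.png")
```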

When was Llama 4 Maverick trained?

According to the available metadata, Llama 4 Maverick has a training date of early 2025. A precise knowledge cutoff date has not been publicly specified in the available documentation.

What people think about Llama 4 Maverick

Community reception of Llama 4 Maverick has been mixed, with some users expressing disappointment in the model's real-world performance relative to expectations set by benchmark results. A thread with over 500 upvotes on r/LocalLLaMA specifically called out a gap between announced capabilities and observed outputs shortly after release.

Discussions across r/LocalLLaMA also include comparisons of Llama 4 against other open-weight models, with users evaluating it alongside competing releases in community-run grading exercises. Concerns around benchmark reliability and practical usability in local deployments appear frequently in these threads.


Parameters & options

Max Temperature 1
Max Response Size 60,000 tokens
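
Given the caps above (temperature at most 1, responses up to 60,000 tokens), a request builder might clamp caller-supplied values before sending. A small sketch, assuming these limits; the helper name is illustrative:

```python
MAX_TEMPERATURE = 1.0
MAX_RESPONSE_TOKENS = 60_000  # cap listed for this model on this platform

def clamp_params(temperature: float, max_tokens: int) -> tuple:
    """Clamp sampling parameters to the limits listed for this model."""
    t = min(max(temperature, 0.0), MAX_TEMPERATURE)
    n = min(max(max_tokens, 1), MAX_RESPONSE_TOKENS)
    return t, n
```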

Start building with Llama 4 Maverick

No API keys required. Create AI-powered workflows with Llama 4 Maverick in minutes — free.