Text Generation Model

Gemini 2.5 Flash

Gemini 2.5 Flash is a thinking model with strong, well-rounded capabilities, designed to balance price and performance.

Publisher: Google
Type: Text
Context Window: 1,048,576 tokens
Training Data Cutoff: June 2025
Input: $0.30/MTok
Output: $2.50/MTok
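At the listed rates, the cost of a request can be estimated with simple per-token arithmetic. The sketch below is illustrative only; it assumes flat per-token billing and ignores any caching discounts or thinking-token surcharges that may apply.

```python
def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost at Gemini 2.5 Flash list prices."""
    INPUT_PER_MTOK = 0.30   # USD per 1M input tokens
    OUTPUT_PER_MTOK = 2.50  # USD per 1M output tokens
    return (input_tokens / 1_000_000) * INPUT_PER_MTOK \
         + (output_tokens / 1_000_000) * OUTPUT_PER_MTOK

# A request with 10,000 input tokens and 2,000 output tokens:
cost = estimate_cost_usd(10_000, 2_000)
# 0.01 * $0.30 + 0.002 * $2.50 = $0.008
```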
Large Context · Real-Time Latency · Tools

Balanced thinking model with large context window

Gemini 2.5 Flash is a text generation model developed by Google, designed to balance performance and cost efficiency. It is a thinking model, meaning it applies internal reasoning steps before producing a response, which supports more deliberate outputs across a range of tasks. The model supports a context window of 1,048,576 tokens, making it suitable for processing long documents, extended conversations, and large codebases in a single request.
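Before sending a very long input, it can be useful to check whether it plausibly fits within the 1,048,576-token window. The sketch below uses a rough ~4-characters-per-token heuristic (an assumption, not the model's actual tokenizer) as a pre-flight check:

```python
CONTEXT_WINDOW = 1_048_576  # Gemini 2.5 Flash context window, in tokens

def fits_in_context(text: str, reserved_output_tokens: int = 8_192,
                    chars_per_token: float = 4.0) -> bool:
    """Rough pre-flight check: does the prompt plausibly fit in the
    context window while leaving room for the response? The
    ~4 chars/token ratio is a heuristic, not a real token count."""
    estimated_tokens = len(text) / chars_per_token
    return estimated_tokens + reserved_output_tokens <= CONTEXT_WINDOW
```

For accurate counts, use the API's own token-counting endpoint rather than a character heuristic.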

Gemini 2.5 Flash is well-suited for tasks that require both speed and reasoning, such as summarization, question answering, tool use, and multi-step instruction following. It supports tool integrations, allowing it to be used in agentic workflows where external functions or APIs need to be called. The model reached general availability with a training data cutoff of June 2025, and is accessible through Google's Vertex AI platform.

What Gemini 2.5 Flash supports

Extended Context Window

Processes up to 1,048,576 tokens in a single request, enabling analysis of long documents, large codebases, or extended conversation histories without truncation.

Thinking / Reasoning

Applies internal chain-of-thought reasoning before generating a final response, supporting more deliberate outputs on multi-step or complex tasks.

Tool Use

Supports structured tool and function calling, allowing the model to invoke external APIs or defined functions as part of an agentic workflow.
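A minimal sketch of what a tool declaration looks like, assuming the JSON-schema style used by the Gemini API's `tools` field. The `get_weather` function is hypothetical, and exact field names should be verified against Google's documentation:

```python
# Illustrative tool declaration; `get_weather` is a hypothetical function,
# and the field names follow the Gemini API's declared-function convention.
weather_tool = {
    "function_declarations": [
        {
            "name": "get_weather",
            "description": "Get current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string", "description": "City name"},
                },
                "required": ["city"],
            },
        }
    ]
}

# The declaration rides along with the user prompt in the request body;
# the model may then respond with a structured call to `get_weather`.
request_body = {
    "contents": [{"role": "user", "parts": [{"text": "Weather in Oslo?"}]}],
    "tools": [weather_tool],
}
```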

Low-Latency Output

Optimized for real-time response latency, making it suitable for interactive applications and user-facing products that require timely replies.

Multimodal Input

Accepts text alongside other input modalities including images, enabling tasks like document understanding, visual question answering, and image-based reasoning.

Structured Configuration

Supports numeric and select-type parameters for controlling generation behavior, such as temperature and output length, through the API.
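A sketch of such a configuration, assuming the camelCase field names used by the REST API's generation-config object (verify against the official documentation before relying on them):

```python
# Generation parameters for a request. Field names follow the REST API's
# camelCase convention; the specific values here are just examples.
generation_config = {
    "temperature": 0.7,        # 0-2 for this model; higher = more varied
    "maxOutputTokens": 1024,   # capped at 65,535 for Gemini 2.5 Flash
}

# Sanity-check against the model's documented limits.
assert 0 <= generation_config["temperature"] <= 2
assert generation_config["maxOutputTokens"] <= 65_535
```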


Benchmark scores

Scores represent accuracy — the percentage of questions answered correctly on each test.

Benchmark What it tests Score
MMLU-Pro Expert knowledge across 14 academic disciplines 80.9%
GPQA Diamond PhD-level science questions (biology, physics, chemistry) 68.3%
MATH-500 Undergraduate and competition-level math problems 93.2%
AIME 2024 American math olympiad problems 50.0%
LiveCodeBench Real-world coding tasks from recent competitions 49.5%
HLE Questions that challenge frontier models across many domains 5.1%
SciCode Scientific research coding and numerical methods 29.1%

Common questions about Gemini 2.5 Flash

What is the context window size for Gemini 2.5 Flash?

Gemini 2.5 Flash supports a context window of 1,048,576 tokens, which allows it to process very long inputs such as large documents or extended conversation histories in a single request.

What is the training data cutoff for Gemini 2.5 Flash?

The model's training data cutoff is June 2025, meaning it has knowledge of events and information up to that date.

Does Gemini 2.5 Flash support tool or function calling?

Yes, Gemini 2.5 Flash supports tool use and function calling, making it suitable for agentic workflows where the model needs to interact with external APIs or defined functions.

What type of model is Gemini 2.5 Flash?

Gemini 2.5 Flash is a thinking model, meaning it uses internal reasoning steps before producing its final output. It is categorized as a text generation model published by Google.

Where can Gemini 2.5 Flash be accessed?

Gemini 2.5 Flash is available through Google's Vertex AI platform and can also be used via MindStudio without requiring separate API key management.

What people think about Gemini 2.5 Flash

Community discussion of Gemini 2.5 Flash centers on its image-editing capabilities and the rollout of related model variants; posts showcasing the Flash Image model's edits have drawn hundreds of upvotes. Users have also noted Google's informal nickname for a Flash image preview variant, "Nano Banana," which generated notable engagement.

A recurring concern in the community is pricing, with one widely discussed thread noting that the cost of thinking output tokens doubled from $0.15 to $0.30 after the model reached general availability. This pricing change prompted significant discussion among developers evaluating the model for cost-sensitive production use cases.


Parameters & options

Max Temperature: 2
Max Response Size: 65,535 tokens
Thinking Budget (select): Off / Manual / Auto (default: Auto)
Thinking Budget Limit (number): range 1–24,576; must be less than Max Response Size

Start building with Gemini 2.5 Flash

No API keys required. Create AI-powered workflows with Gemini 2.5 Flash in minutes — free.