Gemini 2.5 Flash
Gemini 2.5 Flash is a thinking model with strong, well-rounded capabilities, designed to balance price and performance.
Balanced thinking model with large context window
Gemini 2.5 Flash is a text generation model developed by Google, designed to balance performance and cost efficiency. It is a thinking model, meaning it applies internal reasoning steps before producing a response, which supports more deliberate outputs across a range of tasks. The model supports a context window of 1,048,576 tokens, making it suitable for processing long documents, extended conversations, and large codebases in a single request.
Gemini 2.5 Flash is well-suited for tasks that require both speed and reasoning, such as summarization, question answering, tool use, and multi-step instruction following. It supports tool integrations, allowing it to be used in agentic workflows where external functions or APIs need to be called. The model reached general availability with a training data cutoff of June 2025, and is accessible through Google's Vertex AI platform.
What Gemini 2.5 Flash supports
Extended Context Window
Processes up to 1,048,576 tokens in a single request, enabling analysis of long documents, large codebases, or extended conversation histories without truncation.
Thinking / Reasoning
Applies internal chain-of-thought reasoning before generating a final response, supporting more deliberate outputs on multi-step or complex tasks.
Tool Use
Supports structured tool and function calling, allowing the model to invoke external APIs or defined functions as part of an agentic workflow.
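As a sketch of what a function-calling request looks like, the snippet below builds a `generateContent`-style request body. The field names (`contents`, `tools`, `functionDeclarations`) follow the public Gemini API's REST schema; the `get_weather` function and its parameters are hypothetical examples, not part of the API.

```python
import json

# Sketch of a generateContent request body with one declared tool.
# get_weather is a hypothetical function the model could choose to call;
# the schema under "parameters" uses standard JSON Schema types.
request_body = {
    "contents": [
        {"role": "user", "parts": [{"text": "What's the weather in Paris?"}]}
    ],
    "tools": [
        {
            "functionDeclarations": [
                {
                    "name": "get_weather",
                    "description": "Look up current weather for a city.",
                    "parameters": {
                        "type": "object",
                        "properties": {"city": {"type": "string"}},
                        "required": ["city"],
                    },
                }
            ]
        }
    ],
}

print(json.dumps(request_body, indent=2))
```

When the model decides the tool is needed, the response contains a structured function call rather than free text, which the calling application executes before sending the result back.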
Low-Latency Output
Optimized for real-time response latency, making it suitable for interactive applications and user-facing products that require timely replies.
Multimodal Input
Accepts text alongside other input modalities including images, enabling tasks like document understanding, visual question answering, and image-based reasoning.
Structured Configuration
Supports numeric and select-type parameters for controlling generation behavior, such as temperature and output length, through the API.
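A minimal sketch of such a configuration block is shown below. The `temperature` and `maxOutputTokens` fields are standard Gemini API generation parameters; `thinkingConfig.thinkingBudget` is the API's knob for capping internal reasoning tokens on 2.5-series thinking models, though its exact availability and limits should be checked against current API documentation. The specific values here are illustrative only.

```python
# Sketch of a generationConfig block for a generateContent request.
# Field names follow the public Gemini API; values are illustrative.
generation_config = {
    "temperature": 0.7,        # sampling randomness, typically 0.0-2.0
    "maxOutputTokens": 1024,   # upper bound on response length
    "thinkingConfig": {
        "thinkingBudget": 512  # cap on internal reasoning tokens (2.5 models)
    },
}

# A valid temperature stays within the documented range.
assert 0.0 <= generation_config["temperature"] <= 2.0
```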
Ready to build with Gemini 2.5 Flash?
Benchmark scores
Scores represent accuracy — the percentage of questions answered correctly on each test.
| Benchmark | What it tests | Score |
|---|---|---|
| MMLU-Pro | Expert knowledge across 14 academic disciplines | 80.9% |
| GPQA Diamond | PhD-level science questions (biology, physics, chemistry) | 68.3% |
| MATH-500 | Undergraduate and competition-level math problems | 93.2% |
| AIME 2024 | American Invitational Mathematics Examination problems | 50.0% |
| LiveCodeBench | Real-world coding tasks from recent competitions | 49.5% |
| HLE | Humanity's Last Exam: questions that challenge frontier models across many domains | 5.1% |
| SciCode | Scientific research coding and numerical methods | 29.1% |
Common questions about Gemini 2.5 Flash
What is the context window size for Gemini 2.5 Flash?
Gemini 2.5 Flash supports a context window of 1,048,576 tokens, which allows it to process very long inputs such as large documents or extended conversation histories in a single request.
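To put that number in perspective, the context window is exactly 2^20 tokens. The word estimate below assumes roughly 0.75 English words per token, a common rule of thumb rather than an official figure.

```python
# The context window is exactly a power of two: 2**20 tokens.
context_window = 1_048_576
assert context_window == 2 ** 20

# Rough capacity estimate, assuming ~0.75 English words per token
# (a common heuristic, not an official conversion rate).
approx_words = int(context_window * 0.75)
print(f"{context_window:,} tokens is roughly {approx_words:,} English words")
```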
What is the training data cutoff for Gemini 2.5 Flash?
The model's training data cutoff is June 2025, meaning it has knowledge of events and information up to that date.
Does Gemini 2.5 Flash support tool or function calling?
Yes, Gemini 2.5 Flash supports tool use and function calling, making it suitable for agentic workflows where the model needs to interact with external APIs or defined functions.
What type of model is Gemini 2.5 Flash?
Gemini 2.5 Flash is a thinking model, meaning it uses internal reasoning steps before producing its final output. It is categorized as a text generation model published by Google.
Where can Gemini 2.5 Flash be accessed?
Gemini 2.5 Flash is available through Google's Vertex AI platform and can also be used via MindStudio without requiring separate API key management.
What people think about Gemini 2.5 Flash
Community discussions around Gemini 2.5 Flash reflect general interest in its image editing capabilities and the rollout of related model variants, with posts about the Flash Image model receiving hundreds of upvotes for demonstrating high-level image edits. Users have also noted Google's informal naming of a Flash image preview variant as "Nano Banana," which generated notable engagement.
A recurring concern in the community is pricing, with one widely discussed thread noting that the cost of thinking output tokens doubled from $0.15 to $0.30 after the model reached general availability. This pricing change prompted significant discussion among developers evaluating the model for cost-sensitive production use cases.
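A quick back-of-the-envelope calculation shows the impact of that doubling. The thread does not state units for the $0.15 and $0.30 figures, so the snippet below assumes USD per million thinking-output tokens purely for illustration, with a hypothetical monthly volume.

```python
old_rate = 0.15  # reported rate before general availability
new_rate = 0.30  # reported rate after general availability
# Assumption: rates are USD per million thinking-output tokens
# (units are not stated in the source thread).
tokens = 10_000_000  # hypothetical monthly thinking-output volume

old_cost = tokens / 1_000_000 * old_rate
new_cost = tokens / 1_000_000 * new_rate
print(f"before GA: ${old_cost:.2f}/mo, after GA: ${new_cost:.2f}/mo")
```

At any volume, the ratio is the same: thinking-output spend exactly doubles.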
Representative discussion threads include:

- "Google is now officially calling 'Gemini 2.5 Flash image preview', 'Nano Banana'"
- "Google doubled the price of Gemini 2.5 Flash thinking output after GA from 0.15 to 0.30 what"
- "Google's new Gemini 2.5 Flash Image model can do some very impressive high-level image edits"
Start building with Gemini 2.5 Flash
No API keys required. Create AI-powered workflows with Gemini 2.5 Flash in minutes — free.