Model Overview
Gemini 2.5 Flash is Google's best model in terms of price-performance, offering well-rounded capabilities with adaptive thinking.
Key Features
- High intelligence (3/4 dots rating)
- Fast speed (4/5 lightning bolts rating)
- 1,048,576-token context window
- 65,536 max output tokens
- January 2025 knowledge cutoff
- Audio, images, video, and text input support
- Text output support
Technical Specifications
- Model Code: gemini-2.5-flash-preview-05-20
- Supports: Input: audio, images, video, text; Output: text only
- Features: Caching, code execution, function calling, search grounding, structured outputs, thinking
- Pricing:
  - Input: $0.15 per 1M tokens (text/image/video), $1.00 per 1M tokens (audio)
  - Output: $0.60 per 1M tokens (non-thinking), $3.50 per 1M tokens (thinking)
  - Context caching: $0.0375 per 1M tokens (text/image/video), $0.25 per 1M tokens (audio), plus $1.00 per 1M tokens per hour of storage
  - TTS: $0.50 per 1M input tokens, $10.00 per 1M output tokens
  - Free Tier: Available
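To make the pricing list concrete, here is a minimal cost estimator built only from the per-1M-token rates listed above. The function name `estimate_cost` and its parameters are hypothetical, and the sketch assumes that every output token of a thinking-enabled request is billed at the thinking rate; check the official pricing page before relying on it.

```python
# USD per 1M tokens, copied from the pricing list above.
INPUT_PRICE = {"text": 0.15, "image": 0.15, "video": 0.15, "audio": 1.00}
OUTPUT_PRICE = {"non_thinking": 0.60, "thinking": 3.50}

TOKENS_PER_UNIT = 1_000_000  # prices are quoted per one million tokens


def estimate_cost(input_tokens, output_tokens, modality="text", thinking=False):
    """Estimate the USD cost of a single request (hypothetical helper).

    Assumption: with thinking enabled, all output tokens are charged
    at the thinking rate. Caching and TTS charges are not modeled.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / TOKENS_PER_UNIT * INPUT_PRICE[modality]
    output_rate = OUTPUT_PRICE["thinking" if thinking else "non_thinking"]
    output_cost = output_tokens / TOKENS_PER_UNIT * output_rate
    return input_cost + output_cost


if __name__ == "__main__":
    # 200,000 text input tokens and 10,000 output tokens.
    print(f"non-thinking: ${estimate_cost(200_000, 10_000):.3f}")
    print(f"thinking:     ${estimate_cost(200_000, 10_000, thinking=True):.3f}")
```

For that example request the estimate comes to roughly $0.036 without thinking and $0.065 with it, so under this assumption the output rate dominates the difference once thinking is enabled.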
Snapshots
- gemini-2.5-flash-preview-05-20
Positioning and Use Cases
The model decides how much to think based on the task, or it can be configured with a fixed thinking budget. It is best suited to low-latency, high-volume tasks that require thinking, and is optimized for adaptive thinking and cost efficiency across diverse tasks.
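As an illustration of configuring a thinking budget, the sketch below builds a JSON request body without sending it. The helper `build_request` is hypothetical, and the field names (`generationConfig`, `thinkingConfig`, `thinkingBudget`, `maxOutputTokens`) reflect one reading of the public Gemini REST API; treat them as assumptions and confirm them against the official documentation.

```python
import json

MODEL_CODE = "gemini-2.5-flash-preview-05-20"  # model code from the spec list
MAX_OUTPUT_TOKENS = 65_536                     # output limit from Key Features


def build_request(prompt, thinking_budget=None, max_output_tokens=1024):
    """Build a request body as a dict (hypothetical helper, nothing is sent).

    thinking_budget=None omits the thinking config entirely, leaving the
    model to think as needed. Field names are assumed, not verified here.
    """
    if not 0 < max_output_tokens <= MAX_OUTPUT_TOKENS:
        raise ValueError("max_output_tokens must be in 1..65536")
    generation_config = {"maxOutputTokens": max_output_tokens}
    if thinking_budget is not None:
        if thinking_budget < 0:
            raise ValueError("thinking_budget must be non-negative")
        generation_config["thinkingConfig"] = {"thinkingBudget": thinking_budget}
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": generation_config,
    }


if __name__ == "__main__":
    body = build_request("Summarize this log file.", thinking_budget=512)
    print(MODEL_CODE)
    print(json.dumps(body, indent=2))
```

Keeping the budget optional mirrors the two modes described above: leave it unset for adaptive thinking, or set it to bound latency and thinking-token cost on high-volume workloads.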
Rate Limits
- Rate limits are more restrictive because this is an experimental/preview model
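Because preview rate limits are tighter, client code usually needs to retry throttled requests. The following is a generic sketch of exponential backoff with jitter, not part of any Google SDK: `RateLimitError` is a hypothetical stand-in for whatever error your client raises when throttled, and the delay values are arbitrary defaults.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error a client raises when throttled."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=30.0,
                      sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponential backoff.

    The wait doubles on each retry (capped at max_delay), plus up to 50%
    random jitter. The last failure is re-raised to the caller.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay + random.uniform(0, delay / 2))


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky():
        # Simulated request that is throttled twice before succeeding.
        calls["n"] += 1
        if calls["n"] < 3:
            raise RateLimitError("throttled")
        return "ok"

    print(call_with_backoff(flaky, base_delay=0.1))
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without real waiting.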