Gemini 2.5 Flash Preview

Model Overview

Gemini 2.5 Flash is Google's best model in terms of price-performance, offering well-rounded capabilities with adaptive thinking.

Key Features

  • High intelligence (3/4 dots rating)
  • Fast speed (4/5 lightning bolts rating)
  • 1,048,576-token context window
  • 65,536 max output tokens
  • January 2025 knowledge cutoff
  • Audio, images, video, and text input support
  • Text output support

Technical Specifications

  • Model Code: gemini-2.5-flash-preview-05-20
  • Supports: Input: audio, images, video, text; Output: text only
  • Features: Caching, code execution, function calling, search grounding, structured outputs, thinking
  • Pricing:
    • Input: $0.15 per 1M tokens (text/image/video), $1.00 per 1M tokens (audio)
    • Output: $0.60 per 1M tokens (non-thinking), $3.50 per 1M tokens (thinking)
    • Context caching: $0.0375 per 1M tokens (text/image/video), $0.25 per 1M tokens (audio); storage: $1.00 per 1M tokens per hour
    • TTS: $0.50 input, $10.00 output per 1M tokens
  • Free Tier: Available
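
As a rough illustration of how the rates above combine, the sketch below estimates the cost of a single request. The helper name `estimate_cost` is hypothetical. It assumes that the thinking rate applies to all output tokens whenever thinking is enabled, and it ignores caching and TTS charges.

```python
# Official per-1M-token rates copied from the pricing list above (USD).
INPUT_RATES = {"text": 0.15, "image": 0.15, "video": 0.15, "audio": 1.00}
OUTPUT_RATES = {"non_thinking": 0.60, "thinking": 3.50}


def estimate_cost(input_tokens, output_tokens, modality="text", thinking=False):
    """Estimate one request's cost in USD (hypothetical helper).

    Assumption: with thinking enabled, every output token is billed
    at the thinking rate. Caching and TTS charges are not modeled.
    """
    input_cost = input_tokens / 1_000_000 * INPUT_RATES[modality]
    mode = "thinking" if thinking else "non_thinking"
    output_cost = output_tokens / 1_000_000 * OUTPUT_RATES[mode]
    return input_cost + output_cost


# 200k text input tokens and 10k output tokens, without and with thinking.
print(f"{estimate_cost(200_000, 10_000):.4f}")
print(f"{estimate_cost(200_000, 10_000, thinking=True):.4f}")
```

Under these assumptions, enabling thinking changes only the output side of the bill, so workloads with long outputs are the most sensitive to that setting.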

Snapshots

  • gemini-2.5-flash-preview-05-20

Positioning and Use Cases

The model thinks as needed, or it can be configured with a fixed thinking budget. It is best suited to low-latency, high-volume tasks that require thinking, and it is optimized for adaptive thinking and cost efficiency across diverse tasks.
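
A thinking budget is typically set in the request's generation config. The sketch below only builds a request body and does not send it. The field names (`generationConfig`, `thinkingConfig`, `thinkingBudget`) reflect the Gemini REST API as commonly documented, and should be treated as assumptions to verify against the official reference. `build_request` is a hypothetical helper.

```python
import json

MODEL = "gemini-2.5-flash-preview-05-20"  # model code from this page


def build_request(prompt, thinking_budget=None):
    """Build a generateContent-style request body (hypothetical helper).

    thinking_budget=None leaves the budget unset, so the model decides
    how much to think. An integer caps the thinking tokens instead.
    Field names are assumptions; check the official API reference.
    """
    body = {"contents": [{"parts": [{"text": prompt}]}]}
    if thinking_budget is not None:
        body["generationConfig"] = {
            "thinkingConfig": {"thinkingBudget": thinking_budget}
        }
    return body


# Adaptive thinking (no budget) versus a capped budget of 1024 tokens.
print(json.dumps(build_request("Summarize this log."), indent=2))
print(json.dumps(build_request("Summarize this log.", thinking_budget=1024), indent=2))
```

A smaller budget trades reasoning depth for lower latency and cost, which fits the high-volume use case described above.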

Rate Limits

  • Rate limits are more restrictive because this is an experimental/preview model
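
With tighter limits, clients commonly retry rate-limited calls using exponential backoff. The following is a generic sketch, not part of any official SDK: `RateLimitError` is a hypothetical stand-in for whatever exception your client raises on an HTTP 429 response, and `call_with_backoff` is an illustrative helper.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a client's HTTP 429 exception."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponential backoff.

    The delay doubles on each retry and is scaled by a random factor
    between 0.5 and 1.0 so that parallel clients do not retry in lockstep.
    The final failure is re-raised to the caller.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise
            delay = base_delay * (2 ** attempt) * random.uniform(0.5, 1.0)
            sleep(delay)
```

The `sleep` parameter is injectable so the retry logic can be exercised in tests without waiting.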

Price Comparison (input • output, per 1M tokens, non-thinking)

  • Official: $0.15 • $0.60
  • Our Price: $0.12 • $0.48 (save 20%)

Frequently Asked Questions

What is the uptime guarantee?
We guarantee 99.9% uptime with our enterprise-grade infrastructure and redundant systems.
How is pricing calculated?
Pricing is based on the number of tokens processed. Both input and output tokens are counted in the final cost.