Gemini 2.0 Flash

Model Overview

Gemini 2.0 Flash is Google's newest multimodal model with next-generation features and improved capabilities, built to power agentic experiences.

Key Features

  • High intelligence (rated 3/4)
  • Very fast (rated 5/5 for speed)
  • 1,048,576-token context window
  • 8,192 maximum output tokens
  • August 2024 knowledge cutoff
  • Audio, images, video, and text input support
  • Text output support

Technical Specifications

  • Model Code: gemini-2.0-flash (a minimal usage sketch follows the Snapshots list below)
  • Inputs: audio, images, video, and text; Outputs: text only
  • Features: Structured outputs, caching, function calling, code execution, search, Live API, thinking (experimental)
  • Pricing (a worked cost estimate follows this list):
    • Input: $0.10 per 1M tokens (text/image/video), $0.70 per 1M tokens (audio)
    • Output: $0.40 per 1M tokens
    • Context caching: $0.025 per 1M tokens (text/image/video), $0.175 per 1M tokens (audio); storage: $1.00 per 1M tokens per hour
    • Image generation: $0.039 per image
    • Live API: Input $0.35 (text), $2.10 (audio/image/video); Output $1.50 (text), $8.50 (audio)
  • Free Tier: Available
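
As a rough illustration of how the token-based pricing adds up, here is a minimal sketch that estimates the cost of a single text request at the rates listed above; estimate_cost is a hypothetical helper, not part of any Google SDK.

```python
# Hypothetical helper (not part of any SDK): estimate the cost of one
# gemini-2.0-flash text request from the published per-1M-token rates.
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_rate: float = 0.10, output_rate: float = 0.40) -> float:
    return (input_tokens / 1_000_000) * input_rate + (output_tokens / 1_000_000) * output_rate

# Example: 50,000 text input tokens and 2,000 output tokens:
# 0.05 * $0.10 + 0.002 * $0.40 = $0.0050 + $0.0008 = $0.0058
print(f"${estimate_cost(50_000, 2_000):.4f}")  # -> $0.0058
```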

Snapshots

  • gemini-2.0-flash (latest)
  • gemini-2.0-flash-001 (stable)
  • gemini-2.0-flash-exp (experimental)
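
The model code and snapshot names above are what you pass as the model identifier when calling the API. Below is a minimal usage sketch, assuming the google-genai Python SDK; package and parameter names may differ in other SDK versions, and the API key is a placeholder.

```python
from google import genai

# Minimal sketch, assuming the google-genai Python SDK.
client = genai.Client(api_key="YOUR_API_KEY")  # placeholder key

response = client.models.generate_content(
    model="gemini-2.0-flash-001",  # pinned stable snapshot; "gemini-2.0-flash" tracks the latest
    contents="Summarize context caching in two sentences.",
)
print(response.text)
```

Pinning the -001 snapshot keeps behavior stable across releases, while the bare gemini-2.0-flash alias follows whatever Google currently serves as the latest version.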

Positioning and Use Cases

Gemini 2.0 Flash can generate code and images, extract data, analyze files, generate graphs, and more. It offers low latency, strong performance, native tool use, and a 1M-token context window, making it well suited to agentic experiences and real-time applications.
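
To give a concrete sense of the native tool use behind these agentic use cases, here is a hedged sketch of function calling, again assuming the google-genai Python SDK; get_weather is a made-up example tool, and automatic function calling behavior may vary by SDK version.

```python
from google import genai
from google.genai import types

def get_weather(city: str) -> str:
    """Hypothetical example tool: return a canned weather report for a city."""
    return f"It is sunny and 22 C in {city}."

# Placeholder key; the tools= parameter is sketched per the google-genai SDK
# and may differ in your version.
client = genai.Client(api_key="YOUR_API_KEY")

response = client.models.generate_content(
    model="gemini-2.0-flash",
    contents="Should I bring an umbrella in Paris today?",
    config=types.GenerateContentConfig(tools=[get_weather]),
)
print(response.text)
```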

Rate Limits

  • Standard rate limits apply

Documentation

Official documentation for Gemini 2.0 Flash is available from Google.
