Model Overview
Gemini 2.0 Flash Live enables low-latency bidirectional voice and video interactions with Gemini through the Live API.
Key Features
- High intelligence (3/4 dots rating)
- Very fast speed (5/5 lightning bolts rating)
- 1,048,576-token context window
- 8,192 max output tokens
- August 2024 knowledge cutoff
- Audio, video, and text input support
- Text and audio output support
Technical Specifications
- Model Code: gemini-2.0-flash-live-001
- Supported modalities: input (audio, video, text); output (text, audio)
- Features: Structured outputs, function calling, code execution, search, audio generation
- Pricing (USD):
  - Input: $0.10 per 1M tokens (text/image/video), $0.70 per 1M tokens (audio)
  - Output: $0.40 per 1M tokens
  - Context caching: $0.025 per 1M tokens (text/image/video), $0.175 per 1M tokens (audio), $1.00 per 1M tokens per hour of storage
  - Image generation: $0.039 per image
  - Live API, per 1M tokens: input $0.35 (text), $2.10 (audio/image/video); output $1.50 (text), $8.50 (audio)
- Free Tier: Available
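To make the Live API rates above concrete, here is a minimal Python sketch of a session cost estimate. It assumes the Live API figures are USD per 1M tokens and that token counts per modality are already known (for example, from usage metadata). The names `LIVE_RATES` and `estimate_live_cost` are illustrative only and are not part of any SDK.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the Live API line in the pricing list.
# Assumption: the Live API figures use the same per-1M-token unit as the
# other rates in the list.
LIVE_RATES = {
    ("input", "text"): Decimal("0.35"),
    ("input", "audio"): Decimal("2.10"),
    ("input", "image"): Decimal("2.10"),
    ("input", "video"): Decimal("2.10"),
    ("output", "text"): Decimal("1.50"),
    ("output", "audio"): Decimal("8.50"),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_live_cost(usage):
    """Return the estimated USD cost for a Live API session.

    `usage` maps (direction, modality) pairs, such as ("input", "audio"),
    to token counts. Pairs with no listed rate raise KeyError.
    """
    total = Decimal("0")
    for key, tokens in usage.items():
        if key not in LIVE_RATES:
            raise KeyError(f"no Live API rate listed for {key}")
        total += LIVE_RATES[key] * Decimal(tokens) / TOKENS_PER_UNIT
    return total


if __name__ == "__main__":
    # Hypothetical session: 1M audio tokens in, 500k audio tokens out.
    session_usage = {("input", "audio"): 1_000_000, ("output", "audio"): 500_000}
    print(estimate_live_cost(session_usage))
```

`Decimal` is used instead of floats so that small per-token charges add up without rounding drift. The example session is made up and only illustrates the arithmetic.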
Snapshots
- gemini-2.0-flash-live-001
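The snapshot identifier is what a client passes when opening a Live API session. The sketch below builds the first message such a session might send. The field names (`setup`, `model`, `generationConfig`, `responseModalities`) and the `models/` prefix are assumptions based on the Live API's WebSocket setup message, so check them against the official reference before use. `build_setup_message` is a hypothetical helper, not an SDK function.

```python
import json

MODEL_CODE = "gemini-2.0-flash-live-001"

# Output modalities listed for this model: text and audio.
SUPPORTED_OUTPUTS = {"TEXT", "AUDIO"}


def build_setup_message(response_modality="TEXT"):
    """Build a session setup payload as a plain dict.

    Assumption: the field names mirror the Live API's WebSocket setup
    message; they are not verified against the current reference.
    """
    modality = response_modality.upper()
    if modality not in SUPPORTED_OUTPUTS:
        raise ValueError(f"unsupported output modality: {response_modality}")
    return {
        "setup": {
            "model": f"models/{MODEL_CODE}",
            "generationConfig": {"responseModalities": [modality]},
        }
    }


if __name__ == "__main__":
    # Serialize the payload as it would be sent over the socket.
    print(json.dumps(build_setup_message("audio"), indent=2))
```

Checking the modality up front lets a client reject an unsupported output type before any network traffic occurs.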
Positioning and Use Cases
Gemini 2.0 Flash Live is designed specifically for real-time voice and video interactions. It suits live conversational AI, virtual assistants, real-time customer support, and other interactive applications that need immediate audio or video processing.
Rate Limits
- Preview model rate limits apply
Documentation
Official Documentation