Gemini 2.5 Flash Native Audio

Model Overview

Gemini 2.5 Flash Native Audio is built for interactive, unstructured conversation and produces high-quality, natural-sounding audio output. It is available with or without thinking capabilities.

Key Features

  • High intelligence (rated 3 out of 4)
  • Fast speed (rated 4 out of 5)
  • 128,000-token context window
  • Up to 8,000 output tokens
  • January 2025 knowledge cutoff
  • Audio, video, and text input support
  • Audio and text output support (interleaved)
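The two token limits above can be checked before a request is sent. The sketch below is illustrative only: `plan_request` is a hypothetical helper, and it assumes the 128,000-token context window and the 8,000-token output cap are enforced independently, which this page does not confirm.

```python
# Limits copied from the Key Features list.
CONTEXT_WINDOW = 128_000
MAX_OUTPUT_TOKENS = 8_000


def plan_request(input_tokens: int, requested_output: int) -> int:
    """Return the output-token budget to request, or raise if the input is too large.

    Assumption: the two limits are checked separately (hypothetical helper).
    """
    if input_tokens < 0 or requested_output < 0:
        raise ValueError("token counts must be non-negative")
    if input_tokens > CONTEXT_WINDOW:
        raise ValueError(
            f"input of {input_tokens} tokens exceeds the {CONTEXT_WINDOW}-token context window"
        )
    # Clamp the requested output to the documented maximum.
    return min(requested_output, MAX_OUTPUT_TOKENS)


budget = plan_request(input_tokens=100_000, requested_output=10_000)
print(budget)
```

Clamping instead of rejecting an oversized output request is a design choice for this sketch; a stricter client could raise in both cases.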

Technical Specifications

  • Model Code: gemini-2.5-flash-preview-native-audio-dialog & gemini-2.5-flash-exp-native-audio-thinking-dialog
  • Supported modalities: audio, video, and text input; audio and text output
  • Features: Audio generation, function calling, search grounding, thinking, style and control prompting
  • Pricing:
    • Input: $0.50 per 1M tokens (text), $3.00 per 1M tokens (audio/video)
    • Output: $2.00 per 1M tokens (text), $12.00 per 1M tokens (audio)
  • Free Tier: Not available
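To make the per-modality rates concrete, here is a minimal cost estimate. It assumes billing is strictly linear in token count, with no minimum charge or rounding rules. The function name `estimate_cost` and the token counts in the example session are hypothetical.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the pricing list above.
INPUT_RATES = {"text": Decimal("0.50"), "audio_video": Decimal("3.00")}
OUTPUT_RATES = {"text": Decimal("2.00"), "audio": Decimal("12.00")}
PER_MILLION = Decimal(1_000_000)


def estimate_cost(input_tokens: dict[str, int], output_tokens: dict[str, int]) -> Decimal:
    """Estimate the USD cost for token counts keyed by modality (linear billing assumed)."""
    cost = Decimal(0)
    for modality, count in input_tokens.items():
        cost += INPUT_RATES[modality] * count / PER_MILLION
    for modality, count in output_tokens.items():
        cost += OUTPUT_RATES[modality] * count / PER_MILLION
    return cost


# Hypothetical voice session: 200,000 audio tokens in, 50,000 audio tokens out.
session_cost = estimate_cost({"audio_video": 200_000}, {"audio": 50_000})
print(f"${session_cost:.2f}")
```

`Decimal` is used instead of `float` so that currency amounts do not pick up binary rounding noise.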

Snapshots

  • gemini-2.5-flash-preview-native-audio-dialog (preview)
  • gemini-2.5-flash-exp-native-audio-thinking-dialog (experimental)

Positioning and Use Cases

Available through the Live API for low-latency bidirectional voice interactions. Ideal for conversational AI applications, voice assistants, and interactive audio experiences with natural speech generation.

Rate Limits

  • Rate limits are more restrictive because the model is offered only as preview and experimental snapshots

Pricing Comparison

  • Official: $0.50 input, $2.00 output (text, per 1M tokens)
  • Our price: $0.40 input, $1.60 output (text, per 1M tokens), a 20% saving
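The discounted figures can be reproduced from the official text rates. This is a small check, assuming the first price in each pair is the input rate and the second is the output rate. The helper `discounted` is hypothetical.

```python
from decimal import Decimal


def discounted(official: Decimal, discount_pct: Decimal) -> Decimal:
    """Apply a percentage discount and round to whole cents."""
    price = official * (Decimal(100) - discount_pct) / Decimal(100)
    return price.quantize(Decimal("0.01"))


# Official text rates per 1M tokens, with the advertised 20% discount.
input_price = discounted(Decimal("0.50"), Decimal(20))
output_price = discounted(Decimal("2.00"), Decimal(20))
print(input_price, output_price)
```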

Frequently Asked Questions

What is the uptime guarantee?
We guarantee 99.9% uptime, backed by enterprise-grade infrastructure and redundant systems.

How is pricing calculated?
Pricing is based on the number of tokens processed. Input and output tokens are both counted toward the final cost.
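For a sense of scale, the 99.9% uptime figure quoted in the FAQ can be converted into a downtime allowance. The sketch assumes a 30-day measurement window, which this page does not specify. The function name is hypothetical.

```python
from decimal import Decimal


def allowed_downtime_minutes(uptime_pct: Decimal, days: int = 30) -> Decimal:
    """Minutes of downtime permitted in a window of `days` at the given uptime percentage."""
    total_minutes = Decimal(days * 24 * 60)
    return total_minutes * (Decimal(100) - uptime_pct) / Decimal(100)


# Assumed 30-day window; the actual measurement period is not stated.
monthly_allowance = allowed_downtime_minutes(Decimal("99.9"))
print(monthly_allowance)
```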