Gemini 1.5 Flash

Model Overview

Gemini 1.5 Flash is a fast and versatile multimodal model for scaling across diverse tasks with excellent performance.

Key Features

  • High intelligence (3/4 dots rating)
  • Fast speed (4/5 lightning bolts rating)
  • 1,048,576-token context window
  • 8,192 max output tokens
  • Knowledge cutoff not specified
  • Audio, images, video, and text input support
  • Text output support

Technical Specifications

  • Model Code: gemini-1.5-flash
  • Supports: Input: audio, images, video, text; Output: text only
  • Features: System instructions, JSON mode, JSON schema, adjustable safety settings, caching, tuning, function calling, code execution
  • Audio/Visual Specs: up to 3,600 images per prompt, up to 1 hour of video, about 9.5 hours of audio
  • Pricing:
    • Input: $0.075 per 1M tokens (≤128k prompts), $0.15 per 1M tokens (>128k prompts)
    • Output: $0.30 per 1M tokens (≤128k prompts), $0.60 per 1M tokens (>128k prompts)
    • Context caching: $0.01875 per 1M tokens (≤128k), $0.0375 per 1M tokens (>128k), $1.00 per hour storage
  • Free Tier: Available
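
The tiered rates above can be turned into a quick cost estimate. The sketch below is a minimal illustration, not an official calculator: the names `estimate_cost` and `RATES` are hypothetical, "128k" is assumed to mean 128,000 tokens, and the tier is assumed to be selected by prompt (input) length. Context caching charges are left out.

```python
# Per-1M-token rates in USD, copied from the pricing list above.
# Assumption: "128k" means 128,000 tokens, and the tier is chosen by
# the size of the prompt (input), not by the output.
TIER_THRESHOLD_TOKENS = 128_000

RATES = {
    "short": {"input": 0.075, "output": 0.30},  # prompts <= 128k tokens
    "long": {"input": 0.15, "output": 0.60},    # prompts > 128k tokens
}


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return an estimated request cost in USD (caching not included)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    tier = "short" if input_tokens <= TIER_THRESHOLD_TOKENS else "long"
    rate = RATES[tier]
    total = input_tokens * rate["input"] + output_tokens * rate["output"]
    return total / 1_000_000


if __name__ == "__main__":
    print(f"{estimate_cost(100_000, 2_000):.6f}")
    print(f"{estimate_cost(200_000, 8_192):.6f}")
```

Under these assumptions, a 100,000-token prompt with a 2,000-token reply comes to roughly $0.0081, while a 200,000-token prompt falls into the higher tier.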

Snapshots

  • gemini-1.5-flash (latest stable)
  • gemini-1.5-flash-latest
  • gemini-1.5-flash-001 (stable)
  • gemini-1.5-flash-002 (stable)

Positioning and Use Cases

Gemini 1.5 Flash delivers fast, versatile performance across a wide variety of tasks. It offers a balance of speed, capability, and cost suited to most general-purpose applications.

Rate Limits

  • Standard rate limits apply

Documentation

Official Documentation

Google

Next-generation AI models backed by powerful technical expertise

Gemini 1.5 Flash

Parameters: Unknown
Output tokens: 8,192

Official price: $0.075 input • $0.30 output (per 1M tokens). Our price: $0.06 input • $0.24 output (per 1M tokens). Save 20%.
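
The stated discount can be checked with simple arithmetic. This is a minimal sketch, assuming the two figures in each pair are the input and output rates per 1M tokens for prompts of up to 128k tokens; the helper name `discount_percent` is hypothetical.

```python
def discount_percent(official: float, discounted: float) -> float:
    """Return the percentage saved relative to the official price."""
    if official <= 0:
        raise ValueError("official price must be positive")
    return (official - discounted) / official * 100


# Prices in USD per 1M tokens, taken from the line above.
input_saving = discount_percent(0.075, 0.06)
output_saving = discount_percent(0.30, 0.24)

if __name__ == "__main__":
    print(f"input: {input_saving:.1f}%, output: {output_saving:.1f}%")
```

Both pairs work out to a saving of about 20%, consistent with the advertised figure.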

Frequently Asked Questions

What is the uptime guarantee?
We guarantee 99.9% uptime with our enterprise-grade infrastructure and redundant systems.
How is pricing calculated?
Pricing is based on the number of tokens processed. Both input and output tokens are counted in the final cost.