Model Overview
Gemini 1.5 Pro is a mid-size multimodal model optimized for complex reasoning tasks, with exceptional long-context capabilities.
Key Features
- Very high intelligence (4/4 dots rating)
- Medium speed (3/5 lightning bolts rating)
- 2,097,152-token context window
- 8,192 max output tokens
- Knowledge cutoff not specified
- Audio, images, video, and text input support
- Text output support
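The two token limits listed above can be checked before a request is sent. The following is a minimal sketch, not an official client: the function name `fits_limits` is hypothetical, and it assumes the context window and the output cap are separate limits, which this overview does not state explicitly.

```python
# Limits taken from the Key Features list above.
CONTEXT_WINDOW_TOKENS = 2_097_152
MAX_OUTPUT_TOKENS = 8_192


def fits_limits(prompt_tokens: int, requested_output_tokens: int) -> bool:
    """Return True if a request stays within both listed limits.

    Assumption: the prompt is checked against the context window and the
    requested output against the output cap, independently of each other.
    """
    if prompt_tokens < 0 or requested_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    return (
        prompt_tokens <= CONTEXT_WINDOW_TOKENS
        and requested_output_tokens <= MAX_OUTPUT_TOKENS
    )


if __name__ == "__main__":
    print(fits_limits(1_500_000, 4_096))
    print(fits_limits(2_500_000, 4_096))
```

Token counts would normally come from the provider's own token-counting facility; here they are passed in as plain integers to keep the sketch self-contained.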
Technical Specifications
- Model Code: gemini-1.5-pro
- Supports: Input: audio, images, video, text; Output: text only
- Features: System instructions, JSON mode, JSON schema, adjustable safety settings, caching, function calling, code execution
- Audio/Visual Specs: up to 7,200 images per prompt, 2 hours of video, ~19 hours of audio
- Pricing:
  - Input: $1.25 per 1M tokens (≤128k prompts), $2.50 per 1M tokens (>128k prompts)
  - Output: $5.00 per 1M tokens (≤128k prompts), $10.00 per 1M tokens (>128k prompts)
  - Context caching: $0.3125 per 1M tokens (≤128k), $0.625 per 1M tokens (>128k), $4.50 per hour storage
- Free Tier: Available
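The tiered input and output prices listed above can be turned into a rough per-request cost estimate. This is a sketch under stated assumptions: the function name `estimate_cost_usd` is hypothetical, "128k" is taken to mean 128,000 tokens, and the whole request is assumed to be billed at the rate of the tier its prompt length falls into. Context caching and storage charges are left out.

```python
# Assumption: "128k" means 128,000 tokens; the source does not define it.
TIER_THRESHOLD_TOKENS = 128_000

# USD per 1M tokens, from the pricing list above.
RATES = {
    "short": {"input": 1.25, "output": 5.00},   # prompts of at most 128k tokens
    "long": {"input": 2.50, "output": 10.00},   # prompts above 128k tokens
}


def estimate_cost_usd(prompt_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request, excluding context caching.

    Assumption: the tier is chosen by prompt length and applies to all
    input and output tokens of the request.
    """
    tier = "short" if prompt_tokens <= TIER_THRESHOLD_TOKENS else "long"
    rates = RATES[tier]
    total = prompt_tokens * rates["input"] + output_tokens * rates["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # 100,000 prompt tokens and 2,000 output tokens at the lower tier.
    print(f"${estimate_cost_usd(100_000, 2_000):.4f}")
    # 1,000,000 prompt tokens and 8,192 output tokens at the higher tier.
    print(f"${estimate_cost_usd(1_000_000, 8_192):.4f}")
```

For example, a 100,000-token prompt with 2,000 output tokens comes to (100,000 × 1.25 + 2,000 × 5.00) / 1,000,000 = $0.135 under these assumptions.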
Snapshots
- gemini-1.5-pro (latest stable)
- gemini-1.5-pro-latest
- gemini-1.5-pro-001 (stable)
- gemini-1.5-pro-002 (stable)
Positioning and Use Cases
Gemini 1.5 Pro can process large amounts of data in a single prompt, including 2 hours of video, about 19 hours of audio, codebases with 60,000 lines of code, or 2,000 pages of text. It is best suited to complex reasoning tasks.
Rate Limits
- Standard rate limits apply
Documentation
Official Documentation