Gemini 1.5 Flash-8B

Model Overview

Gemini 1.5 Flash-8B is a smaller model designed for high-volume, lower-intelligence tasks where cost efficiency matters.

Model Code: gemini-1.5-flash-8b
Supports: Input: audio, images, video, text; Output: text only
Features: System instructions, JSON mode, JSON schema, adjustable safety settings, caching, tuning, function calling, code execution
Audio/Visual Specs: Max 3,600 images per prompt, 1 hour of video, ~9.5 hours of audio
Pricing:
- Input: $0.0375 per 1M tokens (≤128k prompts), $0.075 per 1M tokens (>128k prompts)
- Output: $0.15 per 1M tokens (≤128k prompts), $0.30 per 1M tokens (>128k prompts)
- Context caching: $0.01 per 1M tokens (≤128k), $0.02 per 1M tokens (>128k); storage: $0.25 per 1M tokens per hour
Free Tier: Available
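
The tiered rates above can be turned into a per-request estimate. The following is a minimal sketch, not an official calculator: `estimate_cost` is a hypothetical helper, and it assumes that "128k" means 128,000 tokens and that the prompt size alone selects the tier for both input and output rates. Context caching charges are left out.

```python
from decimal import Decimal

# Per-1M-token rates in USD, copied from the pricing list above.
# Each pair is (rate for prompts <= 128k tokens, rate for prompts > 128k tokens).
INPUT_RATES = (Decimal("0.0375"), Decimal("0.075"))
OUTPUT_RATES = (Decimal("0.15"), Decimal("0.30"))

# Assumption: "128k" means 128,000 tokens and the tier is chosen by prompt size.
TIER_THRESHOLD = 128_000
TOKENS_PER_UNIT = Decimal(1_000_000)


def estimate_cost(prompt_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the USD cost of one request (hypothetical helper)."""
    tier = 0 if prompt_tokens <= TIER_THRESHOLD else 1
    input_cost = INPUT_RATES[tier] * prompt_tokens / TOKENS_PER_UNIT
    output_cost = OUTPUT_RATES[tier] * output_tokens / TOKENS_PER_UNIT
    return input_cost + output_cost
```

Under these assumptions, a 100,000-token prompt with a 2,000-token reply comes to $0.00405, while a 200,000-token prompt with the maximum 8,192 output tokens falls into the higher tier and comes to about $0.0175.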

Optimized for high-volume, lower-intelligence tasks. It is the most cost-effective option for simple tasks that don't require advanced reasoning.

Parameters: Unknown

Output tokens: 8,192

Official price: $0.0375 input • $0.15 output (per 1M tokens)
Our price: $0.03 input • $0.12 output (per 1M tokens), a 20% saving
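
The quoted 20% saving can be checked against the listed figures. This is a small sketch, assuming the first number in each pair is the input price and the second is the output price, both per 1M tokens for prompts up to 128k. `discount_percent` is a hypothetical helper name.

```python
from decimal import Decimal

# Per-1M-token prices in USD, copied from the price comparison above.
OFFICIAL = {"input": Decimal("0.0375"), "output": Decimal("0.15")}
RESELLER = {"input": Decimal("0.03"), "output": Decimal("0.12")}


def discount_percent(official: Decimal, discounted: Decimal) -> Decimal:
    """Return the saving as a percentage of the official price (hypothetical helper)."""
    return (official - discounted) / official * 100


# Saving for each price kind, keyed by "input" and "output".
savings = {kind: discount_percent(OFFICIAL[kind], RESELLER[kind]) for kind in OFFICIAL}
```

Both entries in `savings` come out to 20, which matches the advertised discount on input and output tokens alike.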

Frequently Asked Questions

What is the uptime guarantee?

We guarantee 99.9% uptime with our enterprise-grade infrastructure and redundant systems.
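
To make the 99.9% figure concrete, it can be converted into the downtime it still permits. The sketch below is illustrative only: the page does not state the measurement period, so the 30-day month and 365-day year are assumptions, and `allowed_downtime_minutes` is a hypothetical helper.

```python
def allowed_downtime_minutes(uptime_percent: float, period_days: float) -> float:
    """Minutes of downtime an uptime target still permits over a period (hypothetical helper)."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (100.0 - uptime_percent) / 100.0


monthly = allowed_downtime_minutes(99.9, 30)   # assumed 30-day month
yearly = allowed_downtime_minutes(99.9, 365)   # assumed 365-day year
```

Under these assumptions, 99.9% uptime leaves roughly 43 minutes of downtime per 30-day month, or about 8.8 hours per year.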

How is pricing calculated?

Pricing is based on the number of tokens processed. Both input and output tokens are counted in the final cost.

What is the difference between GPT-4 and GPT-4 Turbo?

GPT-4 Turbo is a newer version of GPT-4 with improved performance, a longer context window, and a more recent knowledge cutoff date.