Model Overview
GPT-4o mini TTS is a text-to-speech model built on GPT-4o mini, a fast and powerful language model. Use it to convert text into natural-sounding speech. Input is limited to 2,000 tokens.
Key Features
- Performance: higher (rated 4 of 4)
- Speed: fast (rated 4 of 5)
- Text-to-speech model powered by GPT-4o mini
- Accepts text input and produces audio output
- Maximum input token limit: 2000 tokens
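Text longer than the 2,000-token input limit has to be split before it is sent. The following is a minimal sketch of one way to do that, not the model's own tokenization: `approx_token_count` and `split_for_tts` are hypothetical helpers, and counting whitespace-separated words is only a rough stand-in for a real tokenizer, so real token counts will differ.

```python
from typing import Callable, List

MAX_INPUT_TOKENS = 2000  # input limit stated in this document


def approx_token_count(text: str) -> int:
    """Rough stand-in for a real tokenizer: counts whitespace-separated words.

    Assumption for illustration only; actual token counts will differ.
    """
    return len(text.split())


def split_for_tts(
    text: str,
    max_tokens: int = MAX_INPUT_TOKENS,
    count_tokens: Callable[[str], int] = approx_token_count,
) -> List[str]:
    """Greedily pack sentences into chunks that stay within max_tokens."""
    # Naive sentence split on ". "; adequate for a sketch only.
    sentences = [s.strip() for s in text.replace("\n", " ").split(". ") if s.strip()]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current}. {sentence}" if current else sentence
        if count_tokens(candidate) <= max_tokens:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single sentence longer than max_tokens is kept whole here;
            # a real implementation would need to split it further.
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Passing a real tokenizer as `count_tokens` and leaving some headroom below the limit would make the split more reliable than the word-count approximation.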
Technical Specifications
- Pricing: $0.60 per 1M input tokens, $12.00 per 1M output tokens
- Modalities: text input only, audio output only
- Endpoint: speech generation via the v1/audio/speech endpoint
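The per-token prices above translate into a simple cost estimate. The sketch below only does the arithmetic; `estimate_cost_usd` is a hypothetical helper, and how many output (audio) tokens a given input produces is not stated here, so that value must be supplied by the caller.

```python
INPUT_USD_PER_MILLION = 0.60    # price per 1M input tokens, from this document
OUTPUT_USD_PER_MILLION = 12.00  # price per 1M output tokens, from this document


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token counts."""
    total = (
        input_tokens * INPUT_USD_PER_MILLION
        + output_tokens * OUTPUT_USD_PER_MILLION
    )
    return total / 1_000_000
```

For example, a maximum-length input of 2,000 tokens costs about $0.0012 in input tokens alone, before any output tokens are counted.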
Positioning and Use Cases
Because it is powered by GPT-4o mini, this model is positioned for converting text into natural-sounding speech with high performance and fast speed.
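A request to the v1/audio/speech endpoint might look like the following sketch, which uses only the Python standard library. Several details are assumptions that this document does not state: the host `https://api.openai.com`, the model identifier `gpt-4o-mini-tts`, the voice name `alloy`, and the JSON field names `model`, `input`, and `voice`. The helper names `build_speech_request` and `synthesize_to_file` are hypothetical.

```python
import json
import urllib.request

# Assumed base URL; this document only names the v1/audio/speech path.
API_URL = "https://api.openai.com/v1/audio/speech"


def build_speech_request(
    text: str,
    api_key: str,
    model: str = "gpt-4o-mini-tts",  # assumed model identifier
    voice: str = "alloy",            # assumed voice name
) -> urllib.request.Request:
    """Build (but do not send) a POST request for speech generation."""
    payload = {"model": model, "input": text, "voice": voice}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def synthesize_to_file(text: str, api_key: str, out_path: str) -> None:
    """Send the request and write the returned audio bytes to out_path."""
    request = build_speech_request(text, api_key)
    with urllib.request.urlopen(request) as response:
        audio_bytes = response.read()
    with open(out_path, "wb") as audio_file:
        audio_file.write(audio_bytes)
```

Separating request construction from sending keeps the payload easy to inspect without a network call or a valid API key.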
Rate Limits
Limits are given in requests per minute (RPM) and tokens per minute (TPM).
- Free tier: Not supported
- Tier 1: 500 RPM, 50,000 TPM
- Tier 2: 2,000 RPM, 150,000 TPM
- Tier 3: 5,000 RPM, 600,000 TPM
- Tier 4: 10,000 RPM, 2,000,000 TPM
- Tier 5: 10,000 RPM, 8,000,000 TPM
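The tier limits above can be encoded as a small lookup table for client-side planning. This is a sketch under the assumption that both limits apply per minute and independently; `fits_tier` and `min_seconds_between_requests` are hypothetical helpers, not part of any API.

```python
from typing import Dict, Tuple

# tier -> (requests per minute, tokens per minute), copied from the list above.
# The free tier is omitted because it is not supported.
RATE_LIMITS: Dict[int, Tuple[int, int]] = {
    1: (500, 50_000),
    2: (2_000, 150_000),
    3: (5_000, 600_000),
    4: (10_000, 2_000_000),
    5: (10_000, 8_000_000),
}


def fits_tier(tier: int, requests_per_minute: int, tokens_per_minute: int) -> bool:
    """Return True if the planned load stays within both limits of the tier."""
    rpm_limit, tpm_limit = RATE_LIMITS[tier]
    return requests_per_minute <= rpm_limit and tokens_per_minute <= tpm_limit


def min_seconds_between_requests(tier: int) -> float:
    """Spacing between evenly paced requests that stays within the RPM limit."""
    rpm_limit, _ = RATE_LIMITS[tier]
    return 60.0 / rpm_limit
```

Note that the token limit can bind before the request limit: at Tier 1, 500 requests of 2,000 tokens each would total 1,000,000 tokens per minute, far above the 50,000 TPM limit.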
Documentation
Official Documentation