Model Overview
GPT-3.5 Turbo is a legacy GPT model for cheaper chat and non-chat tasks.
Key Features
- Low intelligence (1/4 dots rating)
- Slow speed (2/5 lightning bolts rating)
- 16,385 context window
- 4,096 max output tokens
- Sep 01, 2021 knowledge cutoff
- Text input support
- Text output support
Technical Specifications
- Pricing: $0.50 per 1M tokens (input), $1.50 per 1M tokens (output)
- Supports: Input: text; Output: text only
- Features: Fine-tuning
Snapshots
- gpt-3.5-turbo (alias for gpt-3.5-turbo-0125)
- gpt-3.5-turbo-0125
- gpt-3.5-turbo-1106
- gpt-3.5-turbo-instruct
Positioning and Use Cases
GPT-3.5 Turbo models can understand and generate natural language or code and have been optimized for chat using the Chat Completions API but work well for non-chat tasks as well. As of July 2024, use gpt-4o-mini in place of GPT-3.5 Turbo, as it is cheaper, more capable, multimodal, and just as fast. GPT-3.5 Turbo is still available for use in the API.
Rate Limits
- Free tier: Not supported
- Tier 1: 3,500 RPM, 10,000 RPD, 200,000 TPM, 2,000,000 batch queue limit
- Tier 2: 3,500 RPM, 2,000,000 TPM, 5,000,000 batch queue limit
- Tier 3: 3,500 RPM, 800,000 TPM, 50,000,000 batch queue limit
- Tier 4: 10,000 RPM, 10,000,000 TPM, 1,000,000,000 batch queue limit
- Tier 5: 10,000 RPM, 50,000,000 TPM, 10,000,000,000 batch queue limit
Documentation
Official Documentation