Model Overview
Grok-3-Fast is the high-speed variant of Grok-3, offering identical quality with significantly reduced latency for time-sensitive applications.
Key Features
- Very high intelligence (4/4 dots rating)
- Very fast speed (5/5 lightning bolts rating)
- 131,072 context window
- High max output tokens (estimated 8,192+)
- November 17, 2024 knowledge cutoff
- Text input support
- Text output support
Technical Specifications
- Pricing: $5.00 per 1M tokens (input), $25.00 per 1M tokens (output)
- Supports: Input: text; Output: text only
- Features: Low-latency infrastructure, enterprise optimization, real-time applications
Snapshots
- grok-3-fast (alias for grok-3-fast-latest)
- grok-3-fast-latest
Positioning and Use Cases
Grok-3-Fast uses the exact same underlying model as Grok-3 but is served on faster infrastructure for latency-sensitive applications. It's ideal for real-time chat applications, interactive systems, and any use case where response speed is critical. The increased cost per token is justified by significantly faster response times while maintaining identical quality and capabilities.
Rate Limits
- Information not publicly available
Additional Notes
- Knowledge Cutoff: All Grok-3 family models have a knowledge cutoff of November 17, 2024
- No Internet Access: Unlike grok.com and Grok in X, API models are not connected to the internet
- Flexible Role Order: No role order limitation - you can mix system, user, or assistant roles in any sequence
- Model Aliases: Latest versions are automatically updated through aliases for seamless upgrades
- Fast vs Standard: Fast variants offer identical quality with reduced latency at higher cost
Documentation
Official Documentation