Rate Limit Structure
The API uses a tiered rate limiting system with different limits based on operation type:

Global Rate Limit
- 15,000 requests per minute
- Applied to all requests across all endpoints
- Provides a safety buffer above individual endpoint limits
Read Operations (Safe Operations)
- 10,000 requests per minute
- Applied to GET requests that don’t modify data
- Includes:
  - Polling dub status: `GET /v1/me/dubs/{dubId}`
  - Polling voice status: `GET /v1/me/voices/{voiceId}`
  - Listing voices: `GET /v1/voices`
  - Getting account details: `GET /v1/me`
Write Operations (Unsafe Operations)
- 500 requests per minute
- 20,000 requests per hour
- Applied to POST, PATCH, DELETE requests that create or modify data
- Includes:
  - Creating dubs: `POST /v1/me/dubs`
  - Starting dub processing: `POST /v1/me/dubs/{dubId}/preprocess`
  - Starting dub generation: `POST /v1/me/dubs/{dubId}/generate`
  - Creating custom voices: `POST /v1/me/voices`
  - Starting voice training: `POST /v1/me/voices/{voiceId}/clone`
Rate Limit Headers
When you make requests, the API returns headers with current rate limit information.

Important: Since multiple rate limits may apply to each request (e.g. global and time-window limits), the headers reflect the most restrictive rate limit that applies to your request. This ensures you’re aware of whichever limit you’re closest to exceeding.
Header Descriptions
| Header | Description |
|---|---|
| `X-RateLimit-Limit` | Maximum number of requests allowed in the current window for the most restrictive limit |
| `X-RateLimit-Remaining` | Number of requests remaining in the current window for the most restrictive limit |
| `X-RateLimit-Reset` | Unix timestamp (in seconds) when the most restrictive rate limit window resets |
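As a sketch of how a client might act on these headers, here is a helper that decides how long to pause once the remaining budget hits zero. It assumes a generic HTTP client that exposes response headers as a string-keyed dict; the function name and header values below are our own illustration, not API output:

```python
import time

def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next request, based on the
    most-restrictive-limit headers; 0 if budget remains."""
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset = int(headers.get("X-RateLimit-Reset", "0"))
    return max(0.0, reset - now)

# Illustrative header values (not real API output):
headers = {
    "X-RateLimit-Limit": "10000",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "1700000060",
}
print(seconds_until_reset(headers, now=1700000000))  # 60
```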
Rate Limit Exceeded Response
When you exceed a rate limit, you’ll receive a `429 Too Many Requests` response.
Additional Headers on 429 Response
The `Retry-After` header indicates how many seconds to wait before making another request.
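A small helper for this: honor `Retry-After` when present, otherwise fall back to exponential backoff. This is a sketch; the function and parameter names are our own:

```python
def retry_delay(headers, attempt, base_delay=1.0, max_delay=60.0):
    """Seconds to wait after a 429: prefer the server's Retry-After hint,
    else back off exponentially, capped at max_delay."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return float(retry_after)
    return min(max_delay, base_delay * 2 ** attempt)

print(retry_delay({"Retry-After": "7"}, attempt=0))  # 7.0
print(retry_delay({}, attempt=3))                    # 8.0
```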
Best Practices
1. Implement Exponential Backoff
When you receive a 429 response, back off exponentially between retries.

2. Monitor Rate Limit Headers
Check rate limit headers proactively to avoid hitting limits; remember that the headers show the most restrictive limit.

3. Respect Polling Guidelines
For status polling, follow these guidelines:

- Maximum polling frequency: once every 3 seconds
- Use appropriate delays: Don’t poll continuously
- Implement jitter: Add random delays to avoid thundering herds
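The first three practices can be combined into one retry loop: exponential backoff with jitter on 429s, plus a proactive check of the rate limit headers on success. This is a sketch under stated assumptions; `do_request` stands in for your HTTP call (any object with `status_code` and `headers`, such as a `requests.Response`, works):

```python
import random
import time

def request_with_backoff(do_request, max_retries=5, base_delay=1.0, max_delay=60.0):
    """Retry a request on 429 with exponential backoff plus jitter (sketch)."""
    for attempt in range(max_retries):
        response = do_request()
        if response.status_code != 429:
            # Proactive check: pause until reset when the budget is spent.
            remaining = int(response.headers.get("X-RateLimit-Remaining", "1"))
            if remaining == 0:
                reset = int(response.headers.get("X-RateLimit-Reset", "0"))
                time.sleep(max(0.0, reset - time.time()))
            return response
        # Prefer the server's Retry-After hint when present.
        retry_after = response.headers.get("Retry-After")
        delay = float(retry_after) if retry_after else min(max_delay, base_delay * 2 ** attempt)
        time.sleep(delay + random.uniform(0, 0.5))  # jitter avoids thundering herds
    raise RuntimeError("rate limited after %d retries" % max_retries)
```

A stubbed call that fails once and then succeeds shows the flow without touching the network.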
4. Implement Request Queuing
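One way to do this is a minimal client-side queue that spaces write requests evenly under the 500/minute limit. A sketch only; the `send` callable stands in for your actual HTTP call:

```python
import time
from collections import deque

WRITE_LIMIT_PER_MINUTE = 500

class WriteQueue:
    """Queue write requests and send them no faster than the limit allows."""
    def __init__(self, limit=WRITE_LIMIT_PER_MINUTE):
        self.min_interval = 60.0 / limit  # seconds between sends (0.12s at 500/min)
        self.pending = deque()
        self.last_sent = 0.0

    def submit(self, request):
        self.pending.append(request)

    def drain(self, send, sleep=time.sleep):
        """Send all queued requests in order, sleeping between sends as needed."""
        while self.pending:
            wait = self.min_interval - (time.time() - self.last_sent)
            if wait > 0:
                sleep(wait)
            send(self.pending.popleft())
            self.last_sent = time.time()
```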
For applications with high request volumes, implement a queue system.

Rate Limit Architecture
Understanding how rate limits are implemented can help you optimize your usage:

User-Based Limiting
- Rate limits are applied per user account
- If a user has multiple API keys, they all share the same rate limit pool
- Limits reset on a fixed window basis
Operation Classification
- Safe operations (GET requests) have higher limits since they don’t modify data
- Unsafe operations (POST/PATCH/DELETE) have lower limits due to computational cost
- Global limits prevent any single user from overwhelming the system
Fixed Windows
- Rate limits use fixed time windows with discrete reset points
- Windows are calculated per minute (60 seconds) and per hour (3600 seconds)
- Limits reset completely at the start of each new window period
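Fixed windows like this can be mirrored client-side. The counter below is our own illustration of the mechanism, not the server's implementation:

```python
import time

class FixedWindowCounter:
    """Fixed-window rate counter: the budget resets completely at each
    window boundary (every `window_seconds`)."""
    def __init__(self, limit, window_seconds=60):
        self.limit = limit
        self.window_seconds = window_seconds
        self.window = None
        self.count = 0

    def allow(self, now=None):
        now = time.time() if now is None else now
        window = int(now // self.window_seconds)
        if window != self.window:  # crossed a boundary: discrete reset
            self.window = window
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False
```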
Common Rate Limiting Scenarios
Scenario 1: Bulk Dub Creation
When creating many dubs:

- Each dub creation counts against the 500/minute unsafe limit
- Each dub requires 3 operations: create, preprocess, generate
- Maximum ~166 complete dubs per minute (500 ÷ 3 operations)
Scenario 2: Status Polling
When polling multiple dubs:

- Each status check counts against the 10,000/minute safe limit
- Poll maximum once every 3 seconds per dub
- Theoretical maximum: ~500 concurrent dubs being polled (10,000 ÷ 20 polls per dub per minute)
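The ~500 figure follows directly from the limits above:

```python
SAFE_LIMIT_PER_MINUTE = 10_000
POLL_INTERVAL_SECONDS = 3

polls_per_dub_per_minute = 60 // POLL_INTERVAL_SECONDS  # 20 polls per dub
max_concurrent_dubs = SAFE_LIMIT_PER_MINUTE // polls_per_dub_per_minute
print(max_concurrent_dubs)  # 500
```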
Scenario 3: Voice Discovery
When browsing voices:

- Voice listing counts against the 10,000/minute safe limit
- Use pagination and filtering to reduce total requests
- Cache results locally when possible
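A sketch of local caching for voice listings; the `fetch` callable (your wrapper around `GET /v1/voices`) and the TTL value are assumptions, not part of the API:

```python
import time

class VoiceCache:
    """Cache a voice listing locally so repeated lookups within the TTL
    don't spend requests against the safe-operation limit."""
    def __init__(self, fetch, ttl_seconds=300):
        self.fetch = fetch          # your function that calls GET /v1/voices
        self.ttl = ttl_seconds
        self._value = None
        self._expires = 0.0

    def get(self, now=None):
        now = time.time() if now is None else now
        if self._value is None or now >= self._expires:
            self._value = self.fetch()      # cache miss or expired: refetch
            self._expires = now + self.ttl
        return self._value
```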
Error Recovery
When implementing error recovery, consider the rate limit context.

Rate Limit Testing
When testing your integration:

Development Testing
- Use smaller request volumes to test rate limit handling
- Simulate rate limit responses for robust error handling
- Note that multiple API keys for the same user share rate limits
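One simple way to simulate rate limit responses in tests: a stub client that returns 429 for its first few calls, exercised by a minimal retry loop. All names here are our own test scaffolding, not part of the API:

```python
class FlakyStub:
    """Test double: returns 429 for the first `failures` calls, then 200."""
    def __init__(self, failures=2):
        self.failures = failures
        self.calls = 0

    def request(self):
        self.calls += 1
        if self.calls <= self.failures:
            return {"status": 429, "headers": {"Retry-After": "0"}}
        return {"status": 200, "headers": {"X-RateLimit-Remaining": "100"}}

def call_with_retries(stub, max_retries=5):
    """Minimal retry loop for exercising 429 handling in tests."""
    for _ in range(max_retries):
        resp = stub.request()
        if resp["status"] != 429:
            return resp
    raise RuntimeError("still rate limited")

stub = FlakyStub(failures=2)
assert call_with_retries(stub)["status"] == 200
assert stub.calls == 3  # two simulated 429s, then success
```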
Load Testing
- Gradually increase request rates to find optimal throughput
- Monitor rate limit headers during testing
- Test error recovery mechanisms under rate limiting
Support and Monitoring
Real-time Monitoring
Monitor your rate limit usage through:

- API response headers on every request
- Application logging and metrics
Getting Help
If you consistently hit rate limits for legitimate usage:

- Contact support at [email protected]
- Provide details about your use case and request patterns
Rate limits are designed to ensure fair usage and system stability. Most applications should work comfortably within these limits with proper implementation of polling delays and error handling.