Rate Limit Structure
The API uses a tiered rate limiting system with different limits based on operation type:

Global Rate Limit
- 15,000 requests per minute
- Applied to all requests across all endpoints
- Provides a safety buffer above individual endpoint limits
Read Operations (Safe Operations)
- 10,000 requests per minute
- Applied to GET requests that don’t modify data
- Includes:
  - Polling dub status: `GET /v1/me/dubs/{dubId}`
  - Polling voice status: `GET /v1/me/voices/{voiceId}`
  - Listing voices: `GET /v1/voices`
  - Getting account details: `GET /v1/me`
Write Operations (Unsafe Operations)
- 500 requests per minute
- 20,000 requests per hour
- Applied to POST, PATCH, DELETE requests that create or modify data
- Includes:
  - Creating dubs: `POST /v1/me/dubs`
  - Starting dub processing: `POST /v1/me/dubs/{dubId}/preprocess`
  - Starting dub generation: `POST /v1/me/dubs/{dubId}/generate`
  - Creating custom voices: `POST /v1/me/voices`
  - Starting voice training: `POST /v1/me/voices/{voiceId}/clone`
Rate Limit Headers
When you make requests, the API returns headers with current rate limit information.

Important: Since multiple rate limits may apply to each request (e.g. global and time-window limits), the headers reflect the most restrictive rate limit that applies to your request. This ensures you’re aware of whichever limit you’re closest to exceeding.
Header Descriptions
| Header | Description |
|---|---|
| `X-RateLimit-Limit` | Maximum number of requests allowed in the current window for the most restrictive limit |
| `X-RateLimit-Remaining` | Number of requests remaining in the current window for the most restrictive limit |
| `X-RateLimit-Reset` | Unix timestamp (in seconds) when the most restrictive rate limit window resets |
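As a sketch of how a client might act on these headers, here is a helper that decides how long to pause once the remaining budget hits zero. It assumes a generic HTTP client that exposes response headers as a string-keyed dict; the function name and header values below are our own illustration, not API output:

```python
import time

def seconds_until_reset(headers, now=None):
    """Return how long to wait before the next request, based on the
    most-restrictive-limit headers; 0 if budget remains."""
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", "1"))
    if remaining > 0:
        return 0.0
    reset = int(headers.get("X-RateLimit-Reset", "0"))
    return max(0.0, reset - now)

# Illustrative header values (not real API output):
headers = {
    "X-RateLimit-Limit": "10000",
    "X-RateLimit-Remaining": "0",
    "X-RateLimit-Reset": "1700000060",
}
print(seconds_until_reset(headers, now=1700000000))  # 60
```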
Rate Limit Exceeded Response
When you exceed a rate limit, you’ll receive a `429 Too Many Requests` response.
Additional Headers on 429 Response
The `Retry-After` header indicates how many seconds to wait before making another request.
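A small helper for this: honor `Retry-After` when present, otherwise fall back to exponential backoff. This is a sketch; the function and parameter names are our own:

```python
def retry_delay(headers, attempt, base_delay=1.0, max_delay=60.0):
    """Seconds to wait after a 429: prefer the server's Retry-After hint,
    else back off exponentially, capped at max_delay."""
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return float(retry_after)
    return min(max_delay, base_delay * 2 ** attempt)

print(retry_delay({"Retry-After": "7"}, attempt=0))  # 7.0
print(retry_delay({}, attempt=3))                    # 8.0
```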
Best Practices
1. Implement Exponential Backoff
When you receive a 429 response, back off exponentially between retries.

2. Monitor Rate Limit Headers
Check rate limit headers proactively to avoid hitting limits; remember that the headers show the most restrictive limit.

3. Respect Polling Guidelines
For status polling, follow these guidelines:

- Maximum polling frequency: once every 3 seconds
- Use appropriate delays: Don’t poll continuously
- Implement jitter: Add random delays to avoid thundering herds
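The first three practices can be combined into one retry loop: exponential backoff with jitter on 429s, plus a proactive check of the rate limit headers on success. This is a sketch under stated assumptions; `do_request` stands in for your HTTP call (any object with `status_code` and `headers`, such as a `requests.Response`, works):

```python
import random
import time

def request_with_backoff(do_request, max_retries=5, base_delay=1.0, max_delay=60.0):
    """Retry a request on 429 with exponential backoff plus jitter (sketch)."""
    for attempt in range(max_retries):
        response = do_request()
        if response.status_code != 429:
            # Proactive check: pause until reset when the budget is spent.
            remaining = int(response.headers.get("X-RateLimit-Remaining", "1"))
            if remaining == 0:
                reset = int(response.headers.get("X-RateLimit-Reset", "0"))
                time.sleep(max(0.0, reset - time.time()))
            return response
        # Prefer the server's Retry-After hint when present.
        retry_after = response.headers.get("Retry-After")
        delay = float(retry_after) if retry_after else min(max_delay, base_delay * 2 ** attempt)
        time.sleep(delay + random.uniform(0, 0.5))  # jitter avoids thundering herds
    raise RuntimeError("rate limited after %d retries" % max_retries)
```

A stubbed call that fails once and then succeeds shows the flow without touching the network.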
4. Implement Request Queuing
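One way to do this is a minimal client-side queue that spaces write requests evenly under the 500/minute limit. A sketch only; the `send` callable stands in for your actual HTTP call:

```python
import time
from collections import deque

WRITE_LIMIT_PER_MINUTE = 500

class WriteQueue:
    """Queue write requests and send them no faster than the limit allows."""
    def __init__(self, limit=WRITE_LIMIT_PER_MINUTE):
        self.min_interval = 60.0 / limit  # seconds between sends (0.12s at 500/min)
        self.pending = deque()
        self.last_sent = 0.0

    def submit(self, request):
        self.pending.append(request)

    def drain(self, send, sleep=time.sleep):
        """Send all queued requests in order, sleeping between sends as needed."""
        while self.pending:
            wait = self.min_interval - (time.time() - self.last_sent)
            if wait > 0:
                sleep(wait)
            send(self.pending.popleft())
            self.last_sent = time.time()
```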
For applications with high request volumes, implement a queue system.

Rate Limit Architecture
Understanding how rate limits are implemented can help you optimize your usage:

User-Based Limiting
- Rate limits are applied per user account
- If a user has multiple API keys, they all share the same rate limit pool
- Limits reset on a fixed window basis
Operation Classification
- Safe operations (GET requests) have higher limits since they don’t modify data
- Unsafe operations (POST/PATCH/DELETE) have lower limits due to computational cost
- Global limits prevent any single user from overwhelming the system
Fixed Windows
- Rate limits use fixed time windows with discrete reset points
- Windows are calculated per minute (60 seconds) and per hour (3600 seconds)
- Limits reset completely at the start of each new window period
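Fixed windows like this can be mirrored client-side. The counter below is our own illustration of the mechanism, not the server's implementation:

```python
import time

class FixedWindowCounter:
    """Fixed-window rate counter: the budget resets completely at each
    window boundary (every `window_seconds`)."""
    def __init__(self, limit, window_seconds=60):
        self.limit = limit
        self.window_seconds = window_seconds
        self.window = None
        self.count = 0

    def allow(self, now=None):
        now = time.time() if now is None else now
        window = int(now // self.window_seconds)
        if window != self.window:  # crossed a boundary: discrete reset
            self.window = window
            self.count = 0
        if self.count < self.limit:
            self.count += 1
            return True
        return False
```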
Common Rate Limiting Scenarios
Scenario 1: Bulk Dub Creation
When creating many dubs:

- Each dub creation counts against the 500/minute unsafe limit
- Each dub requires 3 operations: create, preprocess, generate
- Maximum ~166 complete dubs per minute (500 ÷ 3 operations)
Scenario 2: Status Polling
When polling multiple dubs:

- Each status check counts against the 10,000/minute safe limit
- Poll maximum once every 3 seconds per dub
- Theoretical maximum: ~500 concurrent dubs being polled (10,000 ÷ 20 polls per dub per minute)
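The ~500 figure follows directly from the limits above:

```python
SAFE_LIMIT_PER_MINUTE = 10_000
POLL_INTERVAL_SECONDS = 3

polls_per_dub_per_minute = 60 // POLL_INTERVAL_SECONDS  # 20 polls per dub
max_concurrent_dubs = SAFE_LIMIT_PER_MINUTE // polls_per_dub_per_minute
print(max_concurrent_dubs)  # 500
```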
Scenario 3: Voice Discovery
When browsing voices:

- Voice listing counts against the 10,000/minute safe limit
- Use pagination and filtering to reduce total requests
- Cache results locally when possible
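A sketch of local caching for voice listings; the `fetch` callable (your wrapper around `GET /v1/voices`) and the TTL value are assumptions, not part of the API:

```python
import time

class VoiceCache:
    """Cache a voice listing locally so repeated lookups within the TTL
    don't spend requests against the safe-operation limit."""
    def __init__(self, fetch, ttl_seconds=300):
        self.fetch = fetch          # your function that calls GET /v1/voices
        self.ttl = ttl_seconds
        self._value = None
        self._expires = 0.0

    def get(self, now=None):
        now = time.time() if now is None else now
        if self._value is None or now >= self._expires:
            self._value = self.fetch()      # cache miss or expired: refetch
            self._expires = now + self.ttl
        return self._value
```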
Error Recovery
When implementing error recovery, consider the rate limit context.

Rate Limit Testing
When testing your integration:

Development Testing
- Use smaller request volumes to test rate limit handling
- Simulate rate limit responses for robust error handling
- Note that multiple API keys for the same user share rate limits
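One simple way to simulate rate limit responses in tests: a stub client that returns 429 for its first few calls, exercised by a minimal retry loop. All names here are our own test scaffolding, not part of the API:

```python
class FlakyStub:
    """Test double: returns 429 for the first `failures` calls, then 200."""
    def __init__(self, failures=2):
        self.failures = failures
        self.calls = 0

    def request(self):
        self.calls += 1
        if self.calls <= self.failures:
            return {"status": 429, "headers": {"Retry-After": "0"}}
        return {"status": 200, "headers": {"X-RateLimit-Remaining": "100"}}

def call_with_retries(stub, max_retries=5):
    """Minimal retry loop for exercising 429 handling in tests."""
    for _ in range(max_retries):
        resp = stub.request()
        if resp["status"] != 429:
            return resp
    raise RuntimeError("still rate limited")

stub = FlakyStub(failures=2)
assert call_with_retries(stub)["status"] == 200
assert stub.calls == 3  # two simulated 429s, then success
```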
Load Testing
- Gradually increase request rates to find optimal throughput
- Monitor rate limit headers during testing
- Test error recovery mechanisms under rate limiting
Support and Monitoring
Real-time Monitoring
Monitor your rate limit usage through:

- API response headers on every request
- Application logging and metrics
Getting Help
If you consistently hit rate limits for legitimate usage:

- Contact support at [email protected]
- Provide details about your use case and request patterns
Rate limits are designed to ensure fair usage and system stability. Most applications should work comfortably within these limits with proper implementation of polling delays and error handling.