The VoiceDub API implements comprehensive rate limiting to ensure fair usage and optimal performance for all users. This guide explains how rate limits work, how to handle them, and best practices for staying within limits.

Rate Limit Structure

The API uses a tiered rate limiting system with different limits based on operation type:

Global Rate Limit

  • 15,000 requests per minute
  • Applied to all requests across all endpoints
  • Provides a safety buffer above individual endpoint limits

Read Operations (Safe Operations)

  • 10,000 requests per minute
  • Applied to GET requests that don’t modify data
  • Includes:
    • Polling dub status: GET /v1/me/dubs/{dubId}
    • Polling voice status: GET /v1/me/voices/{voiceId}
    • Listing voices: GET /v1/voices
    • Getting account details: GET /v1/me

Write Operations (Unsafe Operations)

  • 500 requests per minute
  • 20,000 requests per hour
  • Applied to POST, PATCH, DELETE requests that create or modify data
  • Includes:
    • Creating dubs: POST /v1/me/dubs
    • Starting dub processing: POST /v1/me/dubs/{dubId}/preprocess
    • Starting dub generation: POST /v1/me/dubs/{dubId}/generate
    • Creating custom voices: POST /v1/me/voices
    • Starting voice training: POST /v1/me/voices/{voiceId}/clone
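
For client-side budgeting, these tiers can be mirrored as local constants. The sketch below simply restates the published limits; the object and field names are illustrative, not part of the API:
// Published rate limit tiers (requests per window); names are illustrative only
const RATE_LIMITS = {
  global:      { limit: 15000, windowSeconds: 60 },   // all requests
  read:        { limit: 10000, windowSeconds: 60 },   // GET (safe) operations
  writeMinute: { limit: 500,   windowSeconds: 60 },   // POST/PATCH/DELETE (unsafe) operations
  writeHour:   { limit: 20000, windowSeconds: 3600 }  // hourly cap on unsafe operations
};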

Rate Limit Headers

When you make requests, the API returns headers with current rate limit information:
X-RateLimit-Limit: 500
X-RateLimit-Remaining: 498
X-RateLimit-Reset: 1704067200
Important: Since multiple rate limits may apply to each request (e.g. global and time-window limits), the headers reflect the most restrictive rate limit that applies to your request. This ensures you’re aware of whichever limit you’re closest to exceeding.

Header Descriptions

  • X-RateLimit-Limit: Maximum number of requests allowed in the current window for the most restrictive limit
  • X-RateLimit-Remaining: Number of requests remaining in the current window for the most restrictive limit
  • X-RateLimit-Reset: Unix timestamp (in seconds) when the most restrictive rate limit window resets

Rate Limit Exceeded Response

When you exceed a rate limit, you’ll receive a 429 Too Many Requests response:
{
  "code": "rate_limit_exceeded",
  "message": "Rate limit exceeded. Please wait before making more requests."
}

Additional Headers on 429 Response

HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 500
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1704067200
Retry-After: 30
The Retry-After header indicates how many seconds to wait before making another request.

Best Practices

1. Implement Exponential Backoff

When you receive a 429 response, implement exponential backoff:
async function makeRequestWithBackoff(url, options, maxRetries = 3) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      const response = await fetch(url, options);
      
      if (response.status === 429) {
        if (attempt === maxRetries) {
          throw new Error('Max retries exceeded');
        }
        
        const retryAfter = response.headers.get('Retry-After');
        const delay = retryAfter ? parseInt(retryAfter) * 1000 : Math.pow(2, attempt) * 1000;
        
        await new Promise(resolve => setTimeout(resolve, delay));
        continue;
      }
      
      return response;
    } catch (error) {
      if (attempt === maxRetries) throw error;
    }
  }
}

2. Monitor Rate Limit Headers

Check rate limit headers proactively to avoid hitting limits. Remember, headers show the most restrictive limit:
function checkRateLimit(response) {
  const remaining = parseInt(response.headers.get('X-RateLimit-Remaining'));
  const resetTime = parseInt(response.headers.get('X-RateLimit-Reset'));
  
  if (remaining < 10) {
    const resetDate = new Date(resetTime * 1000);
    console.warn(`Rate limit low: ${remaining} requests remaining until ${resetDate}`);
  }
}

3. Respect Polling Guidelines

For status polling, follow these guidelines:
  • Maximum polling frequency: Once every 3 seconds
  • Use appropriate delays: Don’t poll continuously
  • Implement jitter: Add random delays to avoid thundering herds
async function pollDubStatus(dubId, maxAttempts = 100) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const response = await fetch(`https://api.voicedub.ai/v1/me/dubs/${dubId}`, {
      headers: { 'Authorization': `Api-Key ${apiKey}` }
    });
    
    const dub = await response.json();
    
    if (dub.status === 'completed' || dub.status === 'failed') {
      return dub;
    }
    
    // Wait 3-5 seconds with jitter
    const delay = 3000 + Math.random() * 2000;
    await new Promise(resolve => setTimeout(resolve, delay));
  }
  
  throw new Error('Polling timeout');
}

4. Implement Request Queuing

For applications with high request volumes, implement a queue system:
class APIQueue {
  constructor(maxConcurrent = 5, delayBetweenRequests = 100) {
    this.queue = [];
    this.active = 0;
    this.maxConcurrent = maxConcurrent;
    this.delay = delayBetweenRequests;
  }
  
  async enqueue(requestFn) {
    return new Promise((resolve, reject) => {
      this.queue.push({ requestFn, resolve, reject });
      this.process();
    });
  }
  
  async process() {
    if (this.active >= this.maxConcurrent || this.queue.length === 0) {
      return;
    }
    
    this.active++;
    const { requestFn, resolve, reject } = this.queue.shift();
    
    try {
      const result = await requestFn();
      resolve(result);
    } catch (error) {
      reject(error);
    } finally {
      this.active--;
      setTimeout(() => this.process(), this.delay);
    }
  }
}

Rate Limit Architecture

Understanding how rate limits are implemented can help you optimize your usage:

User-Based Limiting

  • Rate limits are applied per user account
  • If a user has multiple API keys, they all share the same rate limit pool
  • Limits reset on a fixed window basis

Operation Classification

  • Safe operations (GET requests) have higher limits since they don’t modify data
  • Unsafe operations (POST/PATCH/DELETE) have lower limits due to computational cost
  • Global limits prevent any single user from overwhelming the system
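
As a rough client-side illustration of this classification, a request's per-minute budget can be derived from its HTTP method. This is a local convention for budgeting, not an API feature:
// GET requests draw on the 10,000/minute read budget; POST, PATCH and DELETE
// draw on the 500/minute (and 20,000/hour) write budget.
function perMinuteBudget(method) {
  return method.toUpperCase() === 'GET' ? 10000 : 500;
}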

Fixed Windows

  • Rate limits use fixed time windows with discrete reset points
  • Windows are calculated per minute (60 seconds) and per hour (3600 seconds)
  • Limits reset completely at the start of each new window period
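
A minimal sketch of working with these discrete reset points: X-RateLimit-Reset is a Unix timestamp in seconds, so the time left in the current window can be computed directly from it.
function secondsUntilReset(response) {
  const resetTimestamp = parseInt(response.headers.get('X-RateLimit-Reset'), 10);
  const nowSeconds = Math.floor(Date.now() / 1000);
  // A non-positive difference means a fresh window has already started
  return Math.max(0, resetTimestamp - nowSeconds);
}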

Common Rate Limiting Scenarios

Scenario 1: Bulk Dub Creation

When creating many dubs:
  • Each dub creation counts against the 500/minute unsafe limit
  • Each dub requires 3 operations: create, preprocess, generate
  • Maximum ~166 complete dubs per minute (500 ÷ 3 ≈ 166), as sketched below
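
One way to keep a bulk job inside that budget is to push every write call through the APIQueue from the best practices above. In this sketch, createDub, preprocessDub, and generateDub are hypothetical wrappers around the three endpoints listed earlier:
// Tune concurrency and spacing so total write traffic stays under 500/minute
const writeQueue = new APIQueue(2, 400);

async function runDubPipeline(dubInput) {
  // createDub is assumed to resolve with the new dub's ID
  const dubId = await writeQueue.enqueue(() => createDub(dubInput));
  // In practice you may also need to poll dub status between these steps
  await writeQueue.enqueue(() => preprocessDub(dubId));
  await writeQueue.enqueue(() => generateDub(dubId));
  return dubId;
}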

Scenario 2: Status Polling

When polling multiple dubs:
  • Each status check counts against the 10,000/minute safe limit
  • Poll maximum once every 3 seconds per dub
  • Theoretical maximum: ~500 concurrent dubs being polled
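
The arithmetic behind that estimate, as a quick sketch:
// Polling one dub every 3 seconds costs 60 / 3 = 20 read requests per minute.
// Against the 10,000/minute read budget that allows 10,000 / 20 = 500 dubs,
// before leaving headroom for other GET traffic such as listing voices.
const readsPerDubPerMinute = 60 / 3;                 // 20
const maxPolledDubs = 10000 / readsPerDubPerMinute;  // 500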

Scenario 3: Voice Discovery

When browsing voices:
  • Voice listing counts against the 10,000/minute safe limit
  • Use pagination and filtering to reduce total requests
  • Cache results locally when possible
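
A minimal caching sketch for the voice list, assuming apiKey holds your API key; the 5-minute TTL is an arbitrary choice, not a documented value:
const VOICES_TTL_MS = 5 * 60 * 1000; // arbitrary 5-minute cache lifetime
let voicesCache = null;              // { data, fetchedAt }

async function listVoicesCached() {
  const now = Date.now();
  if (voicesCache && now - voicesCache.fetchedAt < VOICES_TTL_MS) {
    return voicesCache.data; // served locally, no request spent
  }
  const response = await fetch('https://api.voicedub.ai/v1/voices', {
    headers: { 'Authorization': `Api-Key ${apiKey}` }
  });
  const data = await response.json();
  voicesCache = { data, fetchedAt: now };
  return data;
}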

Error Recovery

When implementing error recovery, consider the rate limit context:
function isRateLimitError(error) {
  return error.response?.status === 429 && 
         error.response?.data?.code === 'rate_limit_exceeded';
}

async function handleRateLimitError(error, retryFn) {
  if (!isRateLimitError(error)) {
    throw error;
  }
  
  const retryAfter = error.response.headers['retry-after'];
  const delay = retryAfter ? parseInt(retryAfter) * 1000 : 60000; // Default 1 minute
  
  console.log(`Rate limited. Retrying after ${delay}ms`);
  await new Promise(resolve => setTimeout(resolve, delay));
  
  return retryFn();
}

Rate Limit Testing

When testing your integration:

Development Testing

  • Use smaller request volumes to test rate limit handling
  • Simulate rate limit responses for robust error handling
  • Note that multiple API keys for the same user share rate limits
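
One way to simulate 429 responses without burning real quota is to wrap fetch with a stub that fails a fixed number of times and then delegates to the network. This is only a testing sketch; swapping it in (for example by temporarily assigning globalThis.fetch) lets you exercise makeRequestWithBackoff:
// Answers the first `failures` calls with a 429 carrying a short Retry-After,
// then falls through to the real fetch. Requires a runtime with the Response global.
function makeRateLimitedFetch(failures = 2) {
  let calls = 0;
  return async (url, options) => {
    if (calls++ < failures) {
      return new Response(
        JSON.stringify({ code: 'rate_limit_exceeded', message: 'Rate limit exceeded.' }),
        { status: 429, headers: { 'Retry-After': '1' } }
      );
    }
    return fetch(url, options);
  };
}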

Load Testing

  • Gradually increase request rates to find optimal throughput
  • Monitor rate limit headers during testing
  • Test error recovery mechanisms under rate limiting

Support and Monitoring

Real-time Monitoring

Monitor your rate limit usage through:
  • API response headers on every request
  • Application logging and metrics

Getting Help

If you consistently hit rate limits for legitimate usage:
  • Contact support at [email protected]
  • Provide details about your use case and request patterns
Remember that if you have multiple API keys for the same user account, they all share the same rate limits. Creating additional API keys will not increase your rate limits.
Rate limits are designed to ensure fair usage and system stability. Most applications should work comfortably within these limits with proper implementation of polling delays and error handling.