Transform any audio into a different voice using our library of 10,000+ voices or your own custom-trained models. This guide walks you through the complete dubbing process, from audio input to final output.

Overview

Creating a dub involves three main steps:

1. Create the dub: initialize a new dub with your target voice and audio source.
2. Preprocess the audio: analyze and prepare the audio for voice conversion.
3. Generate the dub: apply the voice transformation and download your result.

Step 1: Choose a Voice

First, browse our available voices to find the perfect match for your project:
curl -X GET 'https://api.voicedub.ai/v1/voices?limit=20&styles=singing' \
  -H 'Authorization: Api-Key YOUR_API_KEY'
{
  "voices": [
    {
      "id": "123e4567-e89b-12d3-a456-426614174000",
      "name": "Voice A",
      "languages": ["english"],
      "genres": ["pop", "rock"],
      "styles": ["singing", "speaking"],
      "avatarUrl": "https://example.com/avatar.webp",
      ...
    },
    ...
  ],
  "nextCursor": "2_123e4567-e89b-12d3-a456-426614174001"
}

Voice Filters

You can filter voices by several criteria:
Search for voices by name (e.g., “plankton”, “spongebob”)
genres
array
Filter by musical genres: pop, rap, rock, country, rnb, alternative, jazz
languages
array
Filter by supported languages: english, spanish, korean, japanese, french, russian, thai
styles
array
Filter by voice styles: singing, rapping, speaking, character, other
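
The filters above can be combined into a single request URL. The following is a minimal sketch, not an official client: buildVoicesUrl is a hypothetical helper, and sending several values for one filter as a comma-separated string is an assumption (this guide only shows the single-value form, styles=singing). The allowed values are copied from the lists above.

```javascript
// Allowed values copied from the filter lists in this guide.
const ALLOWED = {
  genres: ['pop', 'rap', 'rock', 'country', 'rnb', 'alternative', 'jazz'],
  languages: ['english', 'spanish', 'korean', 'japanese', 'french', 'russian', 'thai'],
  styles: ['singing', 'rapping', 'speaking', 'character', 'other']
};

// Hypothetical helper: builds a /v1/voices URL from a filter object
// such as { styles: ['singing'], genres: ['pop'] }.
function buildVoicesUrl(filters = {}, limit = 20) {
  const params = new URLSearchParams({ limit: String(limit) });
  for (const [name, values] of Object.entries(filters)) {
    if (!Object.prototype.hasOwnProperty.call(ALLOWED, name)) {
      throw new Error(`Unknown filter: ${name}`);
    }
    for (const value of values) {
      if (!ALLOWED[name].includes(value)) {
        throw new Error(`Unsupported ${name} value: ${value}`);
      }
    }
    // ASSUMPTION: multiple values are sent comma-separated.
    if (values.length > 0) params.set(name, values.join(','));
  }
  return `https://api.voicedub.ai/v1/voices?${params.toString()}`;
}

console.log(buildVoicesUrl({ styles: ['singing'], genres: ['pop'] }));
```

Validating locally catches typos in filter values before a request is sent; the API itself remains the source of truth for which values are accepted.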

Step 2: Create the Dub

Once you’ve selected a voice, create a new dub. You can either provide an audio URL or upload a file:

Option A: Using a URL

curl -X POST 'https://api.voicedub.ai/v1/me/dubs' \
  -H 'Authorization: Api-Key YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "voiceId": "123e4567-e89b-12d3-a456-426614174000",
    "link": "https://example.com/audio.mp3",
    "separate": true,
    "pitchShift": 0
  }'

Option B: Uploading a File

curl -X POST 'https://api.voicedub.ai/v1/me/dubs' \
  -H 'Authorization: Api-Key YOUR_API_KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "voiceId": "123e4567-e89b-12d3-a456-426614174000",
    "separate": true,
    "pitchShift": 0
  }'
{
  "dub": {
    "id": "456e7890-e89b-12d3-a456-426614174000",
    "voiceId": "123e4567-e89b-12d3-a456-426614174000",
    "status": "new",
    "separate": true,
    "pitchShift": 0,
    "inputSource": "upload",
    "createdAt": "2024-01-15T10:00:00.000Z",
    ...
  },
  "uploadUrl": "https://s3.amazonaws.com/bucket/uploads/...?signature=..."
}
If you chose the upload option, use the provided uploadUrl to upload your audio file:
curl -X PUT 'UPLOAD_URL_FROM_RESPONSE' \
  -H 'Content-Type: audio/mpeg' \
  --data-binary @your-audio-file.mp3
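
The two upload requests above can be chained in Node.js. This is a sketch under stated assumptions: createDubFromFile is a hypothetical helper, the file is assumed to be an MP3 (hence the audio/mpeg content type from the curl example), and Node.js 18 or later is assumed for the global fetch. The fetchImpl parameter exists only so the flow can be exercised without network access.

```javascript
// Hypothetical helper: creates a dub without a link, then uploads a local file
// to the returned uploadUrl.
async function createDubFromFile(apiKey, voiceId, filePath, fetchImpl = fetch) {
  const { readFile } = await import('node:fs/promises');

  // Omitting "link" requests an upload-based dub, as in Option B above.
  const createResponse = await fetchImpl('https://api.voicedub.ai/v1/me/dubs', {
    method: 'POST',
    headers: {
      'Authorization': `Api-Key ${apiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ voiceId, separate: true, pitchShift: 0 })
  });
  if (!createResponse.ok) {
    throw new Error(`Create failed with HTTP ${createResponse.status}`);
  }
  const { dub, uploadUrl } = await createResponse.json();

  // PUT the raw bytes to the upload URL. No Authorization header is sent here,
  // matching the curl example.
  const uploadResponse = await fetchImpl(uploadUrl, {
    method: 'PUT',
    headers: { 'Content-Type': 'audio/mpeg' },
    body: await readFile(filePath)
  });
  if (!uploadResponse.ok) {
    throw new Error(`Upload failed with HTTP ${uploadResponse.status}`);
  }
  return dub;
}
```

A call might look like createDubFromFile(process.env.VOICEDUB_API_KEY, voiceId, './your-audio-file.mp3'), after which the returned dub.id can be passed to the preprocessing step.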

Parameters Explained

voiceId
string
required
UUID of the voice model to use for dubbing. Get this from the /v1/voices endpoint.
link
string
URL of the audio or video file to dub. Alternative to file upload. Supports MP3, WAV, M4A, and video formats.
separate
boolean
default:true
Whether to separate vocals from backing track. Set to false if your audio is already isolated vocals.
pitchShift
integer
default:0
Pitch adjustment in semitones (-24 to +24). Positive values increase pitch, negative values decrease it.
Duration Limit: Audio files are limited to a maximum of 30 minutes. Longer files are truncated during preprocessing.
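
The parameter rules above can be checked on the client before a request is sent. This is an illustrative sketch: buildDubRequest is a hypothetical helper, not part of the API, and it only encodes the defaults and the pitch range documented in this section.

```javascript
// Hypothetical helper: validates parameters and applies the documented defaults
// (separate: true, pitchShift: 0).
function buildDubRequest({ voiceId, link, separate = true, pitchShift = 0 }) {
  if (typeof voiceId !== 'string' || voiceId.length === 0) {
    throw new Error('voiceId is required');
  }
  if (!Number.isInteger(pitchShift) || pitchShift < -24 || pitchShift > 24) {
    throw new Error('pitchShift must be an integer between -24 and 24');
  }
  const body = { voiceId, separate: Boolean(separate), pitchShift };
  // Include link only for URL-based dubs; leave it out for file uploads.
  if (link !== undefined) body.link = link;
  return body;
}

console.log(JSON.stringify(buildDubRequest({
  voiceId: '123e4567-e89b-12d3-a456-426614174000',
  link: 'https://example.com/audio.mp3',
  pitchShift: -2
})));
```

The returned object can be passed to JSON.stringify and used as the body of the POST request shown in Step 2.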

Step 3: Preprocess the Dub

After creating the dub, start preprocessing to analyze the audio:
curl -X POST 'https://api.voicedub.ai/v1/me/dubs/456e7890-e89b-12d3-a456-426614174000/preprocess' \
  -H 'Authorization: Api-Key YOUR_API_KEY'
{
  "dub": {
    "status": "preprocessing",
    "requiredCredits": null,
    ...
  }
}

Step 4: Monitor Progress

Poll the dub status to track preprocessing progress:
curl -X GET 'https://api.voicedub.ai/v1/me/dubs/456e7890-e89b-12d3-a456-426614174000' \
  -H 'Authorization: Api-Key YOUR_API_KEY'
{
  "status": "preprocessed",
  "inputDuration": 180000,
  "requiredCredits": 30,
  ...
}

Status Values

  • new - Dub created, waiting for preprocessing
  • preprocessing - Analyzing audio and preparing for generation
  • preprocessed - Ready for generation (check requiredCredits)
  • queued - Waiting in generation queue
  • starting - Generation starting
  • processing - Generating the dub
  • finalizing - Completing and uploading result
  • done - Complete! Download URL available
  • error - Something went wrong (check errorMessage)
Poll the dub status at most once every 3 seconds to avoid rate limiting.
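
The status list and the polling interval above suggest a small reusable polling loop. The sketch below makes several assumptions: pollDub and its options are hypothetical names, the response carries status at the top level as in the Step 4 sample, and the attempt cap is an arbitrary safety limit rather than a documented value. fetchImpl and sleep are injectable only to make the loop testable.

```javascript
// Hypothetical helper: polls a dub until it reaches one of `targets`,
// and throws if the dub enters the error state.
async function pollDub(apiKey, dubId, targets, {
  intervalMs = 3000,
  maxAttempts = 200, // arbitrary safety cap, not a documented limit
  fetchImpl = fetch,
  sleep = ms => new Promise(resolve => setTimeout(resolve, ms))
} = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const response = await fetchImpl(`https://api.voicedub.ai/v1/me/dubs/${dubId}`, {
      headers: { 'Authorization': `Api-Key ${apiKey}` }
    });
    if (!response.ok) {
      throw new Error(`Status check failed with HTTP ${response.status}`);
    }
    const dub = await response.json();
    if (dub.status === 'error') {
      throw new Error(`Dub failed: ${dub.errorMessage}`);
    }
    if (targets.includes(dub.status)) return dub;
    // Never wait less than 3 seconds between checks.
    await sleep(Math.max(intervalMs, 3000));
  }
  throw new Error(`Gave up after ${maxAttempts} status checks`);
}
```

Calling pollDub(apiKey, dubId, ['preprocessed']) waits for preprocessing to finish, and pollDub(apiKey, dubId, ['done']) waits for the finished dub after generation has started.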

Step 5: Generate the Final Dub

Once preprocessing is complete and you’ve confirmed you have enough credits, start generation:
curl -X POST 'https://api.voicedub.ai/v1/me/dubs/456e7890-e89b-12d3-a456-426614174000/generate' \
  -H 'Authorization: Api-Key YOUR_API_KEY'
{
  "dub": {
    "status": "queued",
    "apiCreditsUsed": 30,
    "apiCreditsLeft": 970,
    ...
  }
}

Step 6: Download Your Dub

Continue polling until the status is done, then download your result:
{
  "status": "done",
  "completedAt": "2024-01-15T10:30:00.000Z",
  "dubUrl": "https://example.com/dubs/456e7890-e89b-12d3-a456-426614174000.mp3",
  ...
}
Download Dub:
curl -X GET 'DUB_URL_FROM_RESPONSE' \
  --output my-voice-dub.mp3
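
The same download can be done from Node.js. This is a minimal sketch assuming Node.js 18 or later for the global fetch; downloadDub is a hypothetical helper, and fetchImpl is injectable only for testing.

```javascript
// Hypothetical helper: downloads the finished dub and writes it to disk.
// Returns the number of bytes written.
async function downloadDub(dubUrl, outputPath, fetchImpl = fetch) {
  const { writeFile } = await import('node:fs/promises');
  const response = await fetchImpl(dubUrl);
  if (!response.ok) {
    throw new Error(`Download failed with HTTP ${response.status}`);
  }
  // Buffers the whole file in memory before writing it out.
  const bytes = Buffer.from(await response.arrayBuffer());
  await writeFile(outputPath, bytes);
  return bytes.length;
}
```

For example, downloadDub(dubData.dubUrl, 'my-voice-dub.mp3') saves the result under the same file name as the curl command above.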

Complete Example

Here’s a complete Node.js example that creates and processes a dub:
// Requires Node.js 18+ (global fetch)
const apiKey = process.env.VOICEDUB_API_KEY;
const baseUrl = 'https://api.voicedub.ai';

// Send an authenticated request and fail fast on non-2xx responses
async function api(path, options = {}) {
  const response = await fetch(`${baseUrl}${path}`, {
    ...options,
    headers: { 'Authorization': `Api-Key ${apiKey}`, ...options.headers }
  });
  if (!response.ok) {
    throw new Error(`${options.method || 'GET'} ${path} failed with HTTP ${response.status}`);
  }
  return response.json();
}

// Poll every 3 seconds until the dub reaches the target status
async function waitForStatus(dubId, targetStatus) {
  while (true) {
    await new Promise(resolve => setTimeout(resolve, 3000)); // Wait 3 seconds

    const dubData = await api(`/v1/me/dubs/${dubId}`);
    console.log('Status:', dubData.status);

    if (dubData.status === targetStatus) return dubData;
    if (dubData.status === 'error') {
      throw new Error(`Dub failed: ${dubData.errorMessage}`);
    }
  }
}

async function createVoiceDub(voiceId, audioUrl) {
  // Step 1: Create the dub
  const { dub } = await api('/v1/me/dubs', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      voiceId: voiceId,
      link: audioUrl,
      separate: true
    })
  });
  console.log('Dub created:', dub.id);

  // Step 2: Start preprocessing
  await api(`/v1/me/dubs/${dub.id}/preprocess`, { method: 'POST' });

  // Step 3: Wait for preprocessing (throws if the dub enters the error state)
  const preprocessed = await waitForStatus(dub.id, 'preprocessed');
  console.log(`Credits required: ${preprocessed.requiredCredits}`);

  // Step 4: Start generation
  await api(`/v1/me/dubs/${dub.id}/generate`, { method: 'POST' });

  // Step 5: Wait for completion
  const finished = await waitForStatus(dub.id, 'done');
  console.log('Dub completed! Download URL:', finished.dubUrl);
  return finished.dubUrl;
}

// Usage
createVoiceDub('123e4567-e89b-12d3-a456-426614174000', 'https://example.com/audio.mp3')
  .then(url => console.log('Success!', url))
  .catch(err => console.error('Error:', err));

Duration Limits

All audio files are limited to a maximum duration of 30 minutes. The limit is applied during the preprocessing stage:
  • Files longer than 30 minutes are automatically truncated
  • The limit helps ensure optimal processing performance
  • For longer content, consider splitting into shorter segments

Pricing & Credits

Dubs consume 10 API credits per minute of audio. The exact cost is calculated after preprocessing based on the input duration:
  • 30-second clip = 10 credits (minimum charge)
  • 3-minute song = 30 credits
  • 10-minute podcast = 100 credits
  • 30-minute file = 300 credits (maximum)
Check the requiredCredits field after preprocessing to see the exact cost before starting generation.
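
The pricing rule above can be turned into a rough local estimate. This is a sketch only: estimateCredits is a hypothetical helper, the duration is assumed to be in milliseconds (inferred from the Step 4 sample, where inputDuration 180000 pairs with requiredCredits 30), and rounding partial minutes up is an assumption that this guide does not confirm. The requiredCredits field remains the authoritative value.

```javascript
const CREDITS_PER_MINUTE = 10;
const MAX_DURATION_MS = 30 * 60 * 1000; // longer input is truncated to 30 minutes

// Hypothetical helper: estimates the credit cost from a duration in milliseconds.
function estimateCredits(durationMs) {
  const billedMs = Math.min(durationMs, MAX_DURATION_MS);
  // ASSUMPTION: partial minutes round up; the minimum charge is one minute.
  const minutes = Math.max(1, Math.ceil(billedMs / 60000));
  return minutes * CREDITS_PER_MINUTE;
}

console.log(estimateCredits(30 * 1000));      // 30-second clip
console.log(estimateCredits(3 * 60 * 1000));  // 3-minute song
console.log(estimateCredits(45 * 60 * 1000)); // 45-minute file, capped at 30 minutes
```

An estimate like this is useful for budgeting before upload, for example to decide whether a batch of files fits within the remaining credits.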

Best Practices

Follow these tips for best results:
  • Use high-quality source audio (192kbps+ MP3 or lossless formats)
  • Choose appropriate voices - match the style (singing/speaking) and genre
  • Test with shorter clips first to verify quality before processing long audio
  • Keep files under 30 minutes - longer files will be automatically truncated
  • Split long content into multiple shorter dubs for content over 30 minutes
  • Use vocal separation (separate: true) unless your audio is already isolated vocals
  • Adjust pitch if needed - some voices work better with slight pitch adjustments

Troubleshooting

Preprocessing usually takes 30-300 seconds depending on the duration of the input audio. If it’s been longer than 5 minutes, the dub may have encountered an error. Check the errorMessage field in the response.
If the output quality is poor:
  • Ensure source audio is high quality (192kbps+ for music)
  • Try a different voice that matches your content style
  • Adjust the pitchShift parameter (+/-2 semitones often helps)
  • Make sure separate is true for mixed audio (music with vocals)
Generation time depends on audio length and current queue load. Peak times may have longer queues.
Supported formats: MP3, WAV, M4A, FLAC, OGG, and most video formats (MP4, MOV, AVI). Audio is automatically extracted from video files.
Files longer than 30 minutes are automatically truncated during preprocessing. If you need to process longer content:
  • Split your audio into segments of 30 minutes or less
  • Create separate dubs for each segment
  • Combine the results using audio editing software
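
The splitting step above starts with working out where each segment begins and ends. The sketch below only plans the boundaries: planSegments is a hypothetical helper, and the actual cutting and recombining still has to be done with an audio tool of your choice.

```javascript
// Hypothetical helper: plans segment boundaries (in milliseconds) so that
// each segment stays within the 30-minute limit.
function planSegments(totalMs, maxSegmentMs = 30 * 60 * 1000) {
  const segments = [];
  for (let start = 0; start < totalMs; start += maxSegmentMs) {
    segments.push({
      startMs: start,
      endMs: Math.min(start + maxSegmentMs, totalMs)
    });
  }
  return segments;
}

// A 70-minute recording is planned as three segments: 30, 30, and 10 minutes.
console.log(planSegments(70 * 60 * 1000));
```

Fixed boundaries can fall in the middle of a word or phrase, so in practice it may be worth moving each cut to a nearby pause before creating the separate dubs.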