Overview
Creating a dub involves three main steps:
1. Create the dub: Initialize a new dub with your target voice and audio source
2. Preprocess the audio: Analyze and prepare the audio for voice conversion
3. Generate the dub: Apply the voice transformation and download your result
Step 1: Choose a Voice
First, browse the available voices through the /v1/voices endpoint to find the right match for your project.
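A request for the voice list might be assembled as in the sketch below. Only the /v1/voices path comes from this guide; the base URL, the query parameter names (search, genre, language, style), and the bearer-token header are assumptions made for illustration and should be checked against the API reference.

```javascript
// Hypothetical base URL; replace with the real API host.
const BASE_URL = "https://api.example.com";

// Build the voices URL from optional filters.
// The query parameter names are assumed, not confirmed by this guide.
function buildVoicesUrl(filters = {}) {
  const url = new URL("/v1/voices", BASE_URL);
  for (const key of ["search", "genre", "language", "style"]) {
    if (filters[key]) url.searchParams.set(key, filters[key]);
  }
  return url.toString();
}

// Fetch the voice list (Node.js 18+ provides a global fetch).
// Bearer-token authentication is an assumption.
async function listVoices(apiKey, filters) {
  const res = await fetch(buildVoicesUrl(filters), {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!res.ok) throw new Error(`Voice listing failed: HTTP ${res.status}`);
  return res.json();
}
```

Keeping URL construction separate from the network call makes the filter logic easy to check without touching the API.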
Voice Filters
You can filter voices by several criteria:
- Search for voices by name (e.g., "plankton", "spongebob")
- Filter by musical genres: pop, rap, rock, country, rnb, alternative, jazz
- Filter by supported languages: english, spanish, korean, japanese, french, russian, thai
- Filter by voice styles: singing, rapping, speaking, character, other
Step 2: Create the Dub
Once you've selected a voice, create a new dub. You can either provide an audio URL or upload a file:
- Option A: Using a URL
- Option B: Uploading a File
For Option B, use the uploadUrl returned in the response to upload your audio file.
Parameters Explained
- Voice model: UUID of the voice model to use for dubbing. Get this from the /v1/voices endpoint.
- Audio URL: URL to the audio/video file to dub. Alternative to file upload. Supports MP3, WAV, M4A, and video formats.
- separate: Whether to separate vocals from the backing track. Set to false if your audio is already isolated vocals.
- pitchShift: Pitch adjustment in semitones (-24 to +24). Positive values increase pitch, negative values decrease it.
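The parameters above could be assembled and checked client-side before sending the request. In this sketch, separate and pitchShift are the names used in this guide, while voiceModelId and audioUrl are placeholder field names, and the defaults (separate on, no pitch shift) are assumptions rather than documented behavior.

```javascript
// Build a create-dub request body and validate it against the documented rules.
// `voiceModelId` and `audioUrl` are placeholder names (assumptions).
// The defaults for `separate` and `pitchShift` are also assumptions.
function buildDubRequest({ voiceModelId, audioUrl, separate = true, pitchShift = 0 }) {
  if (!voiceModelId) throw new Error("voiceModelId is required");
  if (!Number.isFinite(pitchShift) || pitchShift < -24 || pitchShift > 24) {
    throw new RangeError("pitchShift must be between -24 and +24 semitones");
  }
  const body = { voiceModelId, separate: Boolean(separate), pitchShift };
  // Leave audioUrl out when the file will be uploaded via uploadUrl instead.
  if (audioUrl) body.audioUrl = audioUrl;
  return body;
}
```

Rejecting an out-of-range pitchShift locally avoids a round trip for a request that falls outside the documented range.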
Step 3: Preprocess the Dub
After creating the dub, start preprocessing to analyze the audio.
Step 4: Monitor Progress
Poll the dub status to track preprocessing progress.
Status Values
- new: Dub created, waiting for preprocessing
- preprocessing: Analyzing audio and preparing for generation
- preprocessed: Ready for generation (check requiredCredits)
- queued: Waiting in generation queue
- starting: Generation starting
- processing: Generating the dub
- finalizing: Completing and uploading result
- done: Complete! Download URL available
- error: Something went wrong (check errorMessage)
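One way to act on these status values is to map each one to the client's next step and poll until something other than waiting is required. This is a sketch: fetchDub is a caller-supplied function (hypothetical) that returns the dub object, the 5-second interval is an arbitrary choice, and treating new as a waiting state assumes preprocessing has already been requested.

```javascript
// Statuses that only require waiting, taken from the list above.
// "new" is included on the assumption that preprocessing was already requested.
const IN_PROGRESS = new Set([
  "new", "preprocessing", "queued", "starting", "processing", "finalizing",
]);

// Map a status string to the next action a client would take.
function nextAction(status) {
  if (status === "preprocessed") return "generate"; // check requiredCredits first
  if (status === "done") return "download";
  if (status === "error") return "fail"; // inspect errorMessage
  if (IN_PROGRESS.has(status)) return "wait";
  throw new Error(`Unknown dub status: ${status}`);
}

// Poll until the dub needs something other than waiting.
// `fetchDub` is a caller-supplied async function returning the dub object.
async function pollDub(fetchDub, intervalMs = 5000) {
  for (;;) {
    const dub = await fetchDub();
    const action = nextAction(dub.status);
    if (action === "fail") throw new Error(dub.errorMessage || "Dub failed");
    if (action !== "wait") return dub;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

Throwing on an unrecognized status surfaces API changes early instead of polling forever.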
Step 5: Generate the Final Dub
Once preprocessing is complete and you've confirmed you have enough credits, start generation.
Step 6: Download Your Dub
Continue polling until the status is done, then download your result.
Complete Example
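The end-to-end flow can be sketched in Node.js as follows. The client object and its methods (createDub, preprocess, getDub, generate) are hypothetical wrappers, each standing in for one API call, and the id field on the create response is likewise an assumption. Only the status values and the requiredCredits and errorMessage fields come from this guide.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Poll `client.getDub` until the dub reaches `target`; throw on "error".
async function waitFor(client, dubId, target, intervalMs) {
  for (;;) {
    const dub = await client.getDub(dubId);
    if (dub.status === "error") throw new Error(dub.errorMessage || "Dub failed");
    if (dub.status === target) return dub;
    await sleep(intervalMs);
  }
}

// Run the full flow. `client` is a hypothetical wrapper around the API;
// `availableCredits` is supplied by the caller.
async function runDub(client, request, { availableCredits, intervalMs = 5000 }) {
  const { id } = await client.createDub(request); // create the dub
  await client.preprocess(id); // start preprocessing
  const ready = await waitFor(client, id, "preprocessed", intervalMs);
  if (ready.requiredCredits > availableCredits) {
    throw new Error(`Need ${ready.requiredCredits} credits, have ${availableCredits}`);
  }
  await client.generate(id); // start generation
  return waitFor(client, id, "done", intervalMs); // final dub object
}
```

Passing the client in as a parameter keeps the orchestration independent of the exact endpoints and lets it be exercised with a stub.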
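A complete Node.js integration chains the steps above: create the dub, start preprocessing, poll until the status is preprocessed, start generation, then poll until the status is done and download the result.
Duration Limits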
All audio files are subject to a 30-minute maximum duration limit. This restriction applies during the preprocessing stage:
- Files longer than 30 minutes are automatically truncated
- The limit helps ensure optimal processing performance
- For longer content, consider splitting into shorter segments
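For content over the limit, the segment boundaries can be planned up front. The helper below is a minimal sketch that only computes start and end offsets in seconds; the function name is illustrative, and the actual cutting would be done with an audio tool of your choice.

```javascript
const MAX_SEGMENT_SECONDS = 30 * 60; // the documented 30-minute limit

// Return consecutive { start, end } offsets (in seconds) that cover the
// whole duration, with no segment longer than maxSeconds.
function planSegments(totalSeconds, maxSeconds = MAX_SEGMENT_SECONDS) {
  if (!(totalSeconds > 0)) throw new RangeError("totalSeconds must be positive");
  const segments = [];
  for (let start = 0; start < totalSeconds; start += maxSeconds) {
    segments.push({ start, end: Math.min(start + maxSeconds, totalSeconds) });
  }
  return segments;
}
```

A 75-minute recording, for example, yields two full 30-minute segments and one 15-minute remainder, each of which becomes its own dub.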
Pricing & Credits
Dubs consume 10 API credits per minute of audio. The exact cost is calculated after preprocessing based on the input duration:
- 30-second clip = 10 credits (minimum charge)
- 3-minute song = 30 credits
- 10-minute podcast = 100 credits
- 30-minute file = 300 credits (maximum)
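The figures above can be approximated before uploading anything. This sketch assumes partial minutes are rounded up to the next whole minute, which is consistent with the listed examples but is not stated as the billing rule; treat the result as an estimate only.

```javascript
const CREDITS_PER_MINUTE = 10;
const MAX_BILLED_MINUTES = 30; // audio past 30 minutes is truncated

// Estimate the credits for a clip of the given length in seconds.
// Rounding up to whole minutes is an assumption, not a documented rule.
function estimateCredits(durationSeconds) {
  if (!(durationSeconds > 0)) throw new RangeError("durationSeconds must be positive");
  const billedMinutes = Math.min(Math.ceil(durationSeconds / 60), MAX_BILLED_MINUTES);
  return billedMinutes * CREDITS_PER_MINUTE;
}
```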
Check the requiredCredits field after preprocessing to see the exact cost before starting generation.
Best Practices
- Use high-quality source audio (192kbps+ MP3 or lossless formats)
- Choose appropriate voices - match the style (singing/speaking) and genre
- Test with shorter clips first to verify quality before processing long audio
- Keep files under 30 minutes - longer files will be automatically truncated
- Split long content into multiple shorter dubs for content over 30 minutes
- Use vocal separation (separate: true) unless your audio is already isolated vocals
- Adjust pitch if needed - some voices work better with slight pitch adjustments
Troubleshooting
Dub stuck in 'preprocessing' status
Preprocessing usually takes 30-300 seconds depending on the duration of the input audio. If it's been longer than 5 minutes, the dub may have encountered an error. Check the errorMessage field in the response.
Poor quality output
- Ensure source audio is high quality (192kbps+ for music)
- Try a different voice that matches your content style
- Adjust the pitchShift parameter (+/-2 semitones often helps)
- Make sure separate is true for mixed audio (music with vocals)
Generation taking too long
Generation time depends on audio length and current queue load. Peak times may have longer queues.
Unsupported audio format
Supported formats: MP3, WAV, M4A, FLAC, OGG, and most video formats (MP4, MOV, AVI). Audio is automatically extracted from video files.
Audio file was truncated
Files longer than 30 minutes are automatically limited during preprocessing. If you need to process longer content:
- Split your audio into 30-minute segments
- Create separate dubs for each segment
- Combine the results using audio editing software