Cross-lingual voice cloning trained to preserve timbre, pacing, and character across every language your release covers. One performance, 15 languages, every market, one voice.
Studios and consumer-media platforms dub on Inworld every day.
Consumer-media platforms run over a million dubbing minutes a month on Inworld across Spanish, French, German, and Portuguese.
Dubbing at studio and consumer scale
Consumer-media platforms: 1M+ dubbing minutes a month · ES/FR/DE/PT
Studio partners: Custom tools for dubbing, translation, news
Developer platforms: Video dubbing integration
Ad-tech + localization: Personalized voiceover at scale
Cross-lingual voice cloning
Clone once. Dub into every language on your release schedule.
The performer's voice is part of the brand. Cross-lingual cloning preserves timbre and pacing across every supported language. No re-casting.
Cross-lingual cloning
Original: EN · Narrator A
Dubbed: ES / FR / DE / JA · same voice
One cloned performance, every target language, no re-casting.
15 languages out of the gate
Every major release market in a single API.
TTS 1.5 covers 15 languages, from English and Spanish to Japanese, Korean, Mandarin, Hindi, and Arabic. Cross-lingual cloning works across every pair.
TTS 1.5 · language coverage
15 languages, one cloned voice
TTS 2.0 expands this further: steering, conversationality, and more languages on the roadmap.
Ship every market the same day
One script in, every target language out, in parallel.
You don't wait on Spanish to start French. The whole release dubs at once, so your launch is gated by the slowest single language rather than by the sum of all of them.
Scripted pipeline, parallel output
non-streaming endpoint
01 Source script: EN master
02 Clone once: 5-15s reference audio
03 Dub in parallel: 15 target languages
04 Deliver: MP3 · WAV · 48kHz
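The parallel step in the pipeline above can be sketched as a simple fan-out: one script, one cloned voice, one request per target language, all in flight at once. This is a minimal sketch, not the Inworld SDK. `dub_script`, `synthesize`, `fake_synthesize`, the voice ID, and the language codes are hypothetical names introduced for illustration; swap in whatever non-streaming request your integration actually makes.

```python
from concurrent.futures import ThreadPoolExecutor

# Illustrative target codes only; the page lists 15 supported languages.
TARGET_LANGUAGES = ["es", "fr", "de", "ja", "pt"]


def dub_script(script, voice_id, languages, synthesize, max_workers=8):
    """Dub one source script into every target language concurrently.

    `synthesize(script, voice_id, language)` is a hypothetical callable that
    wraps your own translate-and-synthesize request. It is an assumption,
    not a documented Inworld function.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit every language up front so no language waits on another.
        futures = {
            lang: pool.submit(synthesize, script, voice_id, lang)
            for lang in languages
        }
        # .result() blocks until that language is done and re-raises errors.
        return {lang: future.result() for lang, future in futures.items()}


def fake_synthesize(script, voice_id, language):
    """Offline stand-in for a real request so the sketch runs as written."""
    return f"[{language}:{voice_id}] {script}".encode("utf-8")


dubs = dub_script("Welcome back.", "narrator-a", TARGET_LANGUAGES, fake_synthesize)
```

Because every request is submitted before any result is collected, total wall-clock time tracks the slowest language rather than the sum of all of them.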
Dub that sounds like acting
The performance survives the translation.
Most dubs sound like someone reading the subtitles. The Spanish take keeps the same pause and half-laugh as the English original, because the performance transfers, not just the words.
One voice, every target language
EN · source
ES · dubbed
FR · dubbed
DE · dubbed
JA · dubbed
PT · dubbed
Same timbre, same pacing, same performer. One clone preserves the voice across every dub.
Unreleased scripts stay in the building
Dub on-prem, so pre-release IP never leaves your cluster.
Deploy in your own infrastructure and keep every frame behind your firewall. SOC 2 Type II, GDPR, zero retention by default.
Pre-release content stays in the studio
On-premise deployment: H100 / A100 / H200 / B200 / B300
Studio-specific custom models: Trained on your tonality
Zero data retention: Pre-release content never leaves
SOC 2 Type II · GDPR: Certified
Run the whole dub pipeline inside your own cluster. Unreleased scripts never leave the lot.
FAQ
15 languages in TTS 1.5: English, Spanish, French, German, Italian, Portuguese, Japanese, Korean, Mandarin, Hindi, Arabic, Polish, Dutch, Russian, Hebrew. TTS 2.0 expands this; see TTS 2.0 launch intel for the roadmap.
Instant cloning with 5-15 seconds of reference audio. Professional cloning with 30+ minutes for higher fidelity. Cross-lingual cloning preserves voice identity across every supported language: clone in English, then dub in Spanish, French, or German with the same voice.
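As a rough pre-flight check, you can measure a reference recording before you submit it and see which of the two stated tiers it fits. The sketch below uses only Python's standard `wave` module. The function names and tier labels are hypothetical, and treating everything between 15 seconds and 30 minutes as fitting neither tier is an assumption drawn from the figures above, not a documented rule.

```python
import io
import wave

INSTANT_MIN_S = 5.0           # instant cloning: 5-15 seconds of reference audio
INSTANT_MAX_S = 15.0
PROFESSIONAL_MIN_S = 30 * 60  # professional cloning: 30+ minutes


def wav_duration_seconds(wav_file):
    """Return the duration of a WAV file given a path or file-like object."""
    with wave.open(wav_file, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def cloning_tier(duration_s):
    """Return the tier a duration fits, or None if it fits neither range."""
    if INSTANT_MIN_S <= duration_s <= INSTANT_MAX_S:
        return "instant"
    if duration_s >= PROFESSIONAL_MIN_S:
        return "professional"
    return None


def make_silent_wav(seconds, sample_rate=8000):
    """Build an in-memory mono 16-bit silent WAV, used only to try the check."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * int(seconds * sample_rate))
    buffer.seek(0)
    return buffer


reference_tier = cloning_tier(wav_duration_seconds(make_silent_wav(10)))
```

In practice you would pass the path of the performer's reference recording to `wav_duration_seconds` instead of the generated silent clip.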
Yes. Professional Voice Clones (PVC) are available for studio-scale workflows with licensing coordination, dedicated training, and higher-fidelity models. Contact sales for PVC scoping.
Cloning requires rights to the source audio, and rights clearance sits with you. The tooling and infrastructure are ours; the licensing is yours. Cloned voices belong to your account under the Inworld terms of service and can be used commercially within that scope.
Dubbing is typically asynchronous and runs on the non-streaming TTS endpoint. The Realtime API is not the dubbing path; it is the live avatar path. For live speech-to-speech translation use cases, the Realtime API plus a translation-layer LLM gives you a near-realtime dubbed response.
Yes. On-premise deployment runs on H100, A100, H200, B200, and B300 GPUs for pre-release content protection. Studio-specific custom training, zero data retention on TTS by default, SOC 2 Type II, GDPR. Pre-release IP never leaves your infrastructure.
TTS 2.0 adds natural-language steering, conversationality with disfluencies, singing, denoise, and more. That's where 'dub that sounds like acting' comes from: performance-level output rather than translated reading. TTS 2.0 is in active development; contact sales for early access.
MP3, WAV, PCM, LINEAR16, OGG_OPUS, μ-law, A-law, FLAC. Sample rates 8-48kHz. 48kHz recommended for broadcast and streaming dubbing pipelines.
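If you want to catch a bad delivery setting before a batch runs, a small local check against the formats and sample-rate range listed above is enough. This is a sketch under assumptions: the format strings are copied from the list as written and may not match the identifiers the API expects, the check treats 8-48kHz as a continuous range, and `check_output_settings` is a hypothetical helper.

```python
# Format names as written on this page; API identifiers may differ.
SUPPORTED_FORMATS = {
    "MP3", "WAV", "PCM", "LINEAR16", "OGG_OPUS", "μ-law", "A-law", "FLAC",
}
MIN_SAMPLE_RATE_HZ = 8_000
MAX_SAMPLE_RATE_HZ = 48_000
RECOMMENDED_BROADCAST_HZ = 48_000  # recommended for broadcast and streaming


def check_output_settings(audio_format, sample_rate_hz):
    """Return a list of problems; an empty list means the settings fit."""
    problems = []
    if audio_format not in SUPPORTED_FORMATS:
        problems.append(f"unsupported format: {audio_format}")
    # Assumption: any rate inside the stated range is acceptable.
    if not MIN_SAMPLE_RATE_HZ <= sample_rate_hz <= MAX_SAMPLE_RATE_HZ:
        problems.append(f"sample rate out of range: {sample_rate_hz} Hz")
    return problems


broadcast_problems = check_output_settings("WAV", RECOMMENDED_BROADCAST_HZ)
```

Returning a list rather than raising lets a batch job report every bad setting in one pass.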
Dub without re-casting.
Cross-lingual voice cloning across 15 languages. Studio-grade on-prem. Performance that sounds like acting.