State-of-the-art voice AI at a radically accessible price. Instant voice cloning, rich multilingual support, real-time streaming, and emotion plus non-verbal control, all for just $5 per million characters.
At $5 per million characters, pricing is just 5% of competitors’ rates, with no compromise on quality.
Multilingual
Multiple languages including English, Spanish, French, Korean, and Chinese, all with native-speaker quality.
State-of-the-art quality
Launched at #1 on Hugging Face TTS Arena with clearer speech, lower word error rate (WER), and higher speaker similarity (SIM) than leading systems.
Blazingly fast
Sub-250 ms latency optimized for real-time conversational AI with streaming support.
Voice cloning
Create custom voices instantly from 2–15 seconds of audio, or fine-tune a professionally cloned voice.
Voice tags
Add emotion, delivery style, and non-verbal sounds to make speech more expressive and natural.
Full breakdown
Version: Inworld-TTS-1 | Inworld-TTS-1-max
Radically accessible pricing: $5/1M characters (≈ $0.25 per audio-hour) | $10/1M characters (≈ $0.50 per audio-hour)
Power
State-of-the-art quality
(WER & similarity)
Real-time latency
Soon
Multilingual
Free zero-shot voice cloning *
Professional voice cloning
(custom fine-tuning)
Audio markups *
(emotion/style/non-verbals)
Timestamp alignment
Custom pronunciation
Embedded safeguards
SOC2 Type II
GDPR
On-Premise deployments
Open-source training & modeling code
Cross-lingual
(same voice, language switch)
* Experimental feature
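The per-audio-hour figures in the breakdown follow from the per-character prices. As a rough sketch, the arithmetic can be reproduced as below. The per-million-character prices come from the breakdown; the rate of about 50,000 characters per hour of audio is an assumption inferred from "$5/1M characters ≈ $0.25 per audio-hour", and the function and constant names are illustrative only, not part of any Inworld API.

```python
# Per-million-character prices (USD), taken from the pricing breakdown.
PRICE_PER_MILLION_CHARS = {
    "Inworld-TTS-1": 5.00,
    "Inworld-TTS-1-max": 10.00,
}

# Assumption: about 50,000 characters per hour of audio, the rate implied by
# "$5/1M characters ≈ $0.25 per audio-hour". Real speech rates vary by
# language, voice, and content.
CHARS_PER_AUDIO_HOUR = 50_000


def cost_for_characters(model: str, num_chars: int) -> float:
    """Return the estimated cost in USD for synthesizing num_chars characters."""
    return PRICE_PER_MILLION_CHARS[model] * num_chars / 1_000_000


def cost_per_audio_hour(model: str, chars_per_hour: int = CHARS_PER_AUDIO_HOUR) -> float:
    """Return the estimated cost in USD for one hour of generated audio."""
    return cost_for_characters(model, chars_per_hour)


for name in PRICE_PER_MILLION_CHARS:
    print(f"{name}: ${cost_per_audio_hour(name):.2f} per audio-hour")
```

If your content is denser or sparser than the assumed rate, pass your own `chars_per_hour` value to get an estimate that better matches your workload.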
Research
Cutting-edge research
Publications
Explore our latest research advancing the state of the art in speech synthesis, voice cloning, and real-time TTS.
Training code available
Open source
We’ve open-sourced the full training framework behind Inworld TTS-1, everything from codec to SpeechLM fine-tuning, so you can build your own high-quality TTS models faster.
Integrations
LiveKit
Real-time web and mobile voice AI with low latency and streaming.
Try Inworld TTS now
Test out zero-shot voice cloning, audio markups, and much more in our TTS Playground.