Inworld TTS
Making state-of-the-art Voice AI radically accessible.
Note: This demo shows only a few of our most popular English voices. Get started to explore 50 voices in 11 languages—or to clone your own.
Available today in preview
Version | Inworld-TTS-1 | Inworld-TTS-1-max |
---|---|---|
Radically accessible pricing | $5/1M characters (roughly $0.25 per audio-hour) | $10/1M characters (roughly $0.50 per audio-hour) |
Power | ||
State-of-the-art quality (Word Error Rate and Similarity Scores) | ||
Real-time latency | Soon | |
Multilingual (support for 11 languages) | ||
Professional voice cloning (custom fine-tuning) | ||
Embedded safeguards | ||
SOC2 Type II | ||
On-Premise Deployments | ||
Open-Source training & modelling code | ||
Free zero-shot voice cloning | Experimental | Experimental |
Audio markups (prompt tags for emotion, style and non-verbals) | Experimental | Experimental |
Cross-lingual (language switching with same voice) | Experimental | Experimental |
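
The pricing row above can be turned into a quick budget estimate. The sketch below is illustrative only: the per-character prices come from the table, while the figure of about 50,000 characters per audio-hour is an assumption back-derived from the "$0.25 per audio-hour" approximation, and the function names are hypothetical rather than part of any Inworld SDK.

```python
# Prices per 1M characters, taken from the pricing row of the table above.
PRICE_PER_MILLION_CHARS = {
    "Inworld-TTS-1": 5.00,
    "Inworld-TTS-1-max": 10.00,
}

# Assumption: ~50,000 characters per audio-hour, the rate implied by
# "$5/1M characters is roughly $0.25 per audio-hour". Real speech rates vary
# with language, voice, and content.
CHARS_PER_AUDIO_HOUR = 50_000


def cost_for_characters(model: str, num_chars: int) -> float:
    """Return the estimated cost in USD for synthesizing num_chars characters."""
    return PRICE_PER_MILLION_CHARS[model] * num_chars / 1_000_000


def cost_per_audio_hour(model: str) -> float:
    """Return the estimated cost in USD for one hour of generated audio."""
    return cost_for_characters(model, CHARS_PER_AUDIO_HOUR)


if __name__ == "__main__":
    for model in PRICE_PER_MILLION_CHARS:
        print(f"{model}: ${cost_per_audio_hour(model):.2f} per audio-hour")
```

Because the estimate scales linearly with character count, a 300,000-character audiobook script would come to about $1.50 on Inworld-TTS-1 and $3.00 on Inworld-TTS-1-max under these assumptions.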
Multilingual voices
Emotions and non-verbal
Education
Unlock engaging learning with expressive voices for e-learning platforms, language apps, and educational creators needing clear pronunciation and motivating narration.
Entertainment
Create immersive characters with emotionally dynamic voices for game developers, streaming platforms, and content creators bringing fictional worlds to life.
Content & Media
Deliver professional-grade narration with natural pacing for publishers, news organizations, and podcasters needing versatile, broadcast-quality human-like voices.
Voice assistant
Build trusted conversational experiences with warm, helpful voices for app developers, customer service platforms, and smart devices requiring empathetic interactions.
Supported Audio Formats
Format | Sample rate | Bitrate |
---|---|---|
MP3 | 16kHz - 48kHz | 32kbps - 320kbps |
PCM (PCM16) | 8kHz - 48kHz | – |
μ-law / A-law | 8kHz | – |
Opus | 8kHz - 48kHz | 6kbps - 256kbps |
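
The ranges in the table above can be checked client-side before requesting audio. This is a minimal sketch, not the Inworld API: the format keys (`mp3`, `pcm`, `mulaw`, `alaw`, `opus`) and the `validate_audio_config` helper are hypothetical names chosen for illustration, and it assumes every value inside a listed range is accepted, which the table does not confirm.

```python
from typing import Dict, List, Optional, Tuple

Range = Tuple[int, int]

# (min, max) sample rate in Hz and bitrate in bits per second, copied from the
# table above. A bitrate of None means the table lists no configurable bitrate.
# The dictionary keys are hypothetical labels, not official API identifiers.
SUPPORTED_FORMATS: Dict[str, Dict[str, Optional[Range]]] = {
    "mp3": {"sample_rate": (16_000, 48_000), "bitrate": (32_000, 320_000)},
    "pcm": {"sample_rate": (8_000, 48_000), "bitrate": None},
    "mulaw": {"sample_rate": (8_000, 8_000), "bitrate": None},
    "alaw": {"sample_rate": (8_000, 8_000), "bitrate": None},
    "opus": {"sample_rate": (8_000, 48_000), "bitrate": (6_000, 256_000)},
}


def validate_audio_config(
    fmt: str, sample_rate_hz: int, bitrate_bps: Optional[int] = None
) -> List[str]:
    """Return a list of problems; an empty list means the config fits the table."""
    spec = SUPPORTED_FORMATS.get(fmt.lower())
    if spec is None:
        return [f"unsupported format: {fmt}"]

    problems: List[str] = []

    low, high = spec["sample_rate"]
    if not low <= sample_rate_hz <= high:
        problems.append(
            f"sample rate {sample_rate_hz} Hz outside {low}-{high} Hz for {fmt}"
        )

    bitrate_range = spec["bitrate"]
    if bitrate_range is None:
        # The table gives no bitrate range for this format.
        if bitrate_bps is not None:
            problems.append(f"bitrate is not configurable for {fmt}")
    elif bitrate_bps is not None:
        low, high = bitrate_range
        if not low <= bitrate_bps <= high:
            problems.append(
                f"bitrate {bitrate_bps} bps outside {low}-{high} bps for {fmt}"
            )

    return problems


if __name__ == "__main__":
    print(validate_audio_config("mp3", 44_100, 128_000))   # fits the table
    print(validate_audio_config("opus", 48_000, 512_000))  # bitrate too high
```

Returning a list of problems rather than raising on the first one lets a caller report every out-of-range setting at once.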