Inworld TTS
Making state-of-the-art Voice AI radically accessible.
Pricing: $5/million characters
Available today in preview
Radically accessible pricing | $5 / 1M chars |
State-of-the-art quality (Word Error Rate and Similarity Scores) | |
Real-time latency | |
Multilingual | |
Professional voice cloning (custom fine-tuning) | |
Embedded safeguards | |
SOC2 Type II | |
On-Premise Deployments | |
Open-Source training & modeling code | |
Larger (Max) model for use cases requiring ultra-realism | Experimental |
Free zero-shot voice cloning | Experimental |
Audio markups (prompt tags for emotion, style and non-verbals) | Experimental |
Cross-lingual (language switching with same voice) | Experimental |
Multilingual voices
Now also available for Simplified Chinese (Mandarin), Korean, and Japanese
Emotions and non-verbal
Education
Unlock engaging learning with expressive voices for e-learning platforms, language apps, and educational creators needing clear pronunciation and motivating narration.
Alex
Julia
Edward
Entertainment
Create immersive characters with emotionally dynamic voices for game developers, streaming platforms, and content creators bringing fictional worlds to life.
Hades
Sarah
Theodore
Content & Media
Deliver professional-grade narration with natural pacing for publishers, news organizations, and podcasters needing versatile, broadcast-quality human-like voices.
Ashley
Deborah
Mark
Voice assistant
Build trusted conversational experiences with warm, helpful voices for app developers, customer service platforms, and smart devices requiring empathetic interactions.
Shaun
Timothy
Dennis
TTS Pricing
$5/million characters
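As a rough illustration of what the per-character rate means in practice, here is a minimal cost sketch. It assumes that every character of the input, including spaces and punctuation, is billed at the flat list price; the page does not say how characters are counted, so treat that as an assumption. The function name `estimate_cost_usd` is hypothetical and not part of any Inworld API.

```python
# Flat list price quoted on this page: $5 per 1,000,000 characters.
PRICE_USD_PER_MILLION_CHARS = 5.0


def estimate_cost_usd(num_chars: int) -> float:
    """Estimate the synthesis cost in USD for a given character count.

    Assumption: every character is billed at the same flat rate.
    """
    if num_chars < 0:
        raise ValueError("num_chars must be non-negative")
    return num_chars * PRICE_USD_PER_MILLION_CHARS / 1_000_000


if __name__ == "__main__":
    script = "Welcome back! Ready for today's lesson?"
    print(f"{len(script)} chars -> ${estimate_cost_usd(len(script)):.6f}")
    print(f"200,000 chars -> ${estimate_cost_usd(200_000):.2f}")
```

Under these assumptions, a 200,000-character script comes to about $1.00.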
Supported audio formats & quality
Format | Sample rate | Bit depth | Bit rate |
---|---|---|---|
MP3 | 16–48 kHz | – | 32–320 kbps |
PCM | 8–48 kHz | – | – |
μ-law / A-law | 8 kHz | – | – |
Opus | 8–48 kHz | – | 6–256 kbps |
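To get a feel for how the bit-rate ranges above translate into output size, the sketch below estimates the payload size of a clip at a constant bit rate (size in bytes = bit rate × duration ÷ 8). It ignores container and header overhead and assumes the encoder holds the requested rate exactly, so real files will differ somewhat. The helper name `estimate_size_bytes` and the lowercase format keys are hypothetical; only the numeric ranges come from the table.

```python
# Bit-rate ranges in kbps, copied from the table above.
# PCM and mu-law/A-law list no bit rate there, so they are not covered here.
BITRATE_RANGE_KBPS = {
    "mp3": (32, 320),
    "opus": (6, 256),
}


def estimate_size_bytes(fmt: str, bitrate_kbps: int, duration_s: float) -> int:
    """Approximate encoded payload size for a constant-bit-rate clip."""
    if fmt not in BITRATE_RANGE_KBPS:
        raise ValueError(f"no bit-rate range listed for format {fmt!r}")
    low, high = BITRATE_RANGE_KBPS[fmt]
    if not low <= bitrate_kbps <= high:
        raise ValueError(
            f"{fmt} bit rate must be between {low} and {high} kbps"
        )
    if duration_s < 0:
        raise ValueError("duration_s must be non-negative")
    # 1 kbps = 1000 bits per second; 8 bits per byte.
    return int(bitrate_kbps * 1000 * duration_s / 8)


if __name__ == "__main__":
    one_minute = 60.0
    for fmt, kbps in [("mp3", 128), ("opus", 32)]:
        size = estimate_size_bytes(fmt, kbps, one_minute)
        print(f"{fmt} @ {kbps} kbps, 60 s: ~{size / 1000:.0f} kB")
```

A lower Opus bit rate can be a reasonable choice for voice-assistant traffic where bandwidth matters, while higher MP3 bit rates suit narration intended for download; which setting sounds acceptable is something to verify by listening.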