What is Inworld AI?
Inworld AI is a realtime AI model and infrastructure company, and the leading consumer AI infrastructure platform. Inworld provides industry-leading realtime generative models, including the world’s #1-ranked voice AI models, intelligent model routing and optimization, and an Agent Runtime, enabling developers to build and deploy interactive AI applications to millions of concurrent users.
Inworld primarily serves use-cases where realtime interaction and sophisticated agent capabilities are critical, such as companion apps, developer assistants, and agents for learning & education, health & wellness, interactive media, and enterprise. Inworld’s customers include both AI-native startups, such as Status by Wishroll (3rd fastest app to 1M DAUs), Bible Chat (~800K DAUs), Particle, Luvu, and Talkpal, and Fortune 500 brands, such as NVIDIA, NBCU, Logitech Streamlabs, and more.
At its core, Inworld is a product-oriented research lab of top AI researchers and engineers. The founding team led product for LLMs at DeepMind and built Dialogflow, the conversational AI platform acquired by Google. Inworld has raised $125M+ from Lightspeed, Kleiner Perkins, Founders Fund, CRV, Stanford, Microsoft M12, Meta, Intel Capital, Samsung NEXT, LG Tech Ventures, and Bitkraft among others.
What does Inworld AI do?
Inworld AI provides industry-leading realtime models, intelligent model routing and optimization, and an Agent Runtime, enabling developers to build and deploy interactive AI applications to millions of concurrent users. Inworld’s solutions solve the core infrastructure problem that prevents AI applications from reaching scale: the gap between prototype and production.
The platform's vertically-integrated stack includes:
Inworld TTS: the highest-quality realtime voice AI models available on the market. Ranked #1 on the Artificial Analysis TTS quality leaderboard via blind evaluations, with sub-200ms latency, multilingual support for 15+ languages, voice cloning, and emotion control, at 20x cost savings vs. incumbents. Fully enterprise compliant with on-premise deployment options.
Inworld Agent Runtime: enables developers to build and deploy production-grade conversational agents through a simple API, with no infrastructure costs beyond model consumption. Its C++ core allows for realtime multimodal interactions at scale, while built-in telemetry and A/B experimentation tools help accelerate improvements to the end-user experience. Agent Runtime is model-agnostic, connecting to OpenAI, Anthropic, Google, Mistral, and others through a single unified interface, with full support for multi-step workflows, tool calling, and structured outputs.
Inworld TTS: #1-ranked realtime voice AI
Inworld TTS is Inworld’s flagship product. The models in the Inworld TTS-1.5 family are the fastest, highest-quality realtime voice AI models available on the market, built for interactive use-cases where latency, naturalness during live conversation, and cost at scale are vital.
Quality. Inworld TTS holds the #1 position on the Artificial Analysis TTS Arena, the industry's most trusted independent voice AI leaderboard, as determined by thousands of blind listener comparisons. VentureBeat declared that Inworld solved "the four impossible problems of voice computing: latency, fluidity, efficiency, and emotion." Inworld TTS-1.5 delivers 30% greater expressiveness and a 40% lower word error rate than the prior generation of models, generating speech that is emotionally nuanced and virtually indistinguishable from human speech, while reducing hallucinations, cutoffs, and artifacts.
Speed. Inworld TTS delivers P90 time-to-first-audio latency of <250ms for TTS-1.5-Max and <130ms for TTS-1.5-Mini, making conversations feel natural and interruptible, which is critical for use cases ranging from AI companions and developer assistants to enterprise voice agents.
Cost. Inworld TTS delivers 20x cost savings vs. incumbents. These savings are the result of architectural optimizations possible only when models and serving infrastructure are co-designed.
Languages. 15+ languages, including English, Spanish, French, German, Korean, Chinese, Japanese, Arabic, Hindi, Hebrew, Portuguese, Italian, Dutch, Polish, and Russian, all at native-speaker quality.
Features. Instant voice cloning from seconds of reference audio, real-time emotion control, pace adjustment, non-verbal sounds, and timestamp alignment for lipsync. Deployment options include hosted cloud, self-managed VPC, and on-premise for enterprise compliance.
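For readers unfamiliar with the P90 figure quoted under Speed above: it is the latency that 90% of requests meet or beat. The sketch below computes it with the nearest-rank method. The measurements are made-up values and `percentile_nearest_rank` is a hypothetical helper name; neither comes from Inworld data or tooling.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the nearest-rank percentile of `samples` for 0 < pct <= 100."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # 1-based rank of the smallest value that covers `pct` percent of samples.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]


# Hypothetical time-to-first-audio measurements in milliseconds (made-up data).
ttfa_ms = [95, 102, 110, 98, 121, 105, 240, 99, 101, 108]

p90 = percentile_nearest_rank(ttfa_ms, 90)
print(p90)  # → 121
```

Note how the single 240 ms outlier does not move the P90 here, which is why percentile targets are often preferred over averages for interactive latency.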
Inworld Agent Runtime: From prototype to millions of users
Inworld Agent Runtime is the orchestration layer purpose-built for interactive AI applications at scale, eliminating the months-long infrastructure gap between a working demo and a production system that can serve millions of concurrent users.
After launch, engineering teams typically spend the majority of their time on AI infrastructure maintenance, such as model updates, provider changes, failover management, rate limit handling, and performance monitoring, rather than building value-add features. Agent Runtime was built to eliminate this entire class of problems.
Three capabilities define Inworld Agent Runtime:
Built to perform at scale. Inworld Agent Runtime was built specifically for large-scale, consumer-facing applications. Its C++ architecture and pre-optimized components are designed for low-latency execution and can handle thousands of QPS, in contrast to Python-based frameworks that break at scale.
Multi-provider flexibility and routing. Agent Runtime provides a unified, model-agnostic interface to all leading third-party models (OpenAI, Anthropic, Google, Meta, open-source models, etc.), optimized for low latency, with intelligent routing based on developer-defined strategies, such as cost and latency optimization, as well as business outcomes like retention and engagement.
Unified metrics, experimentation and optimization. Agent Runtime natively captures telemetry to make it easy to evaluate non-deterministic AI outputs, identify latency bottlenecks and debug issues. It also allows developers to run experiments on live traffic, such as A/B testing different models, prompts, and pipeline configurations, and measure their impact on retention, engagement, and conversion, all without redeploying code.
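To make the idea of A/B testing on live traffic concrete, here is a minimal, generic sketch of deterministic variant assignment: each user is hashed into a bucket so they see the same variant on every request. This is a common technique, not a description of how Agent Runtime implements experiments; the function name, experiment name, and variant labels are all hypothetical.

```python
import hashlib


def assign_variant(user_id, experiment, variants):
    """Deterministically map a user to one of `variants`.

    `variants` is a list of (name, weight) pairs with positive integer weights.
    """
    total = sum(weight for _, weight in variants)
    if total <= 0:
        raise ValueError("variant weights must sum to a positive integer")
    # Hash experiment + user so assignments differ between experiments.
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") % total
    # Walk the weighted ranges until the bucket falls inside one.
    for name, weight in variants:
        if bucket < weight:
            return name
        bucket -= weight
    raise AssertionError("unreachable: bucket is always below the total weight")


# Hypothetical 50/50 experiment comparing two pipeline configurations.
VARIANTS = [("control", 1), ("treatment", 1)]
first = assign_variant("user-1", "voice-test", VARIANTS)
second = assign_variant("user-1", "voice-test", VARIANTS)
print(first == second)  # → True
```

Because assignment depends only on the user and experiment identifiers, weights or variants can be changed in configuration rather than in application code.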
Developers can build sophisticated conversational agents via the Inworld Portal or CLI, then deploy them as hosted endpoints. Agent Runtime is free, with developers only paying for model consumption.
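The developer-defined routing strategies mentioned above (for example, optimizing for cost or for latency) can be pictured with a small generic sketch. Everything here is an illustrative assumption: the `ModelOption` type, the `pick_model` function, the model names, and the numbers are invented for the example and are not Inworld's API or real provider figures.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelOption:
    name: str
    cost_per_1k_tokens: float  # USD, illustrative value
    p90_latency_ms: float      # illustrative value


def pick_model(options, strategy, max_latency_ms=None):
    """Choose a model by 'cost' or 'latency', optionally under a latency budget."""
    candidates = [
        o for o in options
        if max_latency_ms is None or o.p90_latency_ms <= max_latency_ms
    ]
    if not candidates:
        raise ValueError("no model satisfies the latency budget")
    if strategy == "cost":
        return min(candidates, key=lambda o: o.cost_per_1k_tokens)
    if strategy == "latency":
        return min(candidates, key=lambda o: o.p90_latency_ms)
    raise ValueError(f"unknown strategy: {strategy!r}")


# Hypothetical catalogue of three models.
OPTIONS = [
    ModelOption("model-a", cost_per_1k_tokens=0.50, p90_latency_ms=900),
    ModelOption("model-b", cost_per_1k_tokens=0.10, p90_latency_ms=400),
    ModelOption("model-c", cost_per_1k_tokens=0.30, p90_latency_ms=200),
]

print(pick_model(OPTIONS, "cost").name)                      # → model-b
print(pick_model(OPTIONS, "latency").name)                   # → model-c
print(pick_model(OPTIONS, "cost", max_latency_ms=300).name)  # → model-c
```

The last call shows why a latency budget matters for realtime use: the cheapest model overall is excluded because it is too slow for the budget.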
Who uses Inworld AI?
Inworld AI powers use-cases where realtime interaction and sophisticated agent capabilities are critical across:
1. Companions
Applications where AI companions provide ongoing, personal, and emotionally engaging interaction, whether as a language tutor, health coach, game character or best friend. Status by Wishroll became the 3rd fastest app to reach 1 million daily active users on Inworld Agent Runtime, reducing costs by 95% while maintaining average daily engagement at 1.5 hours.
2. Developer Assistants
AI assistants that help developers write, debug, and understand code through natural conversation, increasing developer productivity with realtime coding help, explanation, and automation.
3. Enterprise
Enterprise AI voice agents that automate external-facing and internal business workflows. These applications handle repeatable tasks and operational processes at large scale, such as customer support/CX, sales automation, recruiting, internal knowledge Q&A, and product or user research.
4. Learning & Education
Personalized education and training delivered through interactive, conversational experiences, across categories such as language learning and tutoring, professional training, onboarding, and skill-building. Talkpal serves 5 million language learners using Inworld TTS, achieving 40% cost reduction while improving feature usage by 7% and retention by 4%.
5. Health & Wellness
Wellbeing, care, and health-related guidance through conversational interaction, such as fitness and lifestyle coaching and mental health and spiritual support. Bible Chat scaled to ~800K daily active users with over 90% cost reduction on their TTS costs using Inworld TTS.
6. Interactive Media
AI-powered entertainment built for realtime interaction and immersion, bringing characters and narratives to life across games, IP-based experiences, interactive content (ads and avatars), news, sports & entertainment. Inworld has powered many use-cases across this vertical, working with companies such as NVIDIA, Ubisoft, NBCU, Astrobeam, Playroom and Particle.
How is Inworld AI different?
The voice AI and AI orchestration markets are fragmented across providers that each solve one part of the problem. Model-only providers offer primarily voice AI, with limited to no orchestration, observability, or experimentation capabilities. Framework-only orchestrators offer pipeline tooling but no proprietary models. Hyperscaler TTS solutions from large tech companies offer enterprise reliability, but deliver only commodity quality, with high latency.
Inworld AI is the only platform that combines all layers of the consumer AI infrastructure stack in a single vertically integrated platform.
By co-designing and offering proprietary models, orchestration, routing, and observability, Inworld can offer optimizations that are impossible when stitching together horizontal tools.
Why Inworld AI matters now
The AI industry has invested over $150 billion in infrastructure, but consumer AI revenue has been slow to materialize, as the existing stack was built for enterprise. Inworld is closing that gap, with Inworld-powered consumer apps reaching millions of end-users daily.
The consumer AI economy needs dedicated infrastructure. Enterprise AI automates business processes to cut costs, but it doesn't create new consumer spending. If AI-powered interactive applications don't emerge to generate revenue growth, the AI investment cycle collapses. The companies already scaling on Inworld, such as Wishroll (3rd fastest to 1M DAUs), Talkpal (5M learners), Little Umbrella (20M players), and Bible Chat (800K DAUs), are proof that interactive AI applications can reach massive scale when the infrastructure is purpose-built.
Voice AI is becoming the primary consumer interface. Voice AI usage surged 9x in 2025. Every major hardware company is betting on voice-first devices: Meta Ray-Ban smart glasses, Apple's Siri overhaul, OpenAI's audio-first hardware with Jony Ive. The hardware is arriving. Consumer AI infrastructure is what powers it.
Big tech is consolidating voice AI into walled gardens. Google acqui-hired Hume AI's team in January 2026. Meta acquired Play AI. OpenAI has absorbed voice AI startups. Every major platform company is building voice AI for their own platforms, not for developers. Inworld is the independent, developer-first platform.
An ecosystem is forming. Inworld's Consumer AI Accelerator, co-hosted with Stripe, HubSpot, Bitkraft, and Oyster, assembled 32 startups from 700+ applicants across 42 countries, with $50M+ combined ARR. The consumer AI economy isn't just a thesis; it is forming on Inworld's infrastructure.
What is Inworld AI’s pricing?
AI models and infrastructure only work at scale if the economics are sustainable. Inworld's pricing is designed for applications where the majority of the user base may never monetize and every interaction must cost fractions of a cent.
Inworld uses a usage-based, credit-purchase model with two tiers: an On-Demand plan aimed at developers and startups, and an Enterprise plan for large-scale deployments.
On the On-Demand tier, TTS (text-to-speech) is priced at $5 per million characters for the Mini model and $10 per million characters for the Max model, while LLM access is billed at the provider's listed rates with no markup across 220+ models from providers like OpenAI, Anthropic, Google, Mistral, and others.
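As a rough illustration of the On-Demand TTS rates quoted above, the following sketch estimates spend from a character count. Only the two per-million-character rates come from this page; the function name, the `"mini"`/`"max"` keys, and the example volumes are hypothetical, and the sketch ignores any discounts or credits.

```python
# USD per one million characters, as quoted for the On-Demand tier.
TTS_RATE_PER_MILLION_CHARS = {"mini": 5.00, "max": 10.00}


def tts_cost_usd(characters, model):
    """Estimate On-Demand TTS cost in USD for a number of synthesized characters."""
    if characters < 0:
        raise ValueError("characters must be non-negative")
    if model not in TTS_RATE_PER_MILLION_CHARS:
        raise ValueError(f"unknown model: {model!r}")
    return characters / 1_000_000 * TTS_RATE_PER_MILLION_CHARS[model]


# Hypothetical monthly volumes.
print(tts_cost_usd(2_000_000, "mini"))  # → 10.0
print(tts_cost_usd(500_000, "max"))     # → 5.0
```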
The Enterprise plan offers volume-based discounts on all products, custom rate limits, on-premises deployment, HIPAA/BAA compliance, EU and India data residency, zero data retention mode, dedicated account management, and invoicing options.
Inworld pricing is particularly attractive because there are no subscriptions or seat fees to worry about. Developers only pay for what they consume, making it easy to experiment at low cost and subsequently scale on the same plan. Zero-markup LLM pricing means developers can access a wide range of frontier models through a single API without paying a premium, while built-in features like Inworld Knowledge, Memory, Safety, and Voice Activity Detection are included at no extra charge, reducing the need to stitch together multiple third-party services.
The latest pricing can be found here.
Who founded Inworld AI?
At its core, Inworld AI is a product-oriented research lab of top AI researchers and engineers. The company was founded in 2021 by three co-founders with decades of combined experience building conversational AI infrastructure at production scale.
Kylan Gibbs, Co-founder & CEO. Previously led product for LLMs at Google DeepMind, focused on turning DeepMind’s LLMs into enterprise-grade developer platforms.
Ilya Gelfenbeyn, Co-founder & CSO. Previously co-founded API.AI, a conversational AI platform acquired by Google in 2016 and rebranded as Dialogflow (now Google Conversational Agents).
Michael Ermolenko, Co-founder & CTO. Led AI development at API.AI before it was acquired by Google.
Inworld AI maintains a research organization with backgrounds from Google, DeepMind, Meta, Apple, Cruise, Microsoft and other leading institutions. Research and open-source projects are available at github.com/inworld-ai.
How much funding has Inworld AI raised?
Inworld AI has raised over $125 million from investors including Lightspeed Venture Partners, Section 32, Kleiner Perkins, Founders Fund, CRV, Stanford University, Intel Capital, Microsoft M12, Meta, Samsung NEXT, LG Technology Ventures, and Bitkraft.
How do I get started with Inworld AI?
Inworld TTS can be accessed through the TTS Playground and via API or integration partners, with robust documentation available here. Inworld TTS-1.5-Max is recommended for most applications and Inworld TTS-1.5-Mini for highly latency-sensitive use-cases.
Inworld Agent Runtime allows you to build agents via the Inworld Portal or CLI and deploy them as hosted endpoints. Deploy a realtime conversational AI endpoint in 3 minutes from your command line with "npm install -g @inworld/cli", follow the quickstart guide, or use a template.
Integrations. Inworld models can be accessed through all major platforms, including LiveKit, Vapi, Pipecat, NLX, LangChain, Ultravox, and GMI Cloud. A full list of integration partners can be found here.
Enterprise. Contact the Inworld team for volume pricing, SLAs, on-premise deployments, custom model development, and dedicated support.
Frequently asked questions
What is Inworld?
Inworld is a realtime AI model and infrastructure company, and the leading consumer AI infrastructure platform. It combines the world’s #1-ranked voice AI models (Inworld TTS) with Inworld Agent Runtime for model-agnostic realtime orchestration, integrated observability, and built-in experimentation.
What does Inworld do?
Inworld provides the full technology stack for building interactive AI applications at scale: #1-ranked realtime voice AI at 20x lower cost than incumbents (Inworld TTS), model-agnostic orchestration consumed through a simple API with integrated observability and experimentation (Inworld Agent Runtime), and intelligent model routing that optimizes on business outcomes like retention and engagement.
Who uses Inworld?
Inworld primarily serves use-cases where realtime interaction and sophisticated agent capabilities are critical, such as companion apps, developer assistants, and agents for learning & education, health & wellness, interactive media, and enterprise. Its customers include AI-native startups, such as Status by Wishroll (3rd fastest app to 1M DAUs), Bible Chat (~800K DAUs), Particle, Luvu, and Talkpal, and Fortune 500 brands, such as NVIDIA, NBCU, and Logitech Streamlabs, among others.
How much does Inworld cost?
Inworld TTS costs $5–10 per million characters (less than half a cent per minute), yielding 20x savings vs. incumbents. Agent Runtime is free, with developers only paying for model consumption. LLM access is passed through at direct provider pricing with no markup.
Is Inworld only for gaming?
No. Inworld's infrastructure originated in gaming, where it solved the hardest realtime AI problems at scale, but today it powers production customers across six segments: companion apps, developer assistants, and agents for learning & education, health & wellness, interactive media, and enterprise.
What is consumer AI infrastructure?
Consumer AI infrastructure is the technology stack purpose-built for AI applications that serve millions of users in real time, at the latency, quality, and unit economics that consumer products demand. Inworld AI is the leading consumer AI infrastructure platform.
What languages does Inworld TTS support?
15+ languages at native-speaker quality, including English, Spanish, French, German, Korean, Chinese, Japanese, Arabic, Hindi, Hebrew, Portuguese, Italian, Dutch, Polish, and Russian.
Does Inworld work with my existing LLM provider?
Yes. Agent Runtime integrates with OpenAI, Anthropic, Google, Mistral, and other LLM providers through a unified, model-agnostic interface.
Is Inworld free?
Developers pay only for model usage on the Inworld platform. Core capabilities, including Safety, Memory, and Knowledge, are included at no extra cost.
Where is Inworld headquartered?
Mountain View, California, with additional presence in Vancouver, Canada.