
How AstroBeam built the world's first voice-only VR game

AstroBeam
4.6★ Meta Quest rating
5+ hrs longest single session
100% voice-only gameplay

At a glance

AstroBeam is on a mission to make voice a first-class input in games. Their debut title, Stellar Cafe, is available on Meta Quest and is fully playable using only your voice. No controllers, no buttons. To pull this off, they needed NPC speech that felt genuinely conversational: fast, natural, and affordable enough to ship in a one-time purchase game. AstroBeam chose Inworld TTS as the only solution that met every constraint.

When we adopted Inworld TTS it was a game changer. Players immediately began mentioning how magical it was talking to our NPCs.

Devin Reimer, Founder & CEO, AstroBeam

Stellar Cafe trailer

The problem

Voice games demand TTS that feels alive, not transactional

Previous attempts at voice in games relied on keyword matching. Say the magic word, trigger the response. AstroBeam wanted something fundamentally different: players holding real conversations with NPCs, using context and natural speech rather than scripted triggers. That meant the game had to answer in kind, with natural speech of its own. A solid TTS solution became foundational to the entire design.

Building a voice-driven game also presented constraints that most AI applications do not face. Unlike chatbots, where the user prompts the AI and waits for a response, games require conversational speed. Latency is critical: it is the difference between something feeling magical and something feeling like a prompt.

Lastly, AstroBeam needed a TTS solution with high enough quality to sustain hours of continuous natural speech, and a cost model compatible with a one-time paid game, not a subscription. Every TTS provider they evaluated failed on at least one dimension: too slow, too robotic, or too expensive to make the economics work.

The solution

The only TTS that met every constraint

Inworld TTS was the only solution that met all three criteria at once: conversational-speed latency, voice quality high enough for hours of natural play, and a cost structure that made sense for a paid game.

Integration was straightforward. Inworld's API was flexible enough to handle the unique challenges of real-time VR gameplay: word-timing interruptions, streaming audio responses, and the continuous low-latency demands of live interaction.
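To illustrate why word-level timing matters when a player interrupts an NPC, here is a minimal sketch. It is not AstroBeam's code and does not use Inworld's API: the `WordTiming` structure and the `split_at_interruption` helper are hypothetical names, and the sketch assumes the TTS response provides a start and end time for each spoken word.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WordTiming:
    """Hypothetical per-word timing, in seconds from the start of the audio."""
    word: str
    start_s: float
    end_s: float


def split_at_interruption(timings, interrupted_at_s):
    """Split an utterance into the words the player fully heard and the rest.

    A word counts as heard only if it finished playing before the
    interruption; a word cut off midway is treated as unheard.
    """
    heard = [t.word for t in timings if t.end_s <= interrupted_at_s]
    unheard = [t.word for t in timings if t.end_s > interrupted_at_s]
    return " ".join(heard), " ".join(unheard)


# Example: the player starts talking 0.9 seconds into the NPC's line.
line = [
    WordTiming("Welcome", 0.0, 0.4),
    WordTiming("to", 0.4, 0.55),
    WordTiming("Stellar", 0.55, 1.0),
    WordTiming("Cafe", 1.0, 1.4),
]
heard, unheard = split_at_interruption(line, interrupted_at_s=0.9)
print(heard)    # Welcome to
print(unheard)  # Stellar Cafe
```

Under these assumptions, a game could record only the heard portion in the conversation history, so the NPC does not later act as if the player had listened to words that never finished playing.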

Results

When AstroBeam started development two years ago, the top feedback from playtests was that the voices were not good enough. Some users were skeptical that voice-driven gameplay could ever feel good. With Inworld TTS, that skepticism disappeared.

4.6★ Quest Store rating

5+ hrs longest single session

In particular, session length exceeded every pre-launch expectation. Players regularly clock multi-hour sessions in VR, with one user logging over 5.5 hours in a single sitting. That is remarkable for any game, let alone one played entirely through speech.

What’s next

The tip of the iceberg

The team is now exploring bringing Stellar Cafe to new platforms, continuing R&D on voice as an input, and developing their next game. They see this as only the tip of the iceberg.

Copyright © 2021-2026 Inworld AI