Get started
Realtime Router

Your coding agent. Every model. No markup.

One API endpoint that routes every request to the right model. Hundreds of models from every major provider, one key, no markup. Slash-command switching and automatic fallbacks included.
# .env - drop-in replacement for any OpenAI-compatible tool
OPENAI_BASE_URL=https://api.inworld.ai/v1
OPENAI_API_KEY=your-inworld-api-key
OPENAI_MODEL=inworld/vibe-coding-by-task

# Works with: Cursor, Claude Code, Codex CLI, Aider,
# Continue, LangChain, Vercel AI SDK, and more.
#
# Use /code, /review, or /docs prefixes in your prompts
# to route to the best model for each task.
Works with
Cursor · Claude Code · Codex CLI · Aider · Continue · Windsurf

Built for how you actually code.

Whether you work with Cursor, Claude Code, or any other coding agent, the Realtime Router handles model selection, fallbacks, and cost optimization so you can focus on shipping.
All providers

One key. Every model.

OpenAI, Anthropic, Google, xAI, Mistral, DeepSeek, Meta, Groq, and more. One API key, one endpoint, one bill. Switch models, test challengers, and configure fallbacks without touching application code.
Router

Hundreds

of models available
Google · OpenAI · Anthropic · xAI · Mistral · DeepSeek · Meta · Groq · and more...
Drop-in replacement

Drop-in OpenAI SDK replacement.

One env var. No code changes, no wrapper library, no migration scripts. Every SDK call, system prompt, and tool you've already built keeps working exactly as before.
Read the quickstart
.env
# before
OPENAI_BASE_URL=https://api.openai.com/v1
OPENAI_API_KEY=sk-...

# after — 3 minutes
OPENAI_BASE_URL=https://api.inworld.ai/v1
OPENAI_API_KEY=your-inworld-api-key
OPENAI_MODEL=inworld/vibe-coding-by-task
Automatic fallbacks

Your workflow never stops.

If a model goes down or rate-limits you, traffic reroutes instantly to the next best option. Configure ordered provider fallbacks or let the router select automatically. Your coding session keeps going.
Fallbacks

Workflow never stops.

Primary model (rate-limited) → Fallback 1 (timeout) → Fallback 2 (200 OK)
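The router performs this failover server-side; purely for illustration, the same ordered-fallback idea can be sketched client-side. All names here are hypothetical, not part of any SDK:

```python
from typing import Callable, Sequence

def call_with_fallbacks(models: Sequence[str],
                        send: Callable[[str], str]) -> str:
    """Try each model in order and return the first successful response."""
    last_error: Exception | None = None
    for model in models:
        try:
            return send(model)
        except Exception as exc:  # rate limit, timeout, provider outage, ...
            last_error = exc
    raise RuntimeError(f"all models failed, last error: {last_error}")

# Simulated providers: the first two fail, the third answers.
def fake_send(model: str) -> str:
    if model == "primary":
        raise TimeoutError("rate-limited")
    if model == "fallback-1":
        raise TimeoutError("timeout")
    return "200 OK"

result = call_with_fallbacks(["primary", "fallback-1", "fallback-2"], fake_send)
print(result)  # 200 OK
```

With the router, the ordered list lives in your route configuration instead of your code, so the retry loop above disappears entirely.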
At-cost pricing

Pay-per-token, no markup.

Pay per token at exactly what providers charge. During research preview, there's no platform fee, no per-seat cost, no monthly minimum. Scale from one agent to a hundred without surprises.
Pricing

Provider rates. Nothing added.

0% platform markup · No per-seat fee · No monthly minimum
Research preview · Pricing may change at GA
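At-cost billing is simple arithmetic: tokens used times the provider's published rate, with nothing added. A minimal sketch, using hypothetical rates for illustration only:

```python
def token_cost_usd(input_tokens: int, output_tokens: int,
                   input_rate_per_m: float, output_rate_per_m: float) -> float:
    """At-cost billing: tokens times the provider's per-million rate, 0% markup."""
    return (input_tokens * input_rate_per_m +
            output_tokens * output_rate_per_m) / 1_000_000

# Hypothetical provider rates ($ per million tokens), for illustration only.
cost = token_cost_usd(120_000, 30_000,
                      input_rate_per_m=3.00, output_rate_per_m=15.00)
print(f"${cost:.2f}")  # $0.81
```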
Slash-command routing

Switch models mid-session.

Type /code to route to the best coding model, /review for analysis, /docs for writing. No slash command? Your request goes to the cost-efficient default. One model ID, every task type.
Task routing

Best model per task, inline.

/code → GPT 5.4 (best coding)
/review → Gemini 3.1 Pro Preview (analysis)
/docs → MiniMax 2.7 (writing)
default → Claude Sonnet 4.6 (efficient)

Works with your tools

Set up in minutes with any AI coding tool. Change one env var and everything else stays the same.
## Configure Cursor

1. Open Cursor Settings (Cmd+Shift+J or Ctrl+Shift+J)
2. Go to the "Models" section
3. Click "Add Model"
4. Set:
   - Model name: inworld/vibe-coding-by-task
   - API Base: https://api.inworld.ai/v1
   - API Key: your-inworld-api-key
5. Select "inworld/vibe-coding-by-task" as your default model

Alternatively, use the OpenAI override:

SDKs & Frameworks

Building an AI-powered app? Use Realtime Router with your favorite SDK or framework.
from openai import OpenAI

# Before:
# client = OpenAI(api_key="sk-...")

# After:
client = OpenAI(
    base_url="https://api.inworld.ai/v1",
    api_key="your-inworld-api-key",
)

# No prefix - routes to the cost-efficient default model
response = client.chat.completions.create(
    model="inworld/vibe-coding-by-task",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

# Use the /code prefix for the selected coding model:
stream = client.chat.completions.create(
    model="inworld/vibe-coding-by-task",
    messages=[{"role": "user", "content": "/code Implement a connection pool with retry logic"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

FAQ

What is the Realtime Router?
An OpenAI-compatible API endpoint that routes every LLM request to the right model automatically. Change one env var in your coding agent, then use slash commands like /code, /review, and /docs to switch models mid-session. Automatic fallbacks keep your workflow running if a provider goes down.

How do I set it up?
Set OPENAI_BASE_URL=https://api.inworld.ai/v1 and OPENAI_API_KEY=your-inworld-api-key in your environment. Most tools pick these up automatically. See the setup guides above for tool-specific instructions, or read the quickstart for step-by-step details.

How do slash commands work?
Prefix your prompt with /code, /review, or /docs. The router reads the prefix and routes to the model configured for that task type. /code routes to the selected coding model (GPT 5.4), /review to an analysis model (Gemini 3.1 Pro Preview), and /docs to a writing model (MiniMax 2.7). No prefix routes to the cost-efficient default (Claude Sonnet 4.6).

What do I have to change in my existing code?
Three env var changes: OPENAI_BASE_URL, OPENAI_API_KEY, and OPENAI_MODEL set to inworld/vibe-coding-by-task. Every existing SDK call, system prompt, and tool stays unchanged. See the quickstart for step-by-step instructions.

Which models are available?
Hundreds of models from Google, OpenAI, Anthropic, xAI, Mistral, DeepSeek, Meta, Groq, and more. See the full list at inworld.ai/models.

How much does it cost?
During research preview, all model access is at cost with no markup from Inworld. You pay exactly what the underlying providers charge. No per-seat fees, no monthly minimums. Pricing terms may change when the product exits research preview.

What happens when a provider goes down?
The router automatically fails over to the next provider in the chain. You can configure ordered fallback providers per route, or omit the provider prefix to let the router select and fall back automatically. Your coding session continues without interruption.

Start routing smarter today

Free tier included. No credit card required. Set up in 3 minutes.
Copyright © 2021-2026 Inworld AI
Realtime Router for Vibe Coding: Smart LLM Routing for AI-Assisted Development