Get started
Realtime Router

Your coding agent. Every model. No markup.

One API endpoint that routes every request to the right model. Hundreds of models from every major provider, one key, no markup. Slash-command switching and automatic fallbacks included.
# .env - drop-in replacement for any OpenAI-compatible tool
OPENAI_BASE_URL=https://api.inworld.ai/v1
OPENAI_API_KEY=your-inworld-api-key
OPENAI_MODEL=inworld/vibe-coding-by-task

# Works with: Cursor, Claude Code, Codex CLI, Aider,
# Continue, LangChain, Vercel AI SDK, and more.
#
# Use /code, /review, or /docs prefixes in your prompts
# to route to the best model for each task.
Works with
Cursor · Claude Code · Codex CLI · Aider · Continue · Windsurf

Built for how you actually code.

Whether you work with Cursor, Claude Code, or any other coding agent, the Realtime Router handles model selection, fallbacks, and cost optimization so you can focus on shipping.
All providers

One key. Every model.

OpenAI, Anthropic, Google, xAI, Mistral, DeepSeek, Meta, Groq, and more. One API key, one endpoint, one bill. Switch models, test challengers, and configure fallbacks without touching application code.
Router

Hundreds

of models available
Google · OpenAI · Anthropic · xAI · Mistral · DeepSeek · Meta · Groq · and more...
Drop-in replacement

Drop-in OpenAI SDK replacement.

One env var. No code changes, no wrapper library, no migration scripts. Every SDK call, system prompt, and tool you've already built keeps working exactly as before.
Read the quickstart
.env
# before
OPENAI_BASE_URL=https://api.openai.com/v1
OPENAI_API_KEY=sk-...

# after — 3 minutes
OPENAI_BASE_URL=https://api.inworld.ai/v1
OPENAI_API_KEY=your-inworld-api-key
OPENAI_MODEL=inworld/vibe-coding-by-task
Automatic fallbacks

Your workflow never stops.

If a model goes down or rate-limits you, traffic reroutes instantly to the next best option. Configure ordered provider fallbacks or let the router select automatically. Your coding session keeps going.
Fallbacks

Workflow never stops.

Primary model (rate-limited) → Fallback 1 (timeout) → Fallback 2 (200 OK)
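The router performs this failover server-side; purely for illustration, the same ordered-fallback idea can be sketched client-side. All names here are hypothetical, not part of any SDK:

```python
from typing import Callable, Sequence

def call_with_fallbacks(models: Sequence[str],
                        send: Callable[[str], str]) -> str:
    """Try each model in order and return the first successful response."""
    last_error: Exception | None = None
    for model in models:
        try:
            return send(model)
        except Exception as exc:  # rate limit, timeout, provider outage, ...
            last_error = exc
    raise RuntimeError(f"all models failed, last error: {last_error}")

# Simulated providers: the first two fail, the third answers.
def fake_send(model: str) -> str:
    if model == "primary":
        raise TimeoutError("rate-limited")
    if model == "fallback-1":
        raise TimeoutError("timeout")
    return "200 OK"

result = call_with_fallbacks(["primary", "fallback-1", "fallback-2"], fake_send)
print(result)  # 200 OK
```

With the router, the ordered list lives in your route configuration instead of your code, so the retry loop above disappears entirely.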
At-cost pricing

Pay-per-token, no markup.

Pay per token at exactly what providers charge. During research preview, there's no platform fee, no per-seat cost, no monthly minimum. Scale from one agent to a hundred without surprises.
Pricing

Provider rates. Nothing added.

0% platform markup · No per-seat fee · No monthly minimum
Research preview · Pricing may change at GA
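At-cost billing is simple arithmetic: tokens used times the provider's published rate, with nothing added. A minimal sketch, using hypothetical rates for illustration only:

```python
def token_cost_usd(input_tokens: int, output_tokens: int,
                   input_rate_per_m: float, output_rate_per_m: float) -> float:
    """At-cost billing: tokens times the provider's per-million rate, 0% markup."""
    return (input_tokens * input_rate_per_m +
            output_tokens * output_rate_per_m) / 1_000_000

# Hypothetical provider rates ($ per million tokens), for illustration only.
cost = token_cost_usd(120_000, 30_000,
                      input_rate_per_m=3.00, output_rate_per_m=15.00)
print(f"${cost:.2f}")  # $0.81
```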
Slash-command routing

Switch models mid-session.

Type /code to route to the best coding model, /review for analysis, /docs for writing. No slash command? Your request goes to the cost-efficient default. One model ID, every task type.
Task routing

Best model per task, inline.

/code → GPT 5.4 (best coding)
/review → Gemini 3.1 Pro Preview (analysis)
/docs → MiniMax 2.7 (writing)
default → Claude Sonnet 4.6 (efficient)

Works with your tools

Set up in minutes with any AI coding tool. Change one env var and everything else stays the same.
## Configure Cursor

1. Open Cursor Settings (Cmd+Shift+J or Ctrl+Shift+J)
2. Go to the "Models" section
3. Click "Add Model"
4. Set:
   - Model name: inworld/vibe-coding-by-task
   - API Base: https://api.inworld.ai/v1
   - API Key: your-inworld-api-key
5. Select "inworld/vibe-coding-by-task" as your default model

Alternatively, use the OpenAI override:

SDKs & Frameworks

Building an AI-powered app? Use Realtime Router with your favorite SDK or framework.
from openai import OpenAI

# Before:
# client = OpenAI(api_key="sk-...")

# After:
client = OpenAI(
    base_url="https://api.inworld.ai/v1",
    api_key="your-inworld-api-key",
)

# No prefix - routes to the cost-efficient default model
response = client.chat.completions.create(
    model="inworld/vibe-coding-by-task",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

# Use the /code prefix for the selected coding model:
stream = client.chat.completions.create(
    model="inworld/vibe-coding-by-task",
    messages=[{"role": "user", "content": "/code Implement a connection pool with retry logic"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

FAQ

What is the Realtime Router?
An OpenAI-compatible API endpoint that routes every LLM request to the right model automatically. Change one env var in your coding agent, then use slash commands like /code, /review, and /docs to switch models mid-session. Automatic fallbacks keep your workflow running if a provider goes down.

How do I set it up?
Set OPENAI_BASE_URL=https://api.inworld.ai/v1 and OPENAI_API_KEY=your-inworld-api-key in your environment. Most tools pick these up automatically. See the setup guides above for tool-specific instructions, or read the quickstart for step-by-step details.

How do slash commands work?
Prefix your prompt with /code, /review, or /docs. The router reads the prefix and routes to the model configured for that task type. /code routes to the selected coding model (GPT 5.4), /review to an analysis model (Gemini 3.1 Pro Preview), and /docs to a writing model (MiniMax 2.7). No prefix routes to the cost-efficient default (Claude Sonnet 4.6).

What do I have to change in my existing code?
Three env var changes: OPENAI_BASE_URL, OPENAI_API_KEY, and OPENAI_MODEL set to inworld/vibe-coding-by-task. Every existing SDK call, system prompt, and tool stays unchanged. See the quickstart for step-by-step instructions.

Which models are available?
Hundreds of models from Google, OpenAI, Anthropic, xAI, Mistral, DeepSeek, Meta, Groq, and more. See the full list at inworld.ai/models.

How much does it cost?
During research preview, all model access is at cost with no markup from Inworld. You pay exactly what the underlying providers charge. No per-seat fees, no monthly minimums. Pricing terms may change when the product exits research preview.

What happens when a provider goes down?
The router automatically fails over to the next provider in the chain. You can configure ordered fallback providers per route, or omit the provider prefix to let the router select and fall back automatically. Your coding session continues without interruption.

Start routing smarter today

Free tier included. No credit card required. Set up in 3 minutes.
Copyright © 2021-2026 Inworld AI
Realtime Router for Vibe Coding: Smart LLM Routing for AI-Assisted Development