































































Route to different models based on user attributes like language, location, or subscription tier. Each user gets the model that fits them best.
curl 'https://api.inworld.ai/router/v1/routers' \
-H "Authorization: Basic $INWORLD_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"displayName": "User-Aware",
"routes": [
{
"condition": {
"cel_expression": "language == \"es\""
},
"route": {
"variants": [
{ "variant": { "modelId": "openai/gpt-5.2" }, "weight": 100 }
]
}
},
{
"condition": {
"cel_expression": "plan == \"free\""
},
"route": {
"variants": [
{ "variant": { "modelId": "anthropic/claude-haiku-4-5" }, "weight": 100 }
]
}
}
],
"defaultRoute": {
"variants": [
{ "variant": { "modelId": "anthropic/claude-sonnet-4-6" }, "weight": 100 }
]
}
}'curl 'https://api.inworld.ai/router/v1/routers' \
-H "Authorization: Basic $INWORLD_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"displayName": "User-Aware",
"routes": [
{
"condition": {
"cel_expression": "language == \"es\""
},
"route": {
"variants": [
{ "variant": { "modelId": "openai/gpt-5.2" }, "weight": 100 }
]
}
},
{
"condition": {
"cel_expression": "plan == \"free\""
},
"route": {
"variants": [
{ "variant": { "modelId": "anthropic/claude-haiku-4-5" }, "weight": 100 }
]
}
}
],
"defaultRoute": {
"variants": [
{ "variant": { "modelId": "anthropic/claude-sonnet-4-6" }, "weight": 100 }
]
}
}'Route to different models based on user attributes like language, location, or subscription tier. Each user gets the model that fits them best.
A typical gateway adds about 5% to every call. Realtime Router passes provider rates through at cost, with zero markup on routed third-party models. At $150K per month in LLM spend, that is $90K per year back. Route to Inworld's first-party realtime inference and top open models run up to 50% below the public third-party rate.
View pricing
A typical gateway adds about 5% to every call. Realtime Router passes provider rates through at cost, with zero markup on routed third-party models. At $150K per month in LLM spend, that is $90K per year back. Route to Inworld's first-party realtime inference and top open models run up to 50% below the public third-party rate.
View pricing
Pass language, country, tier, or emotion as metadata. CEL expressions evaluate conditions in real time and pick the right model per user. Update rules without redeploying.
Pass language, country, tier, or emotion as metadata. CEL expressions evaluate conditions in real time and pick the right model per user. Update rules without redeploying.
Swap your base URL to api.inworld.ai, update your API key, done. Full OpenAI and Anthropic SDK compatibility. No request or response changes.

from openai import OpenAIclient = OpenAI(base_url="https://api.openai.com/v1"base_url="https://api.inworld.ai/v1")
Swap your base URL to api.inworld.ai, update your API key, done. Full OpenAI and Anthropic SDK compatibility. No request or response changes.

from openai import OpenAIclient = OpenAI(base_url="https://api.openai.com/v1"base_url="https://api.inworld.ai/v1")
Split traffic between Claude and GPT with sticky user assignment. Measure retention, satisfaction, or conversion per model. Ramp the winner to 100% via API or dashboard.
A/B testing docsSplit traffic between Claude and GPT with sticky user assignment. Measure retention, satisfaction, or conversion per model. Ramp the winner to 100% via API or dashboard.
A/B testing docsAdd an audio parameter to any chat completions request. Get streamed text and speech in a single response. Route to the best LLM, then pipe straight into Realtime TTS. No second integration. No added latency.

{"model": "inworld/my-router","messages": [...],"audio": {"voice": "Sarah","model": "inworld-tts-2"}}
Add an audio parameter to any chat completions request. Get streamed text and speech in a single response. Route to the best LLM, then pipe straight into Realtime TTS. No second integration. No added latency.

{"model": "inworld/my-router","messages": [...],"audio": {"voice": "Sarah","model": "inworld-tts-2"}}
If a model goes down or rate-limits you, traffic reroutes instantly to the next best option. Your workflow never stops. Works with Cursor, Claude Code, Codex CLI, Aider, Continue, and Windsurf.
If a model goes down or rate-limits you, traffic reroutes instantly to the next best option. Your workflow never stops. Works with Cursor, Claude Code, Codex CLI, Aider, Continue, and Windsurf.

Capability | Direct APIs | Inworld Router | OpenRouter |
|---|---|---|---|
Models available | 1 per provider | 220+, across all providers | Hundreds across providers |
Pricing markup | None | None. Provider rates at cost | 5.5% fee on credit purchases |
Context-aware routing (CEL) | Build it yourself | Built in | Not available |
A/B testing with sticky users | Build it yourself | Built in | Not available |
Automatic failover | Build it yourself | TTFT-based | Retry on error |
Integrated TTS | Separate integration | Single API call | Not available |
Per-request observability | Build it yourself | Built in + export | Basic logging |
SDK compatibility | N/A (native) | OpenAI + Anthropic | OpenAI only |
Caching | Provider-dependent | Implicit + explicit | Provider-dependent |
Web search / grounding | Provider-dependent | Tool-based + native | Available |
“We scaled from prototype to 1 million users in 19 days with over 20x cost reduction.”
Fai Nur, CEO, Status · Self-reported
