Fallbacks

Stay online when any AI provider goes down

List your backup models in one ordered array and Router swaps to the next one the moment a provider refuses. One OpenAI-compatible endpoint, every provider, no custom retry logic.
Failover event: primary openai/gpt-5.4 · 429 · retry-after 60 (auto-retried 1× with backoff) → fallback anthropic/claude-sonnet-4-6 · 200 · streaming (stream continuity preserved)

Powered by
Router

Stay up when any provider goes down.

Ordered cross-provider chains, streaming continuity, transparent retries. One JSON field turns a single point of failure into four nines of uptime.
Cross-provider uptime

Four nines when no single provider hits three.

Chain three providers in an ordered array and, assuming outages are independent, combined uptime clears 99.99%. One field, no custom retry logic.
Three providers, one chain
OpenAI 99.5% × Anthropic 99.5% × Google 99.5% → 99.99% combined uptime
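The arithmetic behind that figure fits in a few lines. This sketch assumes provider outages are statistically independent, which real incidents only approximate:

```javascript
// Compound uptime of an ordered fallback chain: the chain is down only
// when every provider is down at once (independence assumed).
function chainUptime(uptimes) {
  const downtime = uptimes.reduce((acc, u) => acc * (1 - u), 1);
  return 1 - downtime;
}

console.log(chainUptime([0.995, 0.995]));        // ≈ 0.999975 — four nines from just two providers
console.log(chainUptime([0.995, 0.995, 0.995])); // ≈ 0.99999987 — three providers clear it comfortably
```

Two providers already compound past four nines; the third is margin.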
No retry code to write

Declare the chain. Router handles the swap.

List the models you want tried, in order, and Router moves on when one refuses. Your app never writes backoff logic again.
Failover chain · time to success: openai/gpt-5.4 → 429, anthropic/claude-sonnet-4-6 → 503, google/gemini-3.1-pro → 200, all inside one second (0–1000ms)
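For contrast, this is roughly the loop you would otherwise hand-roll. It's a sketch: `callModel` is a hypothetical transport helper that throws on 429/503, not any real SDK call.

```javascript
// The DIY failover loop that the `models` field makes unnecessary.
// `callModel` is hypothetical; every name here is illustrative.
async function completeWithFallbacks(chain, request, callModel) {
  let lastError;
  for (const model of chain) {
    try {
      return await callModel(model, request); // first success wins
    } catch (err) {
      lastError = err; // 429/503/timeout — move down the chain
    }
  }
  throw lastError; // every model in the chain refused
}
```

With Router, this entire function collapses into the `models` array on the request itself.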
Cross-provider by default

Same-vendor fallbacks fail in the same outage.

A gpt-5.4 to gpt-4.1 chain doesn't help when OpenAI itself is down. Mix OpenAI, Anthropic, and Google in one list instead.
Cross-provider fallback matrix: every cross-vendor pairing of OpenAI, Anthropic, and Google stays up when the same-vendor pair fails.
Streaming survives the swap

Mid-stream failure, mid-stream recovery.

If a primary fails after streaming starts, Router swaps to the next model and resumes. Your SSE client sees one continuous response.
Streaming survives the swap
// stream starts on openai/gpt-5.4
chunk 0: 'The core'
chunk 1: ' idea is'
// 503 at chunk 2 — Router swaps to Claude
chunk 2: ' to treat'
chunk 3: ' async as'
// your client never saw the transition
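What the client observes can be simulated with a plain async iterable — toy data, not Router's wire format:

```javascript
// Toy simulation: one continuous async stream, even though the producer
// switched "providers" between chunk 1 and chunk 2. Data is illustrative.
async function* routedStream() {
  yield 'The core';
  yield ' idea is';
  // primary 503s here — Router resumes on the fallback transparently
  yield ' to treat';
  yield ' async as';
}

async function consume() {
  let text = '';
  for await (const chunk of routedStream()) text += chunk; // one loop, no reconnect
  return text;
}

consume().then(console.log); // "The core idea is to treat async as"
```

From the consumer's side there is one loop and one response; the swap is invisible.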
Keep the SDK you already have

Your OpenAI SDK, plus one field. That's the whole integration.

No new client to install, no gateway library to learn. Fallbacks ride on the request you're already sending, so your dependencies stay where they are.
No gateway library to install
one JSON field
client.chat.completions.create({
model: 'openai/gpt-5.4',
messages: [...],
// the one field that gives you fallbacks
models: ['anthropic/claude-sonnet-4-6', 'google/gemini-3.1-pro'],
});
// no LiteLLM. no custom retry wrapper. just OpenAI SDK + one field.
Post-mortem-ready receipts

Know exactly what happened when it mattered.

Every swap logs what tripped it, how many retries it took, and which model finally answered. When something breaks at 2am, the trail is already there.
Failover log · last 3 events (live tail)
02:14:06 · 429 rate_limit → 200 · openai/gpt-5.4 · retries: 1 · fallback: anthropic/claude-sonnet-4-6
02:14:09 · 503 capacity → 200 · openai/gpt-5.4 · retries: 0 · fallback: google/gemini-3.1-pro
02:14:12 · timeout 8s → 200 · openai/gpt-5.4 · retries: 0 · fallback: anthropic/claude-sonnet-4-6
Every swap logs trigger, retry count, and which model finally answered.

FAQ

How do I enable fallbacks?
Add a `models` array to your Chat Completions request with an ordered list of backup models. Example: `{ model: 'openai/gpt-5.4', models: ['anthropic/claude-sonnet-4-6', 'google/gemini-3.1-pro'] }`. If the primary returns a 429, a 503, or a provider 5xx, Router retries with the next model in the list.

Can a fallback chain mix providers?
Yes. Router treats every provider the same: OpenAI, Anthropic, Google, Meta, Mistral, Groq, Fireworks. Your fallback chain can mix them freely. The request shape is identical across providers, so your response parser never needs to change.

Does streaming survive a failover?
Yes. SSE streaming passes through Router. If the primary fails mid-stream, Router swaps to the next model and resumes streaming without your client seeing a disconnect. Tool-use and structured-output streams are handled the same way.

Can I keep my existing OpenAI SDK?
Yes. Router is OpenAI-SDK-compatible: change the base URL and keep your code. The `models` field is a native extension of the Chat Completions request; there is no custom client or gateway library to adopt.

How is a failed-over request billed?
You pay only for the model that succeeds. A 429 or 503 before any tokens stream costs nothing. A partial stream that fails over is billed to the primary for the partial response and to the fallback for the completion, and every charge is logged per request.

What gets logged when a failover happens?
Every failover event records the trigger (429, 503, timeout), retry count, backoff duration, primary and fallback models, and final status. Query it via the logs API or watch it live in the portal.

What happens during a prolonged provider outage?
Router tracks provider health continuously. A provider with rolling failures is deprioritized automatically, even inside the same fallback chain, so a persistent outage doesn't add latency to every subsequent request.

What does Router cost?
Router is free during Research Preview: zero markup on underlying model costs and zero fee per failover event.

Four nines. One JSON field.

Cross-provider fallbacks with streaming continuity and transparent retries. No gateway library.
Copyright © 2021-2026 Inworld AI