Routing Strategies

Routing is the path from a client request to one upstream account. Codex Pooler uses a Pool API key to identify the Pool, checks whether the request can run there, then chooses an eligible upstream account for that turn.

A Pool API key represents a Pool. It doesn’t represent one upstream account. That is the main difference from direct account credentials: clients stay configured with one stable key, while Codex Pooler can choose among the Pool’s active upstream assignments for each supported request.

Read this page alongside the Runtime Routes and Operator admin UI pages when you need to understand why a request was, or wasn’t, routed to a specific account.

Every runtime request follows the same broad path:

  1. Codex Pooler authenticates the bearer token as a Pool API key.
  2. It admits the request into the route family and route class, such as normal HTTP, stream, websocket, compact, file upload, audio transcription, or backend control-plane proxy.
  3. It classifies the requested model and route surface, either the Codex backend compatibility route or the narrow OpenAI-compatible /v1 surface.
  4. It finds the authenticated Pool and starts from that Pool’s configured upstream assignments.
  5. It applies hard eligibility checks for assignment status, upstream status, model support, requested capabilities, quota, health, route-class circuit state, file affinity, and session continuity.
  6. It orders the remaining eligible upstream accounts using the Pool’s routing strategy and routing preferences.
  7. It tries the ordered shortlist. If one upstream fails in a retryable way, Codex Pooler can move to the next eligible account in that shortlist.
  8. It records metadata-only request, route, attempt, quota, and audit evidence. It doesn’t store raw prompts, completions, files, audio, images, credentials, or websocket frames.

The important split is this: eligibility decides what is allowed. Strategy decides which allowed account is preferred first.

Hard eligibility checks remove upstream accounts from consideration before strategy ordering begins. A routing strategy can’t select an account that failed these checks.

| Check | What it means for users |
| --- | --- |
| Pool access | The Pool API key must be active and tied to the Pool that should receive the request |
| Pool assignments | The upstream account must be assigned to that Pool and active for routing |
| Upstream lifecycle | Paused, disabled, deleted, or reauth-required upstream accounts aren’t eligible for new ordinary work |
| Model availability | The requested model must be exposed to the Pool and mapped to the assignment |
| Capability support | The assignment must support the requested shape, such as streaming, tools, reasoning, image input, audio transcription, service tier, or compact responses when that support is known and required |
| Quota | Quota evidence must be usable for the account, model, and limit family. Exhausted, stale, resetless, wrong-model, wrong-account, or untrusted evidence can remove an account |
| Health | Route-class circuit state can temporarily remove an account for a route after repeated backend failures |
| Session continuity | Existing Codex sessions, websocket sessions, previous response links, and file affinity can pin the request to the upstream assignment that owns that state |

If every account fails eligibility, the request is rejected before upstream dispatch. Common user-facing outcomes include no eligible backend, no compatible backend, quota exhaustion, quota evidence unavailable, session assignment unavailable, or file assignment conflict.
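
The hard checks can be pictured as a filter that runs before any ordering. The following is a minimal sketch, not Codex Pooler's implementation: the `Upstream` fields, the reason strings, and the function names are all hypothetical, and session or file pinning is left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Upstream:
    # Hypothetical, simplified view of one upstream assignment.
    name: str
    assigned_and_active: bool = True
    lifecycle: str = "active"  # e.g. "active", "paused", "reauth_required"
    models: set = field(default_factory=set)
    capabilities: set = field(default_factory=set)
    quota_usable: bool = True
    circuit_open: bool = False


def rejection_reason(up, model, required_caps):
    """Return why an upstream is ineligible, or None if it passes every check."""
    if not up.assigned_and_active:
        return "assignment_inactive"
    if up.lifecycle != "active":
        return "upstream_" + up.lifecycle
    if model not in up.models:
        return "model_unavailable"
    if not required_caps <= up.capabilities:
        return "capability_unsupported"
    if not up.quota_usable:
        return "quota_unusable"
    if up.circuit_open:
        return "circuit_open"
    return None


def eligible(upstreams, model, required_caps=frozenset()):
    """Keep only upstreams that pass all hard checks, preserving input order."""
    return [u for u in upstreams if rejection_reason(u, model, required_caps) is None]
```

An empty result from `eligible` corresponds to the case where the request is rejected before upstream dispatch.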

After eligibility, Codex Pooler orders the remaining accounts. The Pool’s strategy affects preference, not permission.

| Strategy | User-facing behavior |
| --- | --- |
| Bridge ring | Default strategy. It spreads requests with stable scoring, then uses continuity, prompt-cache locality, and temporary demotions to keep useful stickiness without pinning every stateless request |
| Deterministic rotation | Rotates the eligible set from a stable request seed so repeated independent requests don’t always start with the same account |
| Least recent success | Prefers accounts that haven’t recently completed successful work, which can help spread successful turns across the Pool |
| Quota first | Prefers eligible accounts with more usable remaining quota evidence for the requested model |
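
Two of these strategies reduce to a sort over the eligible set. The sketch below only illustrates the idea: the tuple shapes, the notion of a "remaining fraction", and treating an account with no recorded success as least recent are assumptions, not documented behavior.

```python
def order_quota_first(accounts):
    """accounts: list of (name, remaining_fraction). Most remaining quota first."""
    ranked = sorted(accounts, key=lambda a: a[1], reverse=True)
    return [name for name, _ in ranked]


def order_least_recent_success(accounts):
    """accounts: list of (name, last_success_timestamp or None).

    Accounts with no recorded success sort first, then oldest success first.
    """
    ranked = sorted(accounts, key=lambda a: (a[1] is not None, a[1] or 0.0))
    return [name for name, _ in ranked]
```

Both functions take an already filtered list, which mirrors the rule that strategy affects preference and never permission.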

The ring size controls how many ordered eligible accounts are tried for a request. A ring size of 3 means Codex Pooler prepares up to three eligible upstream accounts in order. If the first account fails in a retryable way, it can try the next one. The ring doesn’t include accounts that failed eligibility.
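
As a rough illustration of deterministic rotation combined with a ring size, the sketch below hashes a request seed into a start index and then truncates the ordered list. The function names and the choice of SHA-256 are assumptions made for the example; the source doesn't specify how the seed is derived or hashed.

```python
import hashlib


def rotate_from_seed(eligible_names, request_seed):
    """Rotate the eligible list to a start index derived from a stable seed."""
    if not eligible_names:
        return []
    digest = hashlib.sha256(request_seed.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(eligible_names)
    return eligible_names[start:] + eligible_names[:start]


def build_ring(ordered_names, ring_size):
    """Keep at most ring_size accounts from the already ordered eligible list."""
    return ordered_names[:max(ring_size, 0)]
```

With four eligible accounts and a ring size of 3, `build_ring(rotate_from_seed(names, seed), 3)` yields three accounts in a seed-dependent order. With only two eligible accounts, the ring holds two.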

Temporary demotion also affects order. When an upstream attempt fails with a retryable backend or network reason, Codex Pooler can demote that assignment briefly so another eligible account is tried first. A later success clears that demotion.
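
One way to picture temporary demotion is a small tracker that moves recently failed assignments to the back of the order until a timer expires or a success clears them. This is a sketch under assumptions: the class name, the 30-second default, and the explicit `now` argument are hypothetical.

```python
class DemotionTracker:
    """Tracks briefly demoted assignments. Times are passed in explicitly."""

    def __init__(self, demotion_seconds=30.0):  # hypothetical duration
        self.demotion_seconds = demotion_seconds
        self._until = {}

    def record_retryable_failure(self, name, now):
        # Demote the assignment until now + demotion_seconds.
        self._until[name] = now + self.demotion_seconds

    def record_success(self, name):
        # A later success clears the demotion.
        self._until.pop(name, None)

    def is_demoted(self, name, now):
        return self._until.get(name, 0.0) > now

    def apply(self, ordered_names, now):
        """Stable sort: demoted accounts move to the back, order otherwise kept."""
        return sorted(ordered_names, key=lambda n: self.is_demoted(n, now))
```

Demotion here only reorders accounts. It never removes one from the list, which keeps it distinct from the hard eligibility checks.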

Session Continuity Versus Stateless Routing

Session continuity is stronger than ordinary stateless routing.

Stateless requests can be routed to any eligible upstream account in the Pool. Strategy, quota headroom, prompt-cache locality, and recent failures shape the order.

Session-bound requests are different. If a request belongs to an existing Codex session, websocket session, previous response chain, or file affinity, Codex Pooler keeps it with the upstream assignment that owns that state. This protects resumable conversations, websocket reconnects, tool-result continuations, and uploaded file references.

Continuity headers are local routing inputs. Codex Pooler reads them in this order:

  1. x-codex-session-id
  2. session-id
  3. x-session-affinity
  4. session_id
  5. x-codex-conversation-id

session-id and x-session-affinity aren’t forwarded upstream. They help Codex Pooler find the local routing state only.
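
The header order above can be sketched as a first-match lookup. This example assumes that the first non-empty header in the list wins and that header names are matched case-insensitively; the function names are hypothetical, and only the two headers named above are stripped before forwarding.

```python
CONTINUITY_HEADERS = [
    "x-codex-session-id",
    "session-id",
    "x-session-affinity",
    "session_id",
    "x-codex-conversation-id",
]

# Per the text above, these two are local routing inputs only.
LOCAL_ONLY_HEADERS = {"session-id", "x-session-affinity"}


def continuity_key(headers):
    """Return (header_name, value) for the first non-empty continuity header."""
    lowered = {k.lower(): v for k, v in headers.items()}
    for name in CONTINUITY_HEADERS:
        value = lowered.get(name, "").strip()
        if value:
            return name, value
    return None


def headers_for_upstream(headers):
    """Drop the local-only continuity headers before forwarding."""
    return {k: v for k, v in headers.items() if k.lower() not in LOCAL_ONLY_HEADERS}
```

A request carrying both `session-id` and `x-codex-conversation-id` would resolve to the `session-id` value in this sketch, because it appears earlier in the list.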

If the pinned upstream assignment is no longer eligible, Codex Pooler doesn’t silently move that session to a different account. The user should check the upstream’s lifecycle state, reauth status, Pool assignment, model support, quota evidence, and route health.

Prompt-cache locality is a preference for stateless requests that share a prompt_cache_key.

When prompt-cache affinity is enabled for the Pool and there is no stronger continuity rule, Codex Pooler uses the transient prompt_cache_key as a routing seed. That makes repeat stateless requests prefer the same eligible upstream account, which can improve provider-side cache locality.

This doesn’t store prompts or responses locally, and the raw prompt-cache key isn’t stored as request evidence. The key is used only as a routing input; request and routing logs remain metadata-only.

Prompt-cache locality can be skipped when:

| Condition | Result |
| --- | --- |
| No prompt_cache_key is present | Normal strategy ordering applies |
| Prompt-cache affinity is disabled on the Pool | Normal strategy ordering applies |
| Only one upstream account is eligible | There is nothing to choose among |
| A Codex session, idempotency key, or durable affinity already applies | The stronger continuity rule wins |
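
To illustrate how a transient key can act as a routing seed without being stored, the sketch below uses rendezvous (highest-random-weight) hashing. That technique is swapped in here for illustration; the source doesn't say which scheme Codex Pooler uses, and the function and parameter names are hypothetical.

```python
import hashlib


def order_by_prompt_cache_key(eligible_names, prompt_cache_key,
                              affinity_enabled=True, stronger_continuity=False):
    """Order eligible accounts so the same key prefers the same account.

    Returns the input order unchanged when any skip condition applies.
    """
    if (not affinity_enabled or not prompt_cache_key
            or stronger_continuity or len(eligible_names) < 2):
        return list(eligible_names)

    def score(name):
        # The key is hashed together with the account name and then discarded.
        material = (prompt_cache_key + "\x00" + name).encode("utf-8")
        return hashlib.sha256(material).digest()

    return sorted(eligible_names, key=score, reverse=True)
```

A property of rendezvous hashing is that removing an account other than the preferred one leaves the preferred account unchanged, which suits a Pool whose eligible set shifts over time.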

Fallback is bounded by eligibility and ring size.

If the first chosen upstream account returns a retryable backend status, rate limit, network error, authorization failure, or server error, Codex Pooler can try the next eligible account in the route plan. It records safe attempt metadata and may briefly demote the failing assignment for the same Pool and model.

Fallback doesn’t bypass hard checks. It won’t use a paused account, an account without model support, an account with unusable quota evidence, or an account that conflicts with session or file affinity. For session-bound work, fallback is intentionally limited because moving the request to another account could break the conversation state.
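
Bounded fallback can be sketched as a loop over the prepared ring that stops at the first success. The exception type, the `send` callable, and the choice to restrict session-bound work to the pinned account alone are assumptions for this example; the source only says that fallback is intentionally limited in that case.

```python
class RetryableUpstreamError(Exception):
    """Hypothetical marker for a retryable backend or network failure."""


def dispatch_with_fallback(ring, send, session_bound=False):
    """Try accounts in ring order and return (result, attempts).

    attempts holds metadata only: (account_name, outcome) pairs.
    """
    attempts = []
    # Assumption: session-bound work stays on the first (pinned) account.
    candidates = ring[:1] if session_bound else ring
    for name in candidates:
        try:
            result = send(name)
        except RetryableUpstreamError:
            attempts.append((name, "retryable_failure"))
            continue
        attempts.append((name, "success"))
        return result, attempts
    return None, attempts
```

Because `ring` is built from accounts that already passed eligibility, the loop can't reach a paused or otherwise ineligible account.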

Operators configure routing from the admin UI. The exact visibility depends on role, but the user-facing controls are:

| Admin area | What users can configure or inspect |
| --- | --- |
| Pools | Pool lifecycle, Pool assignments, Pool API key assignments, routing strategy, ring size, sticky websocket sessions, HTTP affinity, prompt-cache affinity, analytics forwarding, and /v1 compatibility |
| Upstreams | Upstream account readiness, lifecycle state, Pool assignment, model support, quota evidence freshness, and reauth state |
| Pool API keys | Which client credential represents the Pool, plus key status and policy metadata |
| System settings | Gateway admission defaults, route-class limits, diagnostics, circuit thresholds, model metadata, pricing catalog settings, and runtime policy controls |
| Request logs | Metadata-only evidence about route family, Pool, model, selected upstream metadata, status, retries, duration, quota state, and safe error codes |

The admin UI doesn’t expose raw prompts, completions, request bodies, response bodies, file bytes, audio bytes, image bytes, websocket frames, bearer tokens, raw Pool API keys after creation, upstream tokens, or Codex auth.json contents.