Routing Strategies
Routing is the path from a client request to one upstream account. Codex Pooler uses a Pool API key to identify the Pool, checks whether the request can run there, then chooses an eligible upstream account for that turn.
A Pool API key represents a Pool. It doesn’t represent one upstream account. That is the main difference from direct account credentials: clients stay configured with one stable key, while Codex Pooler can choose among the Pool’s active upstream assignments for each supported request.
Use this page with Runtime Routes and Operator admin UI when you need to understand why a request did, or didn’t, choose a specific account.
The Request Path
Every runtime request follows the same broad path:
- Codex Pooler authenticates the bearer token as a Pool API key.
- It admits the request into the route family and route class, such as normal HTTP, stream, websocket, compact, file upload, audio transcription, or backend control-plane proxy.
- It classifies the requested model and route surface, either the Codex backend compatibility route or the narrow OpenAI-compatible /v1 surface.
- It finds the authenticated Pool and starts from that Pool’s configured upstream assignments.
- It applies hard eligibility checks for assignment status, upstream status, model support, requested capabilities, quota, health, route-class circuit state, file affinity, and session continuity.
- It orders the remaining eligible upstream accounts using the Pool’s routing strategy and routing preferences.
- It tries the ordered shortlist. If one upstream fails in a retryable way, Codex Pooler can move to the next eligible account in that shortlist.
- It records metadata-only request, route, attempt, quota, and audit evidence. It doesn’t store raw prompts, completions, files, audio, images, credentials, or websocket frames.
The important split is this: eligibility decides what is allowed. Strategy decides which allowed account is preferred first.
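The split can be sketched in a few lines. This is a minimal illustration, not Codex Pooler’s implementation: the `Account` type, the `eligible` flag, and the numeric `preference` score are all hypothetical stand-ins for the real checks and strategy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Account:
    name: str
    eligible: bool     # hypothetical: combined outcome of every hard check
    preference: float  # hypothetical: strategy score, higher is preferred


def plan_route(accounts, shortlist_size):
    # Step 1: eligibility decides what is allowed.
    allowed = [a for a in accounts if a.eligible]
    # Step 2: strategy decides which allowed account is preferred first.
    ordered = sorted(allowed, key=lambda a: a.preference, reverse=True)
    # Step 3: only a bounded shortlist is tried.
    return [a.name for a in ordered[:shortlist_size]]


accounts = [
    Account("acct-a", eligible=True, preference=0.2),
    Account("acct-b", eligible=False, preference=0.9),
    Account("acct-c", eligible=True, preference=0.7),
]
print(plan_route(accounts, shortlist_size=3))  # ['acct-c', 'acct-a']
```

Note that `acct-b` has the highest score but never appears in the plan, because preference is only consulted after eligibility.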
Hard Eligibility Checks
Hard eligibility checks remove upstream accounts from consideration before strategy ordering begins. A routing strategy can’t select an account that failed these checks.
| Check | What it means for users |
|---|---|
| Pool access | The Pool API key must be active and tied to the Pool that should receive the request |
| Pool assignments | The upstream account must be assigned to that Pool and active for routing |
| Upstream lifecycle | Paused, disabled, deleted, or reauth-required upstream accounts aren’t eligible for new ordinary work |
| Model availability | The requested model must be exposed to the Pool and mapped to the assignment |
| Capability support | The assignment must support the requested shape, such as streaming, tools, reasoning, image input, audio transcription, service tier, or compact responses when that support is known and required |
| Quota | Quota evidence must be usable for the account, model, and limit family. Exhausted, stale, resetless, wrong-model, wrong-account, or untrusted evidence can remove an account |
| Health | Route-class circuit state can temporarily remove an account for a route after repeated backend failures |
| Session continuity | Existing Codex sessions, websocket sessions, previous response links, and file affinity can pin the request to the upstream assignment that owns that state |
If every account fails eligibility, the request is rejected before upstream dispatch. Common user-facing outcomes include no eligible backend, no compatible backend, quota exhaustion, quota evidence unavailable, session assignment unavailable, or file assignment conflict.
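As a rough sketch of how a filter like this could behave, the example below runs a subset of the checks in a fixed order and records the first failure per account. The `Assignment` fields, the reason strings, and the model name are assumptions made for illustration; capability and session continuity checks are left out for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Assignment:
    # All field names here are hypothetical.
    name: str
    assigned_active: bool = True
    lifecycle: str = "active"  # e.g. "paused", "disabled", "reauth_required"
    models: set = field(default_factory=set)
    quota_usable: bool = True
    circuit_open: bool = False


def first_failed_check(a, model):
    """Return the name of the first failed hard check, or None if eligible."""
    if not a.assigned_active:
        return "pool_assignment"
    if a.lifecycle != "active":
        return "upstream_lifecycle"
    if model not in a.models:
        return "model_availability"
    if not a.quota_usable:
        return "quota"
    if a.circuit_open:
        return "health"
    return None


def filter_eligible(assignments, model):
    """Split assignments into eligible names and a name -> reason map."""
    eligible, rejected = [], {}
    for a in assignments:
        reason = first_failed_check(a, model)
        if reason is None:
            eligible.append(a.name)
        else:
            rejected[a.name] = reason
    return eligible, rejected


pool = [
    Assignment("acct-a", models={"model-x"}),
    Assignment("acct-b", models={"model-x"}, lifecycle="paused"),
    Assignment("acct-c", models={"model-y"}),
]
eligible, rejected = filter_eligible(pool, "model-x")
print(eligible)  # ['acct-a']
print(rejected)  # {'acct-b': 'upstream_lifecycle', 'acct-c': 'model_availability'}
```

When `eligible` comes back empty, the request would be rejected before any upstream dispatch, and the recorded reasons help explain which outcome the user sees.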
Strategy And Preference Choices
After eligibility, Codex Pooler orders the remaining accounts. The Pool’s strategy affects preference, not permission.
| Strategy | User-facing behavior |
|---|---|
| Bridge ring | Default strategy. It spreads requests with stable scoring, then uses continuity, prompt-cache locality, and temporary demotions to keep useful stickiness without pinning every stateless request |
| Deterministic rotation | Rotates the eligible set from a stable request seed so repeated independent requests don’t always start with the same account |
| Least recent success | Prefers accounts that haven’t recently completed successful work, which can help spread successful turns across the Pool |
| Quota first | Prefers eligible accounts with more usable remaining quota evidence for the requested model |
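The three simpler strategies can be approximated as plain ordering functions over the eligible set. This is only a sketch under stated assumptions: the function names, the SHA-256 seed hashing, and the tie-breaking by account name are illustrative choices, not documented Codex Pooler behavior. Bridge ring is not sketched because its scoring isn’t described here.

```python
import hashlib


def deterministic_rotation(names, request_seed):
    """Rotate a stable base order by an offset derived from the request seed."""
    if not names:
        return []
    ordered = sorted(names)
    digest = hashlib.sha256(request_seed.encode("utf-8")).digest()
    offset = int.from_bytes(digest[:8], "big") % len(ordered)
    return ordered[offset:] + ordered[:offset]


def least_recent_success(last_success):
    """Order by oldest success first; accounts with no success yet come first.

    last_success maps account name -> timestamp, or None if never successful.
    """
    return sorted(
        last_success,
        key=lambda n: (last_success[n] is not None, last_success[n] or 0.0, n),
    )


def quota_first(remaining):
    """Order by most usable remaining quota first; ties break by name."""
    return sorted(remaining, key=lambda n: (-remaining[n], n))


print(least_recent_success({"acct-a": 100.0, "acct-b": None, "acct-c": 50.0}))
# ['acct-b', 'acct-c', 'acct-a']
print(quota_first({"acct-a": 0.2, "acct-b": 0.9, "acct-c": 0.5}))
# ['acct-b', 'acct-c', 'acct-a']
```

In each case the input is already the eligible set, so none of these functions can bring back an account that failed a hard check.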
The ring size controls how many ordered eligible accounts are tried for a request. A ring size of 3 means Codex Pooler prepares up to three eligible upstream accounts in order. If the first account fails in a retryable way, it can try the next one. The ring doesn’t include accounts that failed eligibility.
Temporary demotion also affects order. When an upstream attempt fails with a retryable backend or network reason, Codex Pooler can demote that assignment briefly so another eligible account is tried first. A later success clears that demotion.
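Ring size and temporary demotion can be pictured together with a small sketch. The `DemotionTracker` class, the 30-second window, and the choice to apply demotion before truncating to the ring size are all assumptions for illustration; the real durations and ordering rules may differ.

```python
class DemotionTracker:
    """Hypothetical tracker for short-lived demotions after retryable failures."""

    def __init__(self, ttl_seconds=30.0):  # assumed window, not a documented value
        self.ttl = ttl_seconds
        self._until = {}

    def record_retryable_failure(self, name, now):
        self._until[name] = now + self.ttl

    def record_success(self, name):
        # A later success clears the demotion.
        self._until.pop(name, None)

    def is_demoted(self, name, now):
        return self._until.get(name, 0.0) > now


def build_ring(ordered_eligible, ring_size, tracker, now):
    # Stable sort: non-demoted accounts first, strategy order kept within each group.
    ranked = sorted(ordered_eligible, key=lambda n: tracker.is_demoted(n, now))
    return ranked[:ring_size]


tracker = DemotionTracker()
order = ["acct-a", "acct-b", "acct-c", "acct-d"]
tracker.record_retryable_failure("acct-a", now=0.0)
print(build_ring(order, 3, tracker, now=1.0))  # ['acct-b', 'acct-c', 'acct-d']
tracker.record_success("acct-a")
print(build_ring(order, 3, tracker, now=2.0))  # ['acct-a', 'acct-b', 'acct-c']
```

The input list is assumed to contain only eligible accounts in strategy order, so the ring never includes an account that failed eligibility.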
Session Continuity Versus Stateless Routing
Session continuity is stronger than ordinary stateless routing.
Stateless requests can be routed to any eligible upstream account in the Pool. Strategy, quota headroom, prompt-cache locality, and recent failures shape the order.
Session-bound requests are different. If a request belongs to an existing Codex session, websocket session, previous response chain, or file affinity, Codex Pooler keeps it with the upstream assignment that owns that state. This protects resumable conversations, websocket reconnects, tool-result continuations, and uploaded file references.
Continuity headers are local routing inputs. Codex Pooler reads them in this order:
- x-codex-session-id
- session-id
- x-session-affinity
- session_id
- x-codex-conversation-id
session-id and x-session-affinity aren’t forwarded upstream. They help Codex Pooler find the local routing state only.
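A first-match lookup over the headers, in the order listed above, might look like the sketch below. The function names and the case-insensitive matching are assumptions; only session-id and x-session-affinity are stripped before forwarding, because those are the only two headers this page describes as local-only.

```python
# Header names in the documented lookup order.
CONTINUITY_HEADERS = (
    "x-codex-session-id",
    "session-id",
    "x-session-affinity",
    "session_id",
    "x-codex-conversation-id",
)
# Headers this page describes as not forwarded upstream.
LOCAL_ONLY = frozenset({"session-id", "x-session-affinity"})


def resolve_continuity_key(headers):
    """Return (header_name, value) for the first non-empty continuity header."""
    lowered = {k.lower(): v for k, v in headers.items()}
    for name in CONTINUITY_HEADERS:
        value = lowered.get(name, "").strip()
        if value:
            return name, value
    return None


def headers_for_upstream(headers):
    """Drop the local-only continuity headers before forwarding."""
    return {k: v for k, v in headers.items() if k.lower() not in LOCAL_ONLY}


incoming = {"Session-Id": "s-2", "X-Codex-Conversation-Id": "c-1"}
print(resolve_continuity_key(incoming))  # ('session-id', 's-2')
print(headers_for_upstream(incoming))    # {'X-Codex-Conversation-Id': 'c-1'}
```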
If the pinned upstream assignment is no longer eligible, Codex Pooler doesn’t silently move that session to a different account. The user should check the upstream’s lifecycle state, reauth status, Pool assignment, model support, quota evidence, and route health.
Prompt-Cache Locality
Prompt-cache locality is a preference for stateless requests that share a prompt_cache_key.
When prompt-cache affinity is enabled for the Pool and there is no stronger continuity rule, Codex Pooler uses the transient prompt_cache_key as a routing seed. That makes repeat stateless requests prefer the same eligible upstream account, which can improve provider-side cache locality.
This doesn’t store prompts or responses locally. The raw prompt-cache key isn’t stored as request evidence. It is used only as a routing input, and request and routing logs remain metadata-only.
Prompt-cache locality can be skipped when:
| Condition | Result |
|---|---|
| No prompt_cache_key is present | Normal strategy ordering applies |
| Prompt-cache affinity is disabled on the Pool | Normal strategy ordering applies |
| Only one upstream account is eligible | There is nothing to choose among |
| A Codex session, idempotency key, or durable affinity already applies | The stronger continuity rule wins |
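The skip conditions and the seeding idea can be combined in one small sketch. The function name, its parameters, and the hash-then-modulo pick are assumptions for illustration; the page does not say how Codex Pooler turns the key into a preference.

```python
import hashlib


def pick_cache_preferred(eligible, prompt_cache_key,
                         affinity_enabled=True, stronger_continuity=False):
    """Return the preferred account name, or None when normal ordering applies."""
    if not prompt_cache_key or not affinity_enabled or stronger_continuity:
        return None
    if len(eligible) < 2:
        return None  # nothing to choose among
    ordered = sorted(eligible)
    # The key is hashed in memory and used only as a seed; it is not stored.
    digest = hashlib.sha256(prompt_cache_key.encode("utf-8")).digest()
    return ordered[int.from_bytes(digest[:8], "big") % len(ordered)]


eligible = ["acct-a", "acct-b", "acct-c"]
first = pick_cache_preferred(eligible, "cache-key-1")
again = pick_cache_preferred(eligible, "cache-key-1")
print(first == again)  # True
```

A modulo pick like this remaps many keys whenever the eligible set changes. A rendezvous-style hash would be steadier, but which approach the product uses isn’t stated here.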
Fallback Behavior
Fallback is bounded by eligibility and ring size.
If the first chosen upstream account returns a retryable backend status, rate limit, network error, authorization failure, or server error, Codex Pooler can try the next eligible account in the route plan. It records safe attempt metadata and may briefly demote the failing assignment for the same Pool and model.
Fallback doesn’t bypass hard checks. It won’t use a paused account, an account without model support, an account with unusable quota evidence, or an account that conflicts with session or file affinity. For session-bound work, fallback is intentionally limited because moving the request to another account could break the conversation state.
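A bounded fallback loop over an already-built ring could be sketched as follows. `RetryableUpstreamError`, the `send` callback, and the attempt labels are hypothetical names. Restricting session-bound work to the first (pinned) entry is one assumed reading of "intentionally limited"; the actual limits may be looser.

```python
class RetryableUpstreamError(Exception):
    """Hypothetical marker for failures that allow fallback."""


def dispatch_with_fallback(ring, send, session_bound=False):
    """Try each account in ring order; return (result, attempt metadata)."""
    attempts = []
    # Assumption: session-bound work stays on the pinned (first) assignment.
    candidates = ring[:1] if session_bound else ring
    for name in candidates:
        try:
            result = send(name)
        except RetryableUpstreamError:
            attempts.append((name, "retryable_failure"))
            continue  # move to the next eligible account in the ring
        attempts.append((name, "ok"))
        return result, attempts
    return None, attempts  # ring exhausted without a success


def fake_send(name):
    # Stand-in for an upstream call: the first account always fails.
    if name == "acct-a":
        raise RetryableUpstreamError("simulated upstream 503")
    return f"{name}:ok"


print(dispatch_with_fallback(["acct-a", "acct-b"], fake_send))
# ('acct-b:ok', [('acct-a', 'retryable_failure'), ('acct-b', 'ok')])
print(dispatch_with_fallback(["acct-a", "acct-b"], fake_send, session_bound=True))
# (None, [('acct-a', 'retryable_failure')])
```

Non-retryable exceptions propagate out of the loop unchanged, so only failures explicitly marked retryable trigger a move to the next account.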
What Operators Configure
Operators configure routing from the admin UI. The exact visibility depends on role, but the user-facing controls are:
| Admin area | What users can configure or inspect |
|---|---|
| Pools | Pool lifecycle, Pool assignments, Pool API key assignments, routing strategy, ring size, sticky websocket sessions, HTTP affinity, prompt-cache affinity, analytics forwarding, and /v1 compatibility |
| Upstreams | Upstream account readiness, lifecycle state, Pool assignment, model support, quota evidence freshness, and reauth state |
| Pool API keys | Which client credential represents the Pool, plus key status and policy metadata |
| System settings | Gateway admission defaults, route-class limits, diagnostics, circuit thresholds, model metadata, pricing catalog settings, and runtime policy controls |
| Request logs | Metadata-only evidence about route family, Pool, model, selected upstream metadata, status, retries, duration, quota state, and safe error codes |
The admin UI doesn’t expose raw prompts, completions, request bodies, response bodies, file bytes, audio bytes, image bytes, websocket frames, bearer tokens, raw Pool API keys after creation, upstream tokens, or Codex auth.json contents.