Kilo

Kilo Code should use a named OpenAI-compatible provider that points at Codex Pooler’s /v1 base URL. Kilo appends /chat/completions itself, so do not set baseURL to a full /v1/chat/completions endpoint.

Install

Use the current npm package:

npm install -g @kilocode/cli@latest

For one-off use without a global install, run npx -y @kilocode/cli@latest.

Provider shape

For a deployed instance, add ~/.config/kilo/kilo.jsonc:

{
  "$schema": "https://app.kilo.ai/config.json",
  "model": "codex-pooler/gpt-5.5",
  "enabled_providers": ["codex-pooler"],
  "provider": {
    "codex-pooler": {
      "options": {
        "apiKey": "{env:CODEX_POOLER_API_KEY}",
        "baseURL": "https://codex-pooler.example.com/v1"
      },
      "models": {
        "gpt-5.5": {
          "name": "GPT-5.5 via Codex Pooler",
          "tool_call": true,
          "reasoning": true,
          "temperature": false,
          "attachment": true,
          "modalities": {
            "input": ["text", "image"],
            "output": ["text"]
          },
          "limit": {
            "context": 272000,
            "input": 228000,
            "output": 64000
          }
        }
      }
    }
  },
  "compaction": {
    "threshold_percent": 75
  }
}

For local setup, change baseURL to http://localhost:4000/v1.

{env:CODEX_POOLER_API_KEY} keeps the Pool API key outside the config file. Define only model ids your assigned Pool can serve. If you add Kilo permissions, use the object form such as "permission": {"bash": "allow"}; do not set "permission": "ask", which is not a valid Kilo config shape.

Kilo uses OpenCode-style limit.{context,input,output} fields but includes reasoning tokens in overflow accounting and supports compaction.threshold_percent for preflight compaction. limit.input: 228000 leaves 208k usable input tokens after the default 20k reserve; the optional 75% threshold asks Kilo to compact even earlier.

For GPT-5 OpenAI-compatible models, Kilo suppresses the outgoing max-token request field to avoid incompatible max_tokens, so limit.output is still important for local context math and UI even when it is not forwarded.

Route shape

Kilo is a chat-completions client. It sends model requests to POST /v1/chat/completions, and Codex Pooler translates supported chat-completions requests into Codex Responses work internally. Do not point Kilo at /backend-api/codex, /v1/responses, or /v1/chat/completions as the configured base URL.

Connection Check

Run a tool-using prompt from an isolated directory:

mkdir -p /tmp/codex-pooler-kilo-check
cd /tmp/codex-pooler-kilo-check

export CODEX_POOLER_API_KEY=<pool-api-key>
kilo run \
  --model codex-pooler/gpt-5.5 \
  --pure \
  --auto \
  --format json \
  --dir "$PWD" \
  'Use your tools to create kilo-ok.txt containing exactly: kilo ok. After the file exists, reply with exactly: kilo ok'

--pure keeps external plugins out of the check. --auto is only for trusted isolated automation, because it lets Kilo approve tool permissions automatically.

MCP boundary

Kilo model requests use Codex Pooler’s narrow OpenAI-compatible /v1 support for selected SDK routes. Codex Pooler doesn’t provide full OpenAI API parity.

Codex Pooler model use does not require MCP. If you need operator metadata from /mcp, use a separate MCP-capable host and authenticate it with an operator-owned MCP token, not the Pool API key.