# Kilo

Configure Kilo Code with a named OpenAI-compatible provider that points at Codex Pooler's `/v1` base URL. Kilo appends `/chat/completions` itself, so do not set `baseURL` to the full `/v1/chat/completions` endpoint.

## Install

Use the current npm package:

```bash
npm install -g @kilocode/cli@latest
```

For one-off use without a global install, run `npx -y @kilocode/cli@latest`.

## Provider shape

For a deployed instance, add `~/.config/kilo/kilo.jsonc`:

```jsonc
{
  "$schema": "https://app.kilo.ai/config.json",
  "model": "codex-pooler/gpt-5.5",
  "enabled_providers": ["codex-pooler"],
  "provider": {
    "codex-pooler": {
      "options": {
        "apiKey": "{env:CODEX_POOLER_API_KEY}",
        "baseURL": "https://codex-pooler.example.com/v1"
      },
      "models": {
        "gpt-5.5": {
          "name": "GPT-5.5 via Codex Pooler",
          "tool_call": true,
          "reasoning": true,
          "temperature": false,
          "attachment": true,
          "modalities": {
            "input": ["text", "image"],
            "output": ["text"]
          },
          "limit": {
            "context": 272000,
            "input": 228000,
            "output": 64000
          }
        }
      }
    }
  },
  "compaction": {
    "threshold_percent": 75
  }
}
```

For local setup, change `baseURL` to `http://localhost:4000/v1`.

`{env:CODEX_POOLER_API_KEY}` keeps the Pool API key outside the config file. Define only model ids your assigned Pool can serve. If you add Kilo permissions, use the object form such as `"permission": {"bash": "allow"}`; do not set `"permission": "ask"`, which is not a valid Kilo config shape.
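Because the config reads the key from the environment, it can help to confirm the variable is set before starting Kilo. A minimal sketch, assuming a POSIX-style shell; `require_pool_key` is a hypothetical helper name, not part of Kilo or Codex Pooler:

```bash
# Hypothetical helper: fail fast when the Pool API key is missing or empty
# in the current shell. It does not validate the key against the server.
require_pool_key() {
  if [ -z "${CODEX_POOLER_API_KEY:-}" ]; then
    echo "CODEX_POOLER_API_KEY is not set" >&2
    return 1
  fi
}
```

One way to use it is as a guard in front of the real command, for example `require_pool_key && kilo run ...`. The variable still has to be exported for Kilo to see it.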

Kilo uses OpenCode-style `limit.{context,input,output}` fields, but it also counts reasoning tokens in overflow accounting and supports `compaction.threshold_percent` for preflight compaction. With `limit.input: 228000`, 208k input tokens remain usable after the default 20k reserve; the optional 75% threshold asks Kilo to compact even earlier.
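The budget arithmetic can be checked with shell integer math. This is a sketch using the values from the config above. The text does not say which limit `threshold_percent` is measured against, so the two candidate bases below are assumptions shown only for comparison:

```bash
# Values copied from the example config.
context=272000
input=228000
reserve=20000            # default reserve stated in the text
threshold_percent=75

# Input tokens left after the reserve is subtracted.
usable_input=$((input - reserve))
echo "usable input: $usable_input"            # 208000

# ASSUMPTION: the base of the percentage is not documented here.
# Both candidates fall below the 208000 usable-input figure.
pct_of_context=$((context * threshold_percent / 100))
pct_of_input=$((input * threshold_percent / 100))
echo "75% of context: $pct_of_context"        # 204000
echo "75% of input:   $pct_of_input"          # 171000
```

Either reading is consistent with the statement that the threshold triggers compaction earlier than the reserve alone would.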

For GPT-5 OpenAI-compatible models, Kilo omits the max-token field from outgoing requests to avoid sending an incompatible `max_tokens`. `limit.output` still matters for local context math and the UI, even though it is not forwarded.

## Route shape

Kilo is a chat-completions client. It sends model requests to `POST /v1/chat/completions`, and Codex Pooler translates supported chat-completions requests into Codex Responses work internally. Do not point Kilo at `/backend-api/codex`, `/v1/responses`, or `/v1/chat/completions` as the configured base URL.
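The base URL rule above can be expressed as a small check. A minimal sketch, assuming the only valid shape is a URL ending in `/v1`; `check_base_url` is a hypothetical helper, not a Kilo or Codex Pooler command:

```bash
# Hypothetical helper: accept a base URL only when it ends in /v1, and
# print the chat-completions route that Kilo would then request.
check_base_url() {
  case "$1" in
    */chat/completions|*/responses|*/backend-api/codex)
      # These are endpoints or unsupported routes, not base URLs.
      echo "not a valid base URL: $1" >&2
      return 1
      ;;
    */v1)
      echo "$1/chat/completions"
      ;;
    *)
      echo "expected a URL ending in /v1: $1" >&2
      return 1
      ;;
  esac
}
```

For example, `check_base_url http://localhost:4000/v1` prints `http://localhost:4000/v1/chat/completions`, while passing the full endpoint returns a non-zero status.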

## Connection check

Run a tool-using prompt from an isolated directory:

```bash
mkdir -p /tmp/codex-pooler-kilo-check
cd /tmp/codex-pooler-kilo-check

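# Replace the placeholder with your Pool API key. The quotes keep the shell
# from treating < and > as redirections.
export CODEX_POOLER_API_KEY='<pool-api-key>'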
kilo run \
  --model codex-pooler/gpt-5.5 \
  --pure \
  --auto \
  --format json \
  --dir "$PWD" \
  'Use your tools to create kilo-ok.txt containing exactly: kilo ok. After the file exists, reply with exactly: kilo ok'
```

`--pure` keeps external plugins out of the check. `--auto` is only for trusted isolated automation, because it lets Kilo approve tool permissions automatically.
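The JSON output shows what the model replied, but the file on disk is the direct evidence that a tool call ran. A minimal follow-up sketch, assuming the check directory used above; `verify_kilo_check` is a hypothetical helper name:

```bash
# Hypothetical helper: confirm the file requested by the check prompt
# exists and holds the expected text.
verify_kilo_check() {
  local dir="${1:-/tmp/codex-pooler-kilo-check}"
  if [ ! -f "$dir/kilo-ok.txt" ]; then
    echo "kilo-ok.txt is missing" >&2
    return 1
  fi
  # Command substitution strips trailing newlines, so a file ending in
  # a newline still matches.
  if [ "$(cat "$dir/kilo-ok.txt")" != "kilo ok" ]; then
    echo "unexpected content in kilo-ok.txt" >&2
    return 1
  fi
  echo "tool call verified"
}
```

Run `verify_kilo_check` after the `kilo run` command finishes. A non-zero status means the file was not created or its content differs from `kilo ok`.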

## MCP boundary

Kilo model requests use Codex Pooler's narrow OpenAI-compatible `/v1` support for selected SDK routes. Codex Pooler does not provide full OpenAI API parity.

Codex Pooler model use does not require MCP. If you need operator metadata from `/mcp`, use a separate MCP-capable host and authenticate it with an operator-owned MCP token, not the Pool API key.