Providers · 02 · macOS only

OpenAI

Two adapters — api.openai.com for API-key traffic, and chatgpt.com for OAuth Codex sessions. Halton Meter is the only proxy that meters both.

macOS 12+ · Python 3.11+ · Reading time 2 min · Updated May 11, 2026

Halton Meter has two OpenAI adapters because OpenAI has two distinct call surfaces — and they don’t share auth, paths, or wire shapes:

  • adapters/openai.py — owns api.openai.com. The classic Bearer-token API surface used by the OpenAI Python and Node SDKs, Cursor’s GPT integration, and any direct HTTP client.
  • adapters/openai_codex.py — owns chatgpt.com. The OAuth surface Codex (the ChatGPT-account version) routes through. Different paths, different auth, different parsing — same provider = "openai" in the row so reports aggregate cleanly.

Both adapters share name="openai" so a report --by provider rolls them up. The mode column distinguishes them at row level.

api.openai.com — costed paths

Path                 | Modes
/v1/chat/completions | Standard, streaming
/v1/responses        | The current-generation Responses API
/v1/embeddings       | Embeddings (input tokens only; cost = input × rate)

Paths other than /v1/chat/completions, /v1/responses, and /v1/embeddings are observed but not metered. Examples: /v1/models, /v1/files, control-plane endpoints.
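The metered/observed split and the embeddings formula can be sketched in a few lines. This is an illustration only, not the adapter's code: the names `COSTED_PATHS`, `is_costed`, and `embedding_cost_minor_units` are hypothetical, exact path matching is assumed, and the rate is assumed to be expressed in minor units per million tokens (the real rate-card unit is not stated here).

```python
# Hypothetical sketch: which api.openai.com paths get a cost, and how an
# embeddings call would be priced under an assumed per-million-token rate.
COSTED_PATHS = frozenset({
    "/v1/chat/completions",
    "/v1/responses",
    "/v1/embeddings",
})


def is_costed(path: str) -> bool:
    """Return True when the request path is one of the three metered endpoints."""
    # Drop any query string before comparing (exact match is an assumption).
    return path.split("?", 1)[0] in COSTED_PATHS


def embedding_cost_minor_units(input_tokens: int, rate_per_million: int) -> int:
    """Embeddings bill input tokens only: cost = input x rate."""
    # rate_per_million is an assumed unit: minor units per 1,000,000 tokens.
    return round(input_tokens * rate_per_million / 1_000_000)
```

Under these assumptions, `is_costed("/v1/models")` is false, so the call would be observed without a cost being computed.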

chatgpt.com — Codex OAuth surface

When Codex runs under a ChatGPT account, it uses chatgpt.com-prefixed endpoints with OAuth tokens, not API keys. The Codex adapter is a sibling of the OpenAI one because the two surfaces share neither auth shape nor URL space; broadening the main adapter would have made both fragile. The Codex adapter maps each captured call to the same requests row schema as the API adapter, so reports treat them uniformly.

This is a Halton Meter differentiator. LiteLLM, Helicone, Langfuse, and OpenLLMetry capture API-key traffic only; Codex via ChatGPT auth is invisible to all of them. Halton Meter captures it because it intercepts at the network layer, not the SDK.

Captured fields

For both adapters:

  • provider = "openai"
  • model — from the response
  • input_tokens, output_tokens — from the usage block
  • cache_read_tokens — when usage.prompt_tokens_details.cached_tokens is present
  • cost_usd_minor_units — against the active rate card

thinking_tokens and cache_write_tokens are zero for OpenAI; those columns are Anthropic-shaped.
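As a rough illustration of the field mapping above, the sketch below builds a row from a parsed JSON response body. It is not the adapter's implementation: `row_from_response` is a hypothetical helper, cost calculation is omitted, and it assumes the Chat Completions usage keys (`prompt_tokens`, `completion_tokens`) with a fallback to the Responses API keys (`input_tokens`, `output_tokens`, `input_tokens_details`).

```python
from typing import Any


def row_from_response(body: dict[str, Any]) -> dict[str, Any]:
    """Map a parsed OpenAI response body to the captured fields (hypothetical helper)."""
    usage = body.get("usage") or {}
    # Cached-token details live under prompt_tokens_details (Chat Completions)
    # or input_tokens_details (Responses API); either may be absent.
    details = (
        usage.get("prompt_tokens_details")
        or usage.get("input_tokens_details")
        or {}
    )
    return {
        "provider": "openai",
        "model": body.get("model"),
        "input_tokens": usage.get("prompt_tokens", usage.get("input_tokens", 0)),
        "output_tokens": usage.get("completion_tokens", usage.get("output_tokens", 0)),
        "cache_read_tokens": details.get("cached_tokens", 0),
        # These columns are Anthropic-shaped and stay zero for OpenAI.
        "thinking_tokens": 0,
        "cache_write_tokens": 0,
    }
```

An embeddings response carries no completion count, so `output_tokens` falls back to zero in this sketch.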

Streaming

/v1/chat/completions and /v1/responses both support stream=true, which emits text/event-stream chunks. The adapter buffers, parses the final usage chunk, and writes the row. Partial streams write tokens_complete = false.

Tools that route through these adapters

Tool                            | Adapter         | Path
OpenAI Python / Node SDK        | openai.py       | api.openai.com via certifi or NODE_EXTRA_CA_CERTS
Cursor with GPT back-end        | openai.py       | Same
Codex (ChatGPT account)         | openai_codex.py | chatgpt.com via Node trust
curl https://api.openai.com/... | openai.py       | system keychain or CURL_CA_BUNDLE

Verifying API-key capture

# Verify api.openai.com capture
$ halton-meter run -- curl -sS https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model":"gpt-4.1-mini","messages":[...]}'

$ halton-meter report --since 5m --by model

For Codex / ChatGPT capture, run Codex normally after init --apps; the OAuth flow happens against chatgpt.com and is captured by the Codex adapter without further setup.

Error classification

Both adapters classify OpenAI errors into the seven canonical buckets — see Error classification. Shipped in v0.3.0.

HTTP | Provider error.type / code             | error_class  | retryable
400  | invalid_request_error                  | bad_request  | false
401  | authentication_error                   | auth         | false
403  | permission_denied / country-blocked    | auth         | false
404  | NotFoundError / model_not_found        | bad_request  | false
408  | APITimeoutError                        | timeout      | true
409  | ConflictError                          | server_error | true
422  | UnprocessableEntityError               | bad_request  | false
429  | rate_limit_error (RPM / TPM throttle)  | rate_limit   | true
429  | insufficient_quota (billing exhausted) | auth         | false
500  | APIError / InternalServerError         | server_error | true
502  | bad gateway                            | server_error | true
503  | overloaded / slow_down                 | server_error | true
-    | APIConnectionError                     | network      | true

The two HTTP 429 rows are the key distinction: a rate_limit_error is a throttle (back off, retry), but insufficient_quota is exhausted billing (auth, not retryable). See the judgement-call note on the concept page.

Host matching

api.openai.com and chatgpt.com are matched by exact equality (with optional :port). Subdomains do not match.