Halton Meter’s primary store is a single SQLite database at
~/.halton-meter/db.sqlite, opened in WAL mode so the proxy hot path
(write-mostly) and the FastAPI app (read-only) can share it without
locking. The schema is owned by SQLAlchemy models under
daemon/halton_meter/storage/models/. This page is the on-disk
contract — what columns exist, what they mean, what conventions apply.
The five core tables
| Table | Owner model | What it holds |
|---|---|---|
projects | project.py::Project | One row per project slug |
requests | request.py::Request | One row per intercepted LLM call |
request_bodies | request_body.py::RequestBody | Optional captured request/response bodies |
project_settings | project_settings.py::ProjectSettings | Per-project overrides (e.g. body capture toggle) |
pricing_rates | pricing_rate.py::PricingRate | Provider × model × period rate cards |
Plus three operational tables: audit_events, policies, and
reconciliation_records — covered in their own contexts.
requests — the row that matters
Every metered call lands here. The columns, from
storage/models/request.py:
| Column | Type | Meaning |
|---|---|---|
id | UUID4 string | Primary key |
project_id | FK → projects.id | Result of the Smart Attribution chain |
provider | string | anthropic, openai, gemini, xai |
model | string | The model id the provider returned (e.g. claude-opus-4-7) |
mode | string | standard, streaming, embedding, … |
input_tokens | int | Provider-reported input usage |
output_tokens | int | Provider-reported output usage |
thinking_tokens | int | Anthropic extended-thinking budget consumed |
cache_read_tokens | int | Cache hit input tokens |
cache_write_tokens | int | Cache creation input tokens |
cost_usd_minor_units | bigint, nullable | Cost in millicents; NULL if model unknown to the rate card |
latency_ms | float, nullable | Wall-clock latency |
tokens_complete | bool | False for streaming rows where the response stream was truncated mid-flight |
requested_at | timestamptz | UTC timestamp; indexed |
cache_read_tokens and cache_write_tokens are intentionally separate
columns. A single cached_tokens column was renamed on 2026-04-30 once
cache_creation_input_tokens (the write side) became billable on a
different schedule from cache_read_input_tokens.
Money is integer millicents
cost_usd_minor_units stores millicents — 1 USD = 100_000 minor
units — as a BIGINT. Display-time conversion:
usd = cost_usd_minor_units / 100_000
The reasons are operational, not theoretical: REAL columns drift on
sums, and a half-cent rounding error compounded across a month of
captures will drive a reconciliation engineer mad. Storing integers and
dividing at display time keeps every aggregation exact.
projects — the slug join
id UUID4 primary key
slug normalised project slug (the human-readable name)
created_at first-sighting timestamp
The proxy hot path auto-creates a projects row on first sighting via
services.ingest._get_or_create_project. You never have to insert
manually; just attribute correctly (see Project
tagging) and the row appears.
request_bodies — opt-out body capture
When bodies.enabled = true (the default in config.toml), the daemon
captures request and response bodies into a sibling table — separately
from the metering row, so a body-capture failure cannot drop the cost
attribution. The schema:
id UUID4
request_id FK → requests.id
direction 'request' | 'response'
content_type application/json | text/event-stream | …
body BLOB (redacted, up to bodies.max_body_bytes)
captured_at timestamptz
Defaults from config.toml:
bodies.enabled = truebodies.max_body_bytes = 524288(512 KiB per body)bodies.retention_days = 90bodies.cache_ttl_seconds = 300
Redaction passes strip Bearer tokens, API keys, and other recognised secret shapes before the body is written. Per-category counts are visible via:
halton-meter bodies stats To turn off body capture for a single project (overriding the daemon-wide switch):
halton-meter project set <slug> body-capture off To purge old bodies manually:
halton-meter bodies purge --older-than 30d --vacuum pricing_rates — temporal rate cards
id UUID4
provider string
model string
mode string (standard, batch, streaming, …)
input_price_per_million integer (millicents per 1M tokens)
output_price_per_million integer
cache_read_price_per_million integer, nullable
cache_write_price_per_million integer, nullable
effective_from timestamptz
effective_to timestamptz, nullable (NULL = current)
The effective_from / effective_to half-open interval lets the rate
card change without rewriting history. halton-meter recompute-costs
walks every row in requests, looks up the active rate at
requested_at, and rewrites cost_usd_minor_units — useful after a
rate-card correction.
Migrations
The daemon runs schema migrations on first connect via
storage/repo.py. The _migrations table tracks applied marker rows.
If you see a backfill_legacy_requests_v1 row, that is the one-shot
migration that rewrote pre-v0.1.10 rows where project was a free-form
string into the project_id FK.
Reading the data
Every row is reachable via the local API (loopback only):
$ curl -sS http://127.0.0.1:8765/v1/stats
$ curl -sS "http://127.0.0.1:8765/v1/projects/<slug>/requests?limit=50"
$ curl -sS -OJ "http://127.0.0.1:8765/v1/projects/<slug>/requests.csv" Or directly with sqlite if you prefer:
sqlite3 ~/.halton-meter/db.sqlite '.schema requests' For teams that want longer history without changing the on-disk shape, Cloud applies the same schema with a 90-day retention window on top of the local store.
What’s next
- Local-only guarantee — what does and doesn’t leave the machine
halton-meter report— the responsive Rich-formatted query CLI on top of the schema