SQLite schema

What lands on disk for every captured request — the requests table, the projects join, body capture storage, pricing rates, and the millicent integer money convention.

Halton Meter’s primary store is a single SQLite database at ~/.halton-meter/db.sqlite, opened in WAL mode so the proxy hot path (write-mostly) and the FastAPI app (read-only) can share it without locking. The schema is owned by SQLAlchemy models under daemon/halton_meter/storage/models/. This page is the on-disk contract — what columns exist, what they mean, what conventions apply.

The five core tables

Table	Owner model	What it holds
`projects`	`project.py::Project`	One row per project slug
`requests`	`request.py::Request`	One row per intercepted LLM call
`request_bodies`	`request_body.py::RequestBody`	Optional captured request/response bodies
`project_settings`	`project_settings.py::ProjectSettings`	Per-project overrides (e.g. body capture toggle)
`pricing_rates`	`pricing_rate.py::PricingRate`	Provider × model × period rate cards

Plus three operational tables: audit_events, policies, and reconciliation_records — covered in their own contexts.

`requests` — the row that matters

Every metered call lands here. The columns, from storage/models/request.py:

Column	Type	Meaning
`id`	UUID4 string	Primary key
`project_id`	FK → `projects.id`	Result of the Smart Attribution chain
`provider`	string	`anthropic`, `openai`, `gemini`, `xai`
`model`	string	The model id the provider returned (e.g. `claude-opus-4-7`)
`mode`	string	`standard`, `streaming`, `embedding`, …
`input_tokens`	int	Provider-reported input usage
`output_tokens`	int	Provider-reported output usage
`thinking_tokens`	int	Anthropic extended-thinking budget consumed
`cache_read_tokens`	int	Cache hit input tokens
`cache_write_tokens`	int	Cache creation input tokens
`cost_usd_minor_units`	bigint, nullable	Cost in millicents; `NULL` if model unknown to the rate card
`latency_ms`	float, nullable	Wall-clock latency
`tokens_complete`	bool	False for streaming rows where the response stream was truncated mid-flight
`requested_at`	timestamptz	UTC timestamp; indexed

cache_read_tokens and cache_write_tokens are intentionally separate columns. A single cached_tokens column was renamed on 2026-04-30 once cache_creation_input_tokens (the write side) became billable on a different schedule from cache_read_input_tokens.

Money is integer millicents

cost_usd_minor_units stores millicents — 1 USD = 100_000 minor units — as a BIGINT. Display-time conversion:

usd = cost_usd_minor_units / 100_000

The reasons are operational, not theoretical: REAL columns drift on sums, and a half-cent rounding error compounded across a month of captures will drive a reconciliation engineer mad. Storing integers and dividing at display time keeps every aggregation exact.

`projects` — the slug join

id           UUID4 primary key
slug         normalised project slug (the human-readable name)
created_at   first-sighting timestamp

The proxy hot path auto-creates a projects row on first sighting via services.ingest._get_or_create_project. You never have to insert manually; just attribute correctly (see Project tagging) and the row appears.

`request_bodies` — opt-out body capture

When bodies.enabled = true (the default in config.toml), the daemon captures request and response bodies into a sibling table — separately from the metering row, so a body-capture failure cannot drop the cost attribution. The schema:

id            UUID4
request_id    FK → requests.id
direction     'request' | 'response'
content_type  application/json | text/event-stream | …
body          BLOB (redacted, up to bodies.max_body_bytes)
captured_at   timestamptz

Defaults from config.toml:

bodies.enabled = true
bodies.max_body_bytes = 524288 (512 KiB per body)
bodies.retention_days = 90
bodies.cache_ttl_seconds = 300

Redaction passes strip Bearer tokens, API keys, and other recognised secret shapes before the body is written. Per-category counts are visible via:

halton-meter bodies stats

To turn off body capture for a single project (overriding the daemon-wide switch):

halton-meter project set <slug> body-capture off

To purge old bodies manually:

halton-meter bodies purge --older-than 30d --vacuum

`pricing_rates` — temporal rate cards

id           UUID4
provider     string
model        string
mode         string (standard, batch, streaming, …)
input_price_per_million   integer (millicents per 1M tokens)
output_price_per_million  integer
cache_read_price_per_million   integer, nullable
cache_write_price_per_million  integer, nullable
effective_from   timestamptz
effective_to     timestamptz, nullable (NULL = current)

The effective_from / effective_to half-open interval lets the rate card change without rewriting history. halton-meter recompute-costs walks every row in requests, looks up the active rate at requested_at, and rewrites cost_usd_minor_units — useful after a rate-card correction.

Migrations

The daemon runs schema migrations on first connect via storage/repo.py. The _migrations table tracks applied marker rows. If you see a backfill_legacy_requests_v1 row, that is the one-shot migration that rewrote pre-v0.1.10 rows where project was a free-form string into the project_id FK.

Reading the data

Every row is reachable via the local API (loopback only):

~ — read the captures

$ curl -sS http://127.0.0.1:8765/v1/stats
$ curl -sS "http://127.0.0.1:8765/v1/projects/<slug>/requests?limit=50"
$ curl -sS -OJ "http://127.0.0.1:8765/v1/projects/<slug>/requests.csv"

Or directly with sqlite if you prefer:

sqlite3 ~/.halton-meter/db.sqlite '.schema requests'

For teams that want longer history without changing the on-disk shape, Cloud applies the same schema with a 90-day retention window on top of the local store.

What’s next

Local-only guarantee — what does and doesn’t leave the machine
halton-meter report — the responsive Rich-formatted query CLI on top of the schema