Know exactly where
your AI spend goes.

One command installs a local proxy that intercepts every request to Claude, OpenAI, Gemini, and Grok, including OAuth surfaces like ChatGPT and Gemini Code Assist. Every request attributed to a project. Every token costed to the penny. Stored locally. No cloud dependency. No code changes. No SDK swaps.

Install free · See how it works · uvx halton-meter
02 The problem

LLM spend is real. Price attribution isn't.

Providers give you one monthly total. No project breakdown. No per-developer view. No way to reconcile five dashboards into one number your finance team will sign off on. What's missing is LLM price attribution: every token priced to the penny and tied to the project that spent it.

01
$8,400
The invoice arrived

You can't say which project caused it. Which developer. Or whether it could've been $4,000. There's no drill-down. Just a total.

02
4 providers
No single source of truth

Claude, OpenAI, Gemini, Grok: each has its own dashboard. ChatGPT and Gemini Code Assist sit on OAuth surfaces nothing else captures. No unified view. No per-project breakdown. No shared number engineering and finance both trust.

03
1 invoice
Client asks for a breakdown

How does the AI line item break down by project? You can't show them. So they don't quite trust it, and next time they push back on the budget.

03 How it works

A local proxy. One SQLite file. All four providers.

Halton Meter intercepts HTTPS traffic to api.anthropic.com, api.openai.com, generativelanguage.googleapis.com, and api.x.ai via mitmproxy, plus OAuth surfaces at chatgpt.com and cloudcode-pa.googleapis.com that nothing else captures. Zero changes to your workflow. Every request is tagged to a project, costed from live provider pricing, and written to a local SQLite file. The dashboard reads over loopback. Nothing leaves your machine.
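
To make the interception step concrete, here is a minimal sketch of how a proxy might decide whether a host is metered. Only the hostnames come from the paragraph above. The function name and the provider labels are illustrative assumptions, not Halton Meter's actual code.

```python
from typing import Optional

# Hostnames are taken from the text; the labels on the right are hypothetical.
METERED_HOSTS = {
    "api.anthropic.com": "anthropic",
    "api.openai.com": "openai",
    "generativelanguage.googleapis.com": "google",
    "api.x.ai": "xai",
    "chatgpt.com": "openai-oauth",
    "cloudcode-pa.googleapis.com": "google-oauth",
}


def provider_for_host(host: str) -> Optional[str]:
    """Return a provider label for a metered host, or None to pass it through."""
    # Lower-case, drop any ":port" suffix, then drop a trailing root dot.
    name = host.lower().split(":", 1)[0].rstrip(".")
    return METERED_HOSTS.get(name)
```

An exact-match lookup keeps unrelated traffic untouched: anything that returns None would simply be forwarded without being logged.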

Install
One command. Proxy + cert + supervisor.

halton-meter init generates the mitmproxy CA cert, trusts it in the system keychain, and registers the launchd/systemd supervisor. halton-meter start brings the daemon up. A watchdog probes the MITM listener and auto-restarts on failure (capped 3/hour); if the daemon stays down, edge passthrough means your LLM clients keep working: requests fall through directly. Run halton-meter doctor for a full diagnostic.
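
The restart cap can be pictured as a small rolling-window budget. This is a sketch under assumptions: the class name is hypothetical, and whether the real watchdog counts a rolling hour or a fixed clock hour is not stated above.

```python
import time
from collections import deque
from typing import Optional


class RestartBudget:
    """Allow at most `limit` restarts inside any rolling `window` seconds."""

    def __init__(self, limit: int = 3, window: float = 3600.0) -> None:
        self.limit = limit
        self.window = window
        self._stamps = deque()  # monotonic timestamps of granted restarts

    def try_restart(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        # Forget restarts that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.limit:
            return False  # budget spent: stay down and let requests fall through
        self._stamps.append(now)
        return True
```

When try_restart returns False the watchdog would stop restarting, which is the point where passthrough matters: clients keep working, just unmetered.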

· Python 3.11+ · macOS · Linux · Windows (beta) · No telemetry
 uvx halton-meter

Prefer pipx? pipx install halton-meter still works.

Tag
Smart Attribution. Zero config.

The daemon walks a resolver chain and picks the first layer that answers. Git repo basename becomes the project tag. In containers you get docker:<hostname> or k8s:<hostname>. Sandboxed Mac apps tag as their bundle ID (mac:ai.perplexity.mac) instead of collapsing to a generic Data folder.
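
A resolver chain of this kind might look like the sketch below. The layer order, the function names, and the "unattributed" fallback are assumptions for illustration. HALTON_PROJECT is the override variable named elsewhere on this page, and KUBERNETES_SERVICE_HOST is the variable Kubernetes sets inside pods.

```python
from pathlib import Path
from typing import Callable, Optional

Resolver = Callable[[Path, dict], Optional[str]]


def from_env(cwd: Path, env: dict) -> Optional[str]:
    # An explicit override wins when it is set and non-empty.
    return env.get("HALTON_PROJECT") or None


def from_git(cwd: Path, env: dict) -> Optional[str]:
    # Walk upwards until a directory containing .git is found.
    for folder in (cwd, *cwd.parents):
        if (folder / ".git").exists():
            return folder.name  # repo basename becomes the tag
    return None


def from_container(cwd: Path, env: dict) -> Optional[str]:
    host = env.get("HOSTNAME")
    if host and "KUBERNETES_SERVICE_HOST" in env:
        return f"k8s:{host}"
    return None


CHAIN = [from_env, from_git, from_container]


def resolve_project(cwd: Path, env: dict, chain=CHAIN) -> str:
    for resolver in chain:
        tag = resolver(cwd, env)
        if tag:
            return tag  # first layer that answers wins
    return "unattributed"
```

The sketch takes the working directory and environment as arguments because the daemon would have to resolve them for the client process, not for itself.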

0 SDK swaps · 0 base URL changes · 0 instrumentation
Report
Run the report. See the number.

halton-meter report prints a per-project, per-model cost breakdown from your local SQLite file. Open the dashboard at localhost:3000 for charts and drilldowns. Every figure is computed locally. Not an estimate, not a sampling.

halton-meter report --since 7d --project clara-backend
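
For a sense of what a per-project, per-model rollup involves, the sketch below runs the aggregation against an in-memory SQLite database. The table name, column names, and sample rows are all hypothetical; the real schema is not documented on this page, and the --since filter is left out.

```python
import sqlite3

# Hypothetical schema; costs are stored as integer micro-dollars to avoid drift.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE requests (project TEXT, model TEXT, "
    "input_tokens INTEGER, output_tokens INTEGER, cost_micros INTEGER)"
)
conn.executemany(
    "INSERT INTO requests VALUES (?, ?, ?, ?, ?)",
    [
        ("clara-backend", "model-a", 1200, 800, 41_000),
        ("clara-backend", "model-a", 300, 150, 9_500),
        ("clara-backend", "model-b", 5000, 2500, 120_000),
        ("misc", "model-a", 100, 50, 3_000),
    ],
)


def report(conn, project=None):
    """Return (project, model, requests, cost in USD) rows, costliest first."""
    sql = (
        "SELECT project, model, COUNT(*), SUM(cost_micros) FROM requests "
        "WHERE (:p IS NULL OR project = :p) "
        "GROUP BY project, model ORDER BY SUM(cost_micros) DESC"
    )
    return [
        (proj, model, n, micros / 1_000_000)
        for proj, model, n, micros in conn.execute(sql, {"p": project})
    ]


for row in report(conn, project="clara-backend"):
    print(row)
```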
04 Pricing accuracy

How pricing accuracy works.

Every cost figure Halton Meter shows you was computed locally, from a dated pricing matrix that shipped inside the version you have installed. The cost path never touches the network. The math is reproducible offline today, next quarter, and two years from now, even if a provider republishes their pricing page tomorrow.

That guarantee rests on four things.

  • Bundled, dated rates.

    Each release of halton-meter carries a pricing matrix sourced from each provider's public pricing page: Anthropic, OpenAI, Google Gemini, and xAI, including Gemini's >200k tiered surcharge. The bundle is dated. You can read it.

  • Freshness without surveillance.

    halton-meter doctor quietly checks whether your bundle is behind the latest published manifest and tells you to upgrade if it is. The check is fail-open and skippable on airgapped installs. It never alters a cost number.

  • Negotiated rates, first-class.

    Customers with their own contracts can override any rate. Overrides survive upgrades and are labelled distinctly in every report.

  • Provenance on every row.

    Every logged request stores the exact rate source that priced it: bundled-2026-05-01 or override. A CSV pulled six months from now is self-describing. No "trust us" maths.
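
The four points above can be sketched in a few lines. Every rate here is a placeholder, not a real provider price, and the data layout and function name are assumptions. The sketch also assumes that a long-context tier reprices the whole request once the prompt crosses the threshold.

```python
from decimal import Decimal

# Placeholder rates in USD per million tokens. NOT real provider prices.
BUNDLE_ID = "bundled-2026-05-01"  # provenance label format quoted above
BUNDLED = {
    "example-model": {
        "input": Decimal("1.00"),
        "output": Decimal("4.00"),
        # Hypothetical long-context tier above 200k prompt tokens.
        "tier_threshold": 200_000,
        "tier_input": Decimal("2.00"),
        "tier_output": Decimal("6.00"),
    }
}


def price(model, input_tokens, output_tokens, overrides=None):
    """Return (cost_usd, rate_source) for one request, with no network access."""
    overrides = overrides or {}
    if model in overrides:
        rates, source = overrides[model], "override"
    else:
        rates, source = BUNDLED[model], BUNDLE_ID
    in_rate, out_rate = rates["input"], rates["output"]
    threshold = rates.get("tier_threshold")
    if threshold is not None and input_tokens > threshold:
        in_rate, out_rate = rates["tier_input"], rates["tier_output"]
    cost = (input_tokens * in_rate + output_tokens * out_rate) / Decimal(1_000_000)
    return cost, source
```

Returning the rate source next to the cost is what makes a stored row self-describing: the same inputs and the same bundle reproduce the same figure later.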

05 Product

Terminal report for the developer. Dashboard for everyone else.

halton-meter report prints a per-project, per-model breakdown in seconds. The local dashboard at localhost:3000 turns the same SQLite data into charts your CFO and clients can read. One data source. Two views. Nothing leaves the machine.

ATTRIBUTION

Per-project, per-client, per-developer.

Smart Attribution v0.3. Git repo, container, sandbox bundle ID: the daemon resolves the project tag automatically. HALTON_PROJECT overrides when you want to force one. Stop guessing what to charge.

clara-backend
1,568 requests · 343.8k in / 854.5k out
$107.20
misc
1,005 requests · 196.3k in / 488.0k out
$61.14
unattributed
525 requests · 119.4k in / 296.8k out
$37.37
OPTIMISATION
CLOUD · LIVE NOW

Recommendations that quote a number.

Halton Meter Cloud analyses your last 30 days of local data and tells you exactly where you're overspending, and what swapping models would save. It sits alongside reconciliation against provider billing and a full audit log: the daemon ships the raw data, and the cloud does the rollup.

47 Opus calls could have been Sonnet
halton-meter · ≤6k input tokens, no thinking required
−$31.20
per day
Enable prompt caching on system prompt
staffhub · 12.4k tokens repeated across 1,842 requests
−$18.40
per day
Route validation calls to Haiku
haltonlabs · 318 calls sub-1k tokens, no reasoning
−$8.80
per day
REPORTS
CLOUD · LIVE NOW

Client-ready cost reports.

Raw cost data shipping today: pull CSV or JSON from the local HTTP API at 127.0.0.1 (loopback). Pipe it anywhere.
Halton Meter Cloud, live at cloud.haltonmeter.com: branded PDF invoices, executive summaries, and per-developer cost reports, all on top of an audit log of every captured request. Attach it. Sign it. Send it.

Cost Report · April 2026
halton-meter
internal · 28 days · generated by halton-meter v0.5.0
Total
$120.01
Requests
2,378
Tokens
1.25M
COVERAGE

Anthropic, OpenAI, Gemini, xAI in production.

Plus OAuth surfaces (ChatGPT and Gemini Code Assist) that nothing else captures. 6 adapters across 4 providers. Each adapter is a single file under daemon/halton_meter/adapters/.

Anthropic
Production · v0.5.0
OpenAI
Beta adapter
Google
Beta adapter
xAI / Grok
Beta adapter
OAUTH
ChatGPT
OAuth · chatgpt.com
OAUTH
Code Assist
OAuth · cloudcode-pa
Adapters live in daemon/halton_meter/adapters/
06 Cloud

When one machine isn't enough.

The daemon captures and attributes cost locally. Cloud is the layer that rolls multiple daemons into one view: team-wide LLM cost attribution, reconciled against provider invoices, with branded PDF reports for clients. It is optional. The daemon keeps working whether you pair it or not.

  • Team rollups: every developer's daemon in one hosted workspace
  • Provider reconciliation: captured spend matched against actual invoices
  • Branded PDF reports: client-ready, generated server-side on demand
cloud.haltonmeter.com
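
Provider reconciliation, at its simplest, is a per-provider comparison of two totals. The sketch below is an illustration only: the function name, the tolerance, and the status labels are assumptions, and the hosted service's actual matching logic is not described on this page.

```python
from decimal import Decimal


def reconcile(captured, invoiced, tolerance=Decimal("0.01")):
    """Compare captured spend with invoiced totals, provider by provider."""
    rows = []
    for provider in sorted(set(captured) | set(invoiced)):
        got = captured.get(provider, Decimal("0"))
        billed = invoiced.get(provider, Decimal("0"))
        delta = billed - got  # positive: billed for spend that was not captured
        status = "match" if abs(delta) <= tolerance else "investigate"
        rows.append((provider, got, billed, delta, status))
    return rows
```

A provider that appears on an invoice but not in the captured data shows up with a captured total of zero, which is usually the first thing worth investigating.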
07 Compare

Every alternative requires a code change or a cloud dependency.

6 adapters across 4 providers. 0 SDK changes. 1 process on loopback. LiteLLM and Helicone need a base-URL swap in every codebase. Langfuse and OpenLLMetry need SDK instrumentation in every repo. Helicone fails closed: a service outage blocks your requests. Halton Meter intercepts at the network layer.

Capability | Halton Meter | LiteLLM | Langfuse | Helicone | OpenLLMetry
Local-first (no infrastructure) | Yes | No | No | No | Self-host option
Multi-provider observability | Yes | Yes | Yes | Yes | Yes
Project-level attribution | Yes | Tag-based | Yes | Yes | Trace-based
Zero code changes | Yes | No | No | No | No
Fail-open: never blocks requests | Yes | No | Yes | No | Yes
Client-ready cost reports | Yes | No | No | No | No
Intercepts OAuth surfaces | Yes | No | No | No | No
Stores full request + response bodies (redacted) | Yes | No | Full prompt + output, no built-in redaction | Full bodies stored by default; opt-out via header | Configurable; off by default in some SDKs

NOTE Where Halton Meter doesn't yet have something, we say "planned" rather than fake-ticking. Honesty is the product.

08 For teams
Live now · cloud.haltonmeter.com

Audit, reconciliation, and team rollups.

The local daemon captures every request on one developer's machine. The hosted cloud rolls those captures up across the team, and adds the two things the daemon cannot do alone: a full audit log of every request and policy event, and reconciliation of captured spend against the invoices your providers actually send. Both surfaces share one cost model; the cloud is the team-shaped roof on the local daemon.

TEAM
Cross-machine team visibility
Roll up spend across every developer's local daemon into one hosted dashboard.
AUDIT
Every request, logged
Every captured request and config change, queryable and exportable when a client asks.
RECONCILIATION
Match captured spend to provider billing
The cloud reconciles your captured spend against the actual invoices each provider sends.
ANALYSIS
Continuous optimisation
Automated recommendations from real usage patterns, not synthetic benchmarks.
TRENDING
Historical trending
Three-month, six-month, year-over-year cost analysis with project drilldown.
REPORTS
Custom client reports
Branded PDF reports on demand, with cost breakdown and methodology.
Get started
Start with Cloud.

Sign up directly at cloud.haltonmeter.com, or leave your email and we'll send you straight there.

Paid plan · cancel any time · daemon stays free


Runs on your machine. Stays on your machine.

Halton Meter runs as a local binary on your machine. The captured-request database (~/.halton-meter/db.sqlite) is a file on your disk; unless you opt into the hosted cloud tier, it is never replicated and never uploaded. The HTTP API binds 127.0.0.1 and refuses non-loopback connections. There is no analytics endpoint, no crash reporter, no version-check ping. The only outbound traffic the daemon makes is the LLM provider call you were already going to make. Verify the entire surface with lsof -nP -iTCP -sTCP:LISTEN and Little Snitch. A small dashboard ships alongside at localhost:3000: open source under Apache 2.0, free, an accessory rather than the product.
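
Refusing non-loopback connections can be expressed as a one-line address check. This is a sketch of the idea using Python's standard ipaddress module, not the daemon's actual code, and the function name is hypothetical.

```python
import ipaddress


def is_loopback_peer(peer_ip: str) -> bool:
    """True when a connecting client address is a loopback address."""
    try:
        return ipaddress.ip_address(peer_ip).is_loopback
    except ValueError:
        return False  # not a valid IP literal: treat as untrusted
```

Binding the listener to 127.0.0.1 already keeps remote hosts out; a check like this would be a second layer in case the bind address is ever misconfigured.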

v0.5.0
on PyPI now
17k
installs · all-time
SQLite
local · no cloud
0
outbound endpoints
09

From the workshop

Halton Meter is built by Halton Labs, a one-person studio that builds software for regulated industries and uses LLMs heavily across every project. We built this because we needed to show clients exactly what their AI work cost. The daemon is what we run; a small dashboard ships with it. The cloud tier (team aggregation, hosted dashboards, reconciliation against provider invoices) is live at cloud.haltonmeter.com.

We're sharing it because the problem is universal. If you're spending real money on AI and can't produce a per-project breakdown, Halton Meter is the tool. Install it in under a minute. If it doesn't work for you, the issue tracker is open.

vk · halton labs MAY '26 · v0.5.0
10 Questions

Common questions

What is LLM cost attribution?
LLM cost attribution ties every LLM API request, and its exact token cost, back to the project that spent it. Halton Meter observes outbound traffic at the proxy, prices each request against published provider rates, and attributes it to a project so you get a per-project LLM cost breakdown instead of one monthly total.
Does Halton Meter require code changes?
No. It runs as a transparent local proxy and captures outbound LLM API calls at the network layer. It never wraps your SDKs, so there are no code changes, no API keys to hand over, and no library to import. One command installs it: uvx halton-meter init --apps.
Which LLM providers does it capture?
Anthropic (Claude), OpenAI, Google Gemini, and xAI / Grok, including OAuth surfaces like ChatGPT and Gemini Code Assist that other tools miss. That unified, multi-provider view is the LLM observability the provider dashboards do not give you.
How does token cost tracking work?
Halton Meter reads the token counts on every captured request and response and prices them against published per-token provider rates, computing the cost to the penny. It is token cost tracking against the real rate card, not an estimate or a sampled average.
Is my data sent anywhere?
No. The daemon is local-first: it stores everything in a SQLite file on your disk (~/.halton-meter/db.sqlite) and its HTTP API binds 127.0.0.1 only. There is no analytics, no telemetry, and no version-check ping. Nothing leaves your machine unless you opt into the hosted cloud tier at cloud.haltonmeter.com.
What does the hosted cloud tier add?
The optional cloud tier adds team rollups, reconciliation against provider invoices, branded PDF cost reports, and a state-change audit log. It pairs with the local daemon over an opt-in encrypted sync. Pricing (USD): Solo $16/mo annual, Team $99/mo annual (10 seats included). The local daemon stays free.