
Gemini

Two adapters cover Gemini: one for the public, API-key Google AI API and one for the OAuth-based Gemini Code Assist surface used by IDE plugins.

macOS 12+ · Python 3.11+ · Updated May 11, 2026

Google ships Gemini in two architecturally distinct ways, and Halton Meter has an adapter for each:

  • adapters/gemini.py — owns generativelanguage.googleapis.com. The public Google AI Studio / Generative Language API. API-key auth, REST endpoints under /v1*/models/<model>:<verb>.
  • adapters/gemini_code_assist.py — owns cloudcode-pa.googleapis.com. The Gemini Code Assist surface used by the JetBrains and VS Code Code Assist plugins. OAuth tokens, internal paths under /v1internal:<verb>.

Both adapters share name="gemini" so reports roll them up under one provider. The mode column distinguishes them at row level.

Public API — costed paths

generativelanguage.googleapis.com:

| Path | Purpose |
| --- | --- |
| /v1*/models/<model>:generateContent | Standard generation |
| /v1*/models/<model>:streamGenerateContent | Streaming variant |
| /v1*/models/<model>:embedContent | Single embedding |
| /v1*/models/<model>:batchEmbedContents | Batch embeddings |
| /v1*/models/<model>:countTokens | Token-count helper (zero-cost row, useful for visibility) |

/v1* covers /v1, /v1beta, /v1beta1, and /v1alpha. The API version is part of the URL path and is matched as a prefix.
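As a rough illustration of the prefix matching described above, the sketch below recognises the costed public-API paths with a regular expression. The names COSTED_VERBS, PATH_RE, and match_costed_path are hypothetical and chosen for this example; this is not Halton Meter's actual matcher.

```python
import re

# Verbs taken from the costed-paths table for the public API.
COSTED_VERBS = {
    "generateContent",
    "streamGenerateContent",
    "embedContent",
    "batchEmbedContents",
    "countTokens",
}

# Hypothetical pattern: a version segment starting with "v1" (so /v1, /v1beta,
# /v1beta1, /v1alpha all match), then /models/<model>:<verb>.
PATH_RE = re.compile(
    r"^/(?P<version>v1[a-z0-9]*)/models/(?P<model>[^/:]+):(?P<verb>[A-Za-z]+)$"
)


def match_costed_path(path: str):
    """Return (version, model, verb) for a costed path, or None otherwise."""
    path = path.split("?", 1)[0]  # ignore a query string such as ?key=...
    m = PATH_RE.match(path)
    if m is None or m["verb"] not in COSTED_VERBS:
        return None
    return m["version"], m["model"], m["verb"]
```

For example, match_costed_path("/v1beta/models/gemini-2.5-flash:generateContent") yields the version, model, and verb, while a /v2 path or an unlisted verb yields None.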

Code Assist — costed paths

cloudcode-pa.googleapis.com:

| Path | Purpose |
| --- | --- |
| /v1internal:generateContent | Standard Code Assist completion |
| /v1internal:streamGenerateContent | Streaming Code Assist completion |

This is a differentiator. LiteLLM, Helicone, Langfuse, and OpenLLMetry don’t capture Gemini Code Assist — it doesn’t go through their SDK shims. Halton Meter captures it because it intercepts at the network layer.

Captured fields

For both adapters:

  • provider = "gemini"
  • model: from the response’s usageMetadata.modelVersion
  • input_tokens: from usageMetadata.promptTokenCount
  • output_tokens: from usageMetadata.candidatesTokenCount
  • thinking_tokens: from usageMetadata.thoughtsTokenCount, when extended thinking is enabled
  • cost_usd_minor_units: computed against the active rate card

cache_read_tokens and cache_write_tokens are always recorded as zero. Google’s prompt-cache pricing model differs from Anthropic’s, and cached tokens aren’t yet split out.
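The field mapping above could be sketched roughly as follows, given a response body that has already been parsed from JSON. The function name extract_usage and the returned dictionary shape are assumptions made for illustration, not Halton Meter's internal API. One further assumption: Google's public API reference lists modelVersion as a top-level response field, so the sketch falls back to that location if it is not found under usageMetadata. Cost calculation is left out because it depends on the rate card.

```python
def extract_usage(body: dict) -> dict:
    """Map a parsed Gemini response body onto the captured fields (sketch)."""
    usage = body.get("usageMetadata") or {}
    return {
        "provider": "gemini",
        # Assumption: check usageMetadata first, then the top-level field.
        "model": usage.get("modelVersion") or body.get("modelVersion"),
        "input_tokens": int(usage.get("promptTokenCount", 0)),
        "output_tokens": int(usage.get("candidatesTokenCount", 0)),
        # Present only when thinking is enabled; default to zero otherwise.
        "thinking_tokens": int(usage.get("thoughtsTokenCount", 0)),
        # Cache tokens are not split out for Gemini, so both stay at zero.
        "cache_read_tokens": 0,
        "cache_write_tokens": 0,
    }
```

Missing counters default to zero, so a response without thoughtsTokenCount still produces a complete row.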

Tools that route through these adapters

| Tool | Adapter | Auth |
| --- | --- | --- |
| Google AI Studio Python SDK (google-genai) | gemini.py | API key |
| curl against generativelanguage.googleapis.com | gemini.py | API key |
| Gemini Code Assist (JetBrains, VS Code) | gemini_code_assist.py | OAuth |

Verifying public-API capture

$ halton-meter run -- curl -sS \
  -H "Content-Type: application/json" \
  "https://generativelanguage.googleapis.com/v1beta/models/gemini-2.5-flash:generateContent?key=$GEMINI_API_KEY" \
  -d '{"contents":[{"parts":[{"text":"hi"}]}]}'

$ halton-meter report --since 5m --by model

For Code Assist capture, run Code Assist normally after init --apps — the OAuth handshake against cloudcode-pa.googleapis.com is captured by the Code Assist adapter without further setup.

Error classification

Gemini reports errors as gRPC status codes mapped onto HTTP. Both adapters classify them into the seven canonical buckets — see Error classification. Shipped in v0.3.0.

| gRPC status | HTTP | error_class | retryable |
| --- | --- | --- | --- |
| INVALID_ARGUMENT | 400 | bad_request | false |
| UNAUTHENTICATED | 401 | auth | false |
| PERMISSION_DENIED | 403 | auth | false |
| NOT_FOUND | 404 | bad_request | false |
| FAILED_PRECONDITION | 400 | bad_request | false |
| RESOURCE_EXHAUSTED | 429 | rate_limit | true |
| DEADLINE_EXCEEDED | 504 | timeout | true |
| ABORTED | 409 | server_error | true |
| INTERNAL | 500 | server_error | true |
| UNAVAILABLE | 503 | server_error | true |

Host matching

generativelanguage.googleapis.com and cloudcode-pa.googleapis.com are matched exactly. Other Google subdomains (aiplatform.googleapis.com, cloudaicompanion.googleapis.com) are not in scope today.