Skip to content

ADR-001: Observability Surfaces and Cost Attribution

DateStatusDecidersRelatedConfidence
2026-04-09AcceptedFizeau maintainersFEAT-005, FEAT-006, SD-001High

Context

AspectDescription
ProblemThe planning stack still treats OpenTelemetry as an optional add-on while the project needs one analytics model that can be compared against Claude Code and Codex without losing replay fidelity or corrupting cost data.
Current StateFizeau writes JSONL session logs for replay. Existing docs describe per-model pricing-table estimation and leave OTel undecided. Research across Claude Code, Codex, and OTel shows that JSONL-style transcripts are useful for replay, while OTel GenAI conventions are the best available interoperability layer for analytics.
RequirementsPreserve exact local replay, support cross-tool analytics, record cost without guessing, distinguish local runtimes such as bragi, vidar, and grendel, and expose enough timing data to compute throughput only when the provider supplies a valid timing window.

Decision

We will keep JSONL session logs as the replay artifact and adopt OpenTelemetry GenAI-aligned spans and metrics as the canonical analytics surface.

We will record provider- or gateway-reported cost when it is available. If no reported cost exists, we will use only explicit runtime-specific pricing for the exact provider system and resolved model. If neither exists, cost remains unknown. Fizeau will not guess cost from generic stale price tables.

We will use standard OTel GenAI fields wherever the semantic conventions cover the need, and a ddx.* namespace for gaps such as billed cost source, runtime-specific timing, and provider-system identity.

The detailed telemetry schema, formulas, and interoperability rules live in docs/helix/02-design/contracts/CONTRACT-001-otel-telemetry-capture.md.

Key Points: JSONL remains replay-first | OTel is analytics-first | known cost is preserved and unknown cost stays unknown

Alternatives

OptionProsConsEvaluation
Keep JSONL as the only observability surfaceLowest implementation complexity, preserves replay, no collector dependencyWeak interoperability, poor analytics ergonomics, no standard cross-tool schemaRejected: insufficient for the stated analytics goal
Copy Claude Code or Codex local storage formats directlyEasier superficial compatibility with one tool at a timeVendor internals are not a stable universal contract, locks Fizeau to foreign storage assumptions, still leaves cost/timing gapsRejected: wrong abstraction boundary
Use JSONL for replay and OTel for analytics, with project-namespaced extensions for gapsPreserves exact session reconstruction, aligns with the strongest emerging standard, supports one analytics tool across Fizeau, Claude, and CodexAdds telemetry implementation work, requires careful schema discipline, some fields such as cost still need custom attributesSelected: best fit for interoperability without sacrificing replay fidelity

Consequences

TypeImpact
PositiveReplay and analytics are both first-class and no longer compete for one storage shape.
PositiveFizeau can normalize analytics across Claude Code, Codex, and its own session logs using OTel-compatible ingestion.
PositiveVariable-price gateways such as OpenRouter can preserve actual billed cost instead of being recomputed later from unstable tables.
PositiveLocal runtimes such as bragi, vidar, and grendel become explicit analytics dimensions rather than disappearing behind a generic local-provider bucket.
NegativeTelemetry adds implementation complexity and collector configuration surface beyond the current JSONL logger.
NegativeSome required analytics fields, especially cost source and runtime-specific timing, still need a DDX-specific namespace until OTel standardizes them.
NeutralJSONL remains necessary even after OTel lands, because replay and forensic inspection require exact turn bodies and sequence order.

Risks

RiskProbImpactMitigation
OTel overhead is high enough to distort per-tool-call sessionsMMKeep export best-effort, benchmark overhead, and allow sampling/export config without removing local JSONL logging
Providers expose inconsistent timing breakdowns, making throughput incomparableHMDerive rates only from validated timing windows and leave missing rates unset
Teams misuse ddx.* fields and create schema driftMHKeep a single documented telemetry contract in FEAT-005 and SD-001, with explicit namespaced-field ownership
Session totals become ambiguous when some turns have known cost and others do notMMDefine totals as unknown if any contributing turn is unknown, and surface unknown-cost counts in usage reporting

Validation

Success MetricReview Trigger
PRD, feature specs, architecture, and solution design all describe the same split model: JSONL replay plus OTel analyticsAny governing artifact still framing OTel as optional or JSONL as the sole observability surface
Providers that report cost preserve that cost without recomputation driftReported billing differs from stored billing after ingestion
Unknown-cost sessions remain explicitly unknownAny code path silently fills unknown cost from a generic fallback table
Throughput analytics are emitted only when backed by matching timing windowsA dashboard displays input, output, or cache tok/s without source timing evidence
One analytics tool can ingest Fizeau telemetry alongside Claude/Codex adaptersCross-tool ingestion requires bespoke schema logic for Fizeau alone

Concern Impact

  • No concern impact: This ADR does not override the repo’s active Go/library concerns. It selects an observability strategy within existing concerns.

References

Review Checklist

  • Context names a specific problem
  • Decision statement is actionable
  • At least two alternatives were evaluated
  • Each alternative has concrete pros and cons
  • Selected option’s rationale explains why it wins
  • Consequences include positive and negative impacts
  • Negative consequences have mitigations
  • Risks are specific with probability and impact assessments
  • Validation section defines review triggers
  • Concern impact is complete
  • ADR is consistent with governing feature spec and PRD requirements