Demos
Terminal recordings of Fizeau. One reel runs inside a Docker container
with `llama-server` and a 0.5B Qwen coder model: no GPU, no API key, no
internet. Three more were captured against `qwen/qwen3.6-27b` via
OpenRouter and are kept here for comparison. The final three reels, covering
`usage`, `update --check-only`, and a peek at the on-disk JSONL session log,
demonstrate Fizeau-specific affordances (cost attribution, atomic
self-update, and structured observability) and require no LLM at all.
About time compression. Slow operations (model downloads, model loads, long LLM responses) are fast-forwarded during playback so the reels stay watchable. When this happens, a dimmed banner appears, such as `⏩ Fast-forward: model load (47.2s → 2.0s)`, and the cast title is suffixed with `[time-compressed]`. Everything else plays at wall-clock speed. The threshold (default: any LLM turn over 8 seconds) is set via `FIZEAU_LATENCY_THRESH` in `make demos-regen`.
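The compression rule described above can be sketched in a few lines. This is only an illustration, not the code in `demos/regen.py`: the function name `compress_gaps`, the `(timestamp, text)` event shape, the fixed 2-second replay length, and the exact banner wording are all assumptions made for the example.

```python
from typing import List, Tuple

Event = Tuple[float, str]  # (seconds since start, terminal output)


def compress_gaps(events: List[Event], threshold_s: float = 8.0,
                  replay_s: float = 2.0) -> Tuple[List[Event], bool]:
    """Squeeze every inter-event gap longer than threshold_s down to replay_s.

    Returns the re-timed events plus a flag telling the caller whether any
    gap was compressed (e.g. to suffix the cast title).
    """
    out: List[Event] = []
    compressed = False
    prev_t = 0.0  # original timestamp of the previous event
    new_t = 0.0   # clock of the re-timed recording
    for t, text in events:
        gap = t - prev_t
        if gap > threshold_s:
            # Hypothetical banner text, modelled on the example quoted above.
            banner = f"⏩ Fast-forward: slow step ({gap:.1f}s → {replay_s:.1f}s)\r\n"
            out.append((round(new_t, 3), banner))
            gap = replay_s
            compressed = True
        new_t += gap
        out.append((round(new_t, 3), text))
        prev_t = t
    return out, compressed
```

Gaps at or below the threshold pass through unchanged, so everything that was not slow still plays at its recorded speed.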
Cost cap halts the loop mid-run
`fiz run --cost-cap-usd 0.005 -p '<task>'` checks each iteration’s running
cost against the cap and refuses to issue the next `llm.request` once the
projected cost would exceed it. The run ends with status `budget_halted` and exit code 2.
Demoed against a tiny scratch repo with a per-file editing task that
naturally costs more than $0.005 on Qwen3.6-27B.
Origin: OpenRouter (qwen/qwen3.6-27b). Captured 2026-05-10. Real spend: $0.0035 of $0.005 cap.
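The halting rule can be illustrated with a small loop. This is a sketch under stated assumptions, not Fizeau's implementation: how the projected cost is actually computed is not described here, so the example simply adds a per-iteration estimate to the running total. `run_with_cap`, `RunResult`, and the `"success"` status are hypothetical names; only `budget_halted` and exit code 2 come from the text above.

```python
from dataclasses import dataclass
from typing import Iterable

EXIT_OK = 0
EXIT_BUDGET_HALTED = 2  # exit code named in the text above


@dataclass
class RunResult:
    status: str
    exit_code: int
    spent_usd: float
    iterations: int


def run_with_cap(estimates_usd: Iterable[float], cap_usd: float) -> RunResult:
    """Stop before any iteration whose projected cost would exceed the cap."""
    spent = 0.0
    done = 0
    for estimate in estimates_usd:
        # Assumption: projected cost = running cost + estimate for the next request.
        if spent + estimate > cap_usd:
            return RunResult("budget_halted", EXIT_BUDGET_HALTED, spent, done)
        # A real loop would issue the request here and add its actual cost.
        spent += estimate
        done += 1
    return RunResult("success", EXIT_OK, spent, done)
```

Because the check runs before the request is issued, the recorded spend stays below the cap rather than overshooting it, which matches the "real spend" figure being lower than the cap.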
Quickstart — install fiz, run a query, no GPU
The literal end-to-end “getting started” flow: install the binary,
download a 390 MB GGUF model, start llama-server, and run your first
prompt. Captured in the CPU-only Docker image (demos/docker/Dockerfile.cpu).
Origin: Docker / local llama-server. Model: Qwen2.5-Coder-0.5B-Instruct (Q4_K_M), 390 MB on disk, ~900 MB RSS at first boot. Runs on a 2-core / 2 GB CI runner. See `demos/docker/` for the Dockerfile.
Read a file and explain it
The model reads `main.go` using the `read` tool and describes the program.
Origin: OpenRouter / `qwen/qwen3.6-27b`.
Edit a config file
The model reads a config file, edits the port number, and verifies the change.
Origin: OpenRouter / `qwen/qwen3.6-27b`.
Explore project structure
The model uses `bash` to find all Go files and summarizes the package layout.
Origin: OpenRouter / `qwen/qwen3.6-27b`.
Cost attribution — known vs unknown
`fiz usage` rolls up every session JSONL in your history and prints
per-(provider, model) totals. Where the catalog has a price for the
model, the COST column is exact; where it doesn’t (a self-hosted
vLLM deployment, a model with no published rate), the column reads
`unknown` rather than guessing. Operators can see at a glance which
slice of their spend Fizeau can attribute and which it cannot. This is
the “never guess” policy from the cost-attribution spec made tangible.
Origin: local (no LLM call). Reads existing `~/.fizeau/sessions/*.jsonl`.
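A minimal sketch of this kind of rollup, to make the known-versus-unknown split concrete. It is not `fiz usage` itself: the event field names (`provider`, `model`, `input_tokens`, `output_tokens`), the `roll_up` function, and the price catalog shape (USD per million input/output tokens) are all assumptions for illustration.

```python
import json
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical catalog: (provider, model) -> USD per million (input, output) tokens.
Catalog = Dict[Tuple[str, str], Tuple[float, float]]


def roll_up(lines: Iterable[str], catalog: Catalog) -> List[tuple]:
    """Sum tokens per (provider, model); price only what the catalog knows."""
    totals = defaultdict(lambda: [0, 0])
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        if "model" not in event:
            continue  # skip events that carry no model usage
        key = (event.get("provider", "?"), event["model"])
        totals[key][0] += event.get("input_tokens", 0)
        totals[key][1] += event.get("output_tokens", 0)

    rows = []
    for key in sorted(totals):
        tokens_in, tokens_out = totals[key]
        price = catalog.get(key)
        if price is None:
            cost = "unknown"  # never guess a rate that is not in the catalog
        else:
            usd = (tokens_in * price[0] + tokens_out * price[1]) / 1_000_000
            cost = f"${usd:.4f}"
        rows.append((key[0], key[1], tokens_in, tokens_out, cost))
    return rows
```

Keeping `unknown` as a distinct value, rather than treating a missing price as zero, is what stops an unpriced model from silently understating the total.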
Self-update check
`fiz` ships as a single static binary and updates itself in place; the
`--check-only` flag does the version comparison without downloading or
swapping anything. Exit code 1 means “outdated”, 0 means “current”. A
shell script can wrap this check in a daily cron job, or you can drop the flag
to perform the actual atomic in-place upgrade.
Origin: local (single GET to GitHub releases API, no LLM call).
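As an illustration of wrapping the check, here is a small sketch that maps the exit code to a status string. The command line `fiz update --check-only` is inferred from the reel names above, and treating any exit code other than 0 or 1 as `"error"` is an assumption; only the meanings of 0 and 1 come from the text.

```python
import subprocess
from typing import Sequence


def interpret(returncode: int) -> str:
    """Translate the documented exit codes into a status string."""
    if returncode == 0:
        return "current"
    if returncode == 1:
        return "outdated"
    return "error"  # assumption: other exit codes are not documented here


def check_for_update(cmd: Sequence[str] = ("fiz", "update", "--check-only")) -> str:
    """Run the check and report 'current', 'outdated', or 'error'."""
    try:
        proc = subprocess.run(list(cmd), capture_output=True, text=True, timeout=30)
    except (FileNotFoundError, subprocess.TimeoutExpired):
        # Binary missing from PATH, or the check did not finish in time.
        return "error"
    return interpret(proc.returncode)
```

A scheduled job could call `check_for_update()` and send a notification only when the result is `"outdated"`.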
Structured session log on disk
Every `fiz` invocation appends a line-delimited JSON event log to
`~/.fizeau/sessions/<session-id>.jsonl`. The file is the source of
truth for `fiz replay`, `fiz usage`, and downstream observability.
A short `jq` projection over the first three events shows the
per-turn token counts, latency, and model identifier. Every figure
on the website’s benchmark pages comes from rolling up these files.
Origin: local (reads `demos/sessions/file-read.jsonl`, the same JSONL that powers the “Read a file and explain it” reel above).
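The same kind of projection can be sketched in Python. The field names below (`model`, `input_tokens`, `output_tokens`, `latency_ms`) are assumptions chosen for illustration, not the documented schema, and `project` is a hypothetical helper.

```python
import json
from itertools import islice
from typing import Any, Dict, Iterable, Iterator, Sequence

# Assumed field names; check a real session file for the actual keys.
FIELDS = ("model", "input_tokens", "output_tokens", "latency_ms")


def project(lines: Iterable[str], limit: int = 3,
            fields: Sequence[str] = FIELDS) -> Iterator[Dict[str, Any]]:
    """Yield only the selected fields from the first `limit` events."""
    events = (json.loads(line) for line in lines if line.strip())
    for event in islice(events, limit):
        # Missing keys come back as None instead of raising.
        yield {name: event.get(name) for name in fields}
```

To try it on a session file, open the file and pass the handle in: `for row in project(open("demos/sessions/file-read.jsonl")): print(row)`. Because the input is read lazily, only the first three events are parsed.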
How these are produced
| Reel | Capture path | Backend | Model |
|---|---|---|---|
| quickstart | `make demos-capture-docker` | local llama-server | Qwen2.5-Coder-0.5B-Instruct (Q4_K_M) |
| file-read | `make demos-capture` (OpenRouter) | OpenRouter API | qwen/qwen3.6-27b |
| file-edit | `make demos-capture` (OpenRouter) | OpenRouter API | qwen/qwen3.6-27b |
| bash-explore | `make demos-capture` (OpenRouter) | OpenRouter API | qwen/qwen3.6-27b |
| fiz-usage | `./demos/capture-subcommands.sh` | local | n/a (reads existing session logs) |
| fiz-update-check | `./demos/capture-subcommands.sh` | local | n/a (single GitHub releases GET) |
| fiz-jsonl | `./demos/capture-subcommands.sh` | local | n/a (reads `demos/sessions/`) |
The first four reels render to asciicast v2 via `make demos-regen` from
canonical session JSONLs in `demos/sessions/`.
Rendering is deterministic and never makes a live LLM call. The
time-compression banner is implemented in `demos/regen.py`.
The three subcommand reels (fiz-usage, fiz-update-check, fiz-jsonl)
have no agent loop, so they bypass `regen.py` and emit asciicast v2
directly via `demos/scripts/build-subcommand-cast.py`.
Each step’s stdout is captured verbatim from a real `fiz` invocation,
with nothing fabricated, and replayed with realistic typing and pause delays.
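For readers unfamiliar with the format, the sketch below shows roughly what emitting asciicast v2 from captured output involves: a JSON header line followed by one `[time, "o", data]` event per line. It is not the contents of `build-subcommand-cast.py`; `build_cast`, the fixed per-key delay, and the pause length are assumptions for illustration.

```python
import json
from typing import List, Tuple


def build_cast(steps: List[Tuple[str, str]], title: str = "",
               width: int = 80, height: int = 24,
               key_delay: float = 0.05, pause: float = 0.8) -> str:
    """Render (command, captured stdout) pairs as an asciicast v2 document."""
    header = {"version": 2, "width": width, "height": height}
    if title:
        header["title"] = title
    out = [json.dumps(header)]
    t = 0.0

    def emit(data: str) -> None:
        # Each event is [seconds since start, "o" for output, text].
        out.append(json.dumps([round(t, 3), "o", data], ensure_ascii=False))

    for command, stdout in steps:
        emit("$ ")
        for ch in command:          # simulate typing one key at a time
            t += key_delay
            emit(ch)
        t += pause                  # brief pause before "pressing Enter"
        # Terminals expect CRLF; assumes the captured text uses bare "\n".
        emit("\r\n" + stdout.replace("\n", "\r\n"))
        t += pause
    return "\n".join(out) + "\n"
```

For example, `build_cast([("fiz usage", captured_text)], title="fiz-usage")` returns a string that can be written to a `.cast` file, where `captured_text` is stdout saved from a real run.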