Browser Analytical Workbench
Benchmark Explorer
Summary
What Was Collected
The explorer loads the valid benchmark cell dataset directly in the browser with DuckDB-WASM. A cell is one result-bearing run: an explicit timeout, a completed graded pass, or a completed graded failure. Rows invalidated by setup, auth, or provider errors are excluded from the Parquet table but are still counted in the manifest.
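The cell definition above can be sketched as a small filter. This is only an illustration, not the explorer's actual pipeline: the status labels, the `status` field name, and the sample rows are all hypothetical, since the page does not show the dataset's real enum values or schema.

```python
from collections import Counter

# Hypothetical status labels; the dataset's real enum values are not shown here.
CELL_STATUSES = {"timeout", "pass", "fail"}
EXCLUDED_STATUSES = {"setup_invalid", "auth_invalid", "provider_invalid"}


def split_cells(rows):
    """Separate result-bearing cells from excluded rows.

    Returns (cells, manifest): cells keeps the rows that would go into the
    table, and manifest counts the excluded rows by status.
    """
    cells = []
    manifest = Counter()
    for row in rows:
        status = row["status"]
        if status in CELL_STATUSES:
            cells.append(row)
        elif status in EXCLUDED_STATUSES:
            manifest[status] += 1
        else:
            # Fail loudly rather than silently dropping an unexpected status.
            raise ValueError(f"unknown status: {status!r}")
    return cells, dict(manifest)


# Made-up rows for illustration only.
rows = [
    {"task": "t1", "status": "pass"},
    {"task": "t2", "status": "fail"},
    {"task": "t3", "status": "timeout"},
    {"task": "t4", "status": "auth_invalid"},
    {"task": "t5", "status": "setup_invalid"},
    {"task": "t6", "status": "auth_invalid"},
]
cells, manifest = split_cells(rows)
```

Keeping the excluded rows in a separate count, instead of discarding them, means the number of invalid runs stays auditable even though those rows never reach the table.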
Metrics: Rows, Pass rate, Timeouts, Tasks, Models, GPUs, Tokens, Wall p50
Task Type Distribution
GPU Distribution
Wall Time by GPU
Wall Time by Model
Raw Database
Benchmark Cells
Raw enum filters
Metrics: Rows, Pass rate, Timeouts, Models, GPUs, Tokens, Known cost, Wall p50
Pairwise Gap
Model Comparison
Comparison scope filters
Combination Aggregates