Methodology
This is a working draft. Final copy ships in plan 5.
Tools
Every number on this site is produced by hwbench, an open-source benchmark tool. The tool captures CPU, memory, storage, LLM inference, power, and thermal metrics into a single JSON file per run.
Metrics we headline
- Generation tok/s. Sustained token-generation rate.
- tok/s/W. Tokens per second per watt, our primary ranking metric. Efficiency matters more than peak throughput for 24/7 home inference.
- TTFT. Time-to-first-token; latency matters for chat UX.
- Peak CPU temperature and minimum clock speed. Together these expose throttling under sustained load.
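As a rough illustration of how the first three headline metrics relate, the sketch below derives generation tok/s, tok/s/W, and TTFT from raw timings and an average power reading. The function name, its parameters, and the sample numbers are hypothetical; they are not taken from hwbench's code or its JSON schema, and the sketch assumes the generation window runs from the first token to the last.

```python
def headline_metrics(tokens_generated, request_start_s, first_token_s,
                     last_token_s, avg_power_w):
    """Derive headline metrics from raw timings.

    All names here are illustrative assumptions, not hwbench's schema.
    Times are in seconds on a common clock; power is in watts.
    """
    generation_window_s = last_token_s - first_token_s
    if generation_window_s <= 0:
        raise ValueError("last token must arrive after the first token")
    if avg_power_w <= 0:
        raise ValueError("average power must be positive")

    # Time-to-first-token: delay between sending the request and the first token.
    ttft_s = first_token_s - request_start_s
    # Sustained generation rate over the generation window.
    gen_tok_s = tokens_generated / generation_window_s
    # Efficiency: generation rate divided by average power draw.
    tok_s_per_w = gen_tok_s / avg_power_w

    return {"ttft_s": ttft_s, "gen_tok_s": gen_tok_s, "tok_s_per_w": tok_s_per_w}


# Made-up sample: 510 tokens over 25.5 s of generation at an average of 40 W.
sample = headline_metrics(510, 0.0, 0.5, 26.0, 40.0)
print(sample)
```

With these made-up inputs the machine generates 20 tok/s at 0.5 tok/s/W. Under this metric, a faster machine that draws proportionally more power would rank no higher.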
Reproducibility
git clone kranky-ai/hwbench && ./install.sh gets you the same tool we use.
Every benchmark commit on this site includes the hwbench git SHA, so any result can be re-run.
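A minimal sketch of how a reader might pull the recorded SHA out of a result before re-running it. The field name hwbench_git_sha and the sample record are assumptions made for illustration; the actual key in hwbench's JSON output may differ.

```python
import json
import re

# A git SHA-1 is 40 hex characters; abbreviated forms of 7 or more are also common.
_SHA_RE = re.compile(r"[0-9a-f]{7,40}")


def pinned_sha(result_json, field="hwbench_git_sha"):
    """Return the tool SHA recorded in a result, or raise if it is missing.

    `field` is a hypothetical key name, not confirmed against hwbench output.
    """
    record = json.loads(result_json)
    sha = str(record.get(field, "")).strip().lower()
    if not _SHA_RE.fullmatch(sha):
        raise ValueError(f"no usable git SHA in field {field!r}")
    return sha


# Made-up record for illustration only.
example = '{"hwbench_git_sha": "0123456789abcdef0123456789abcdef01234567"}'
print(pinned_sha(example))
```

The returned value could then be passed to git checkout inside a local clone of the tool, so the re-run uses the same version that produced the published number.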
What this site does not do
- We do not run vendor-supplied benchmarks.
- We do not accept sponsored reviews. This applies throughout Phase 1; sample-with-disclosure reviews may begin later.
- We do not anonymize bad results. If a machine is bad, we say so.