Sources
BitterMill, cloud, routers
One benchmark surface for local cells, provider APIs, and routed model traffic.
Evidence Over Vendor Claims
BitterBench makes local cells, cloud APIs, and routers comparable by running the same packet through each plane and keeping the metrics, cost, and output evidence in one place.
Packets
Prompt shape, limits, schemas, and notes travel together so throughput claims stay comparable.
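One way to picture a packet is as a single immutable record with a stable fingerprint, so two runs can be checked for having used the same input. This is a minimal sketch, not BitterBench's actual schema: the `Packet` class, its field names, and the `fingerprint` method are all hypothetical.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet: everything that defines the request travels together."""

    prompt: str                           # prompt shape
    max_output_tokens: int                # limit (assumed field name)
    temperature: float = 0.0              # limit (assumed field name)
    output_schema: Optional[dict] = None  # expected output schema, if any
    notes: str = ""                       # free-form notes kept with the packet

    def fingerprint(self) -> str:
        """SHA-256 over a canonical JSON form; equal packets give equal digests."""
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


packet_a = Packet(prompt="Summarize the report.", max_output_tokens=256)
packet_b = Packet(prompt="Summarize the report.", max_output_tokens=256)
packet_c = Packet(prompt="Summarize the report.", max_output_tokens=512)
```

Here `packet_a` and `packet_b` share a fingerprint, while `packet_c` does not, because its limit differs. Whether notes should count toward the fingerprint is a design choice; this sketch includes them so that nothing in the packet can drift unnoticed.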
Current phase
The public edge stays simple while the app surface hardens around real benchmark records.
Workloads
A benchmark starts with a durable workload definition, not an anecdotal prompt.
Runs
Queue, prefill, decode, and total time should sit in one comparable record alongside cost, tokens, failures, and output evidence.
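A run record along these lines could be sketched as a small dataclass. The field names, the units (seconds, US dollars), and the assumption that total time is the sum of queue, prefill, and decode are all illustrative guesses, not the product's actual record format.

```python
from dataclasses import dataclass


@dataclass
class RunRecord:
    """Hypothetical run record: one row per packet per source."""

    source: str              # e.g. "local", "cloud", or "router" (assumed labels)
    packet_fingerprint: str  # identifies the packet that was sent
    queue_s: float           # time spent waiting before processing started
    prefill_s: float         # time to process the prompt
    decode_s: float          # time to generate the output tokens
    output_tokens: int
    cost_usd: float
    failed: bool = False
    output_evidence: str = ""  # the raw output, or a pointer to it

    @property
    def total_s(self) -> float:
        # Assumption: total is the sum of the three measured phases.
        return self.queue_s + self.prefill_s + self.decode_s

    @property
    def decode_tokens_per_s(self) -> float:
        # Guard against division by zero for failed or empty runs.
        return self.output_tokens / self.decode_s if self.decode_s > 0 else 0.0


def comparable(a: RunRecord, b: RunRecord) -> bool:
    """Two runs are comparable only if they used the same packet."""
    return a.packet_fingerprint == b.packet_fingerprint


local_run = RunRecord("local", "abc123", 0.5, 0.25, 1.25, 100, 0.0)
cloud_run = RunRecord("cloud", "abc123", 0.0, 0.5, 2.0, 100, 0.002)
```

Keeping the phases separate rather than storing only the total makes it possible to tell a slow queue from slow decoding when two sources disagree.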
Decisions
The goal is not pretty charts. The goal is to decide what should run where, and why.
Boundary
`bitterbench.com` should explain the offer clearly. Authentication, workload submission, run history, and comparison state stay on `app.bitterbench.com`.