A single-screen cockpit for running, comparing, and scoring local language models on one NVIDIA DGX Spark — live GPU telemetry, an efficiency frontier, reference-based eval, and a leak-proof leaderboard over your own artifacts. Private by construction, on the machine under your desk.
14 features, one screen
Drive the cockpit yourself
The preview is recorded on a DGX Spark and runs sidecar-less — no GPU, no backend, nothing phones home. Chat and Compare replay real sessions token-by-token; the telemetry rail, efficiency frontier, and leaderboard are the genuine cuts.
Walk the cockpit, open a model card, duel two lanes in Compare, then run it for real on your own Spark with one command.
How it came together
Orionfold Arena was built in one day plus an overnight session (about 15.4 hours): 12,733 lines, 125 tests, and 12 sessions of agentic coding with Claude Code. The launch story walks through every feature, the build metrics, and the workflow that produced it.
Orionfold Arena · ships inside fieldkit
Run it on your own Spark.
Install fieldkit with the arena extra, start the sidecar, and open the cockpit over your own models, artifacts, and benches.
Install the cockpit
$ pip install fieldkit[arena]
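To confirm the install landed before starting the sidecar, a minimal sketch like the one below can help. It uses only the Python standard library. The helper name `installed_version` is hypothetical and not part of fieldkit, and the check only reports whether the `fieldkit` distribution is present, not whether the arena extra's dependencies were pulled in.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent.

    Hypothetical helper for illustration; it is not part of fieldkit.
    """
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("fieldkit")
    if found is None:
        # Quoting the requirement keeps shells such as zsh from globbing the brackets.
        print('fieldkit is not installed; try: pip install "fieldkit[arena]"')
    else:
        print(f"fieldkit {found} is installed")
```

Run it with the same interpreter that `pip` installed into, since a different virtual environment would report the package as missing.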