Now shipping · Cockpit

Orionfold Arena

The cockpit for running, comparing, and scoring local language models on a DGX Spark.

15.4h to build · 12,733 lines of code · 125 tests · 14 features
What it is

A single-screen cockpit for running, comparing, and scoring local language models on one NVIDIA DGX Spark — live GPU telemetry, an efficiency frontier, reference-based eval, and a leak-proof leaderboard over your own artifacts. Private by construction, on the machine under your desk.

Private by construction · Single-screen cockpit · Over your own artifacts · NVIDIA DGX Spark
Inside the cockpit

14 features, one screen

The cockpit

One screen to see every artifact, bench, and the warm model's live telemetry — the operator's home base.

Live telemetry rail

Always-on GPU, temperature, and unified-memory readouts so you watch the Spark's envelope while a model runs.

Leaderboard

Bench-anchored rankings over your own models, served from a leak-proof public mirror that never includes your prompts or completions.

Efficiency frontier

Quality versus throughput on one chart with the Pareto skyline in gold — where you decide which quant is worth shipping.
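For readers who want the idea behind the skyline: a run sits on the Pareto frontier when no other run is at least as good on both quality and throughput and strictly better on one. The sketch below is a minimal illustration of that rule, not Arena's implementation. The `Run` type, the `pareto_frontier` function, the model names, and every number are hypothetical and made up for the example.

```python
from typing import NamedTuple


class Run(NamedTuple):
    """One benchmarked artifact (hypothetical shape, not Arena's schema)."""
    name: str
    quality: float       # bench score, higher is better
    tokens_per_s: float  # throughput, higher is better


def pareto_frontier(runs):
    """Return the runs that no other run dominates, fastest first."""
    # Visit runs from fastest to slowest; among equal speeds, best quality first.
    ordered = sorted(runs, key=lambda r: (-r.tokens_per_s, -r.quality))
    frontier = []
    best_quality = float("-inf")
    for run in ordered:
        # Everything already visited is at least as fast, so this run
        # survives only if it beats the best quality seen so far.
        if run.quality > best_quality:
            frontier.append(run)
            best_quality = run.quality
    return frontier


# Made-up numbers for illustration only; these are not measurements.
runs = [
    Run("model-a", 0.82, 20.0),
    Run("model-b", 0.81, 27.0),
    Run("model-c", 0.78, 38.0),
    Run("model-d", 0.70, 36.0),  # slower and worse than model-c
    Run("model-e", 0.55, 45.0),
]

for run in pareto_frontier(runs):
    print(f"{run.name}: quality={run.quality}, tok/s={run.tokens_per_s}")
```

In this toy data, model-d drops off the frontier because model-c is both faster and higher quality; the other four each represent a distinct trade-off.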

Models browser

Every artifact you can run, filterable by kind and license, one click from chat or compare.

Model detail

Positioning, quant economics with the sweet-spot row, known drift, and a per-model efficiency curve — the full card before you commit GPU.

Chat against any lane

Talk to the warm resident model, an on-demand local GGUF, or a hosted lane — markdown, reasoning, and live tok/s in one composer.

Eval prompts + reference scoring

Pull the exact bench a model was measured on, autofill the composer, and auto-score the answer against gold without leaving chat.
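As a rough picture of what scoring an answer against gold can look like, the sketch below uses two common reference-based metrics: normalized exact match and token-level F1. This page does not say which metric Arena applies, so treat the choice of metrics, the normalization rule, and the function names `normalize`, `exact_match`, and `token_f1` as assumptions made for illustration.

```python
import re
from collections import Counter


def normalize(text):
    """Lowercase, replace punctuation with spaces, and split into tokens."""
    return re.sub(r"[^a-z0-9\s]", " ", text.lower()).split()


def exact_match(answer, gold):
    """True when the answer and the gold reference match after normalization."""
    return normalize(answer) == normalize(gold)


def token_f1(answer, gold):
    """Harmonic mean of token precision and recall against the gold reference."""
    answer_tokens = normalize(answer)
    gold_tokens = normalize(gold)
    if not answer_tokens or not gold_tokens:
        # Two empty strings agree; one empty string against text does not.
        return float(answer_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most as often as it
    # appears on both sides.
    overlap = sum((Counter(answer_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(answer_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("Paris.", "paris"))
print(round(token_f1("The capital is Paris", "Paris"), 2))
```

Exact match rewards only a fully correct answer, while token F1 gives partial credit when a verbose answer still contains the reference.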

Compare — any vs. any

Duel two lanes side by side with a deterministic rubric score and a head-to-head delta strip — local-vs-local, local-vs-hosted, your call.

Command palette

Hit ⌘K and jump anywhere — fuzzy-search every model, article, and lane, or fire a chat or compare without touching the mouse.

The Lab

A living board of what's shipped, what's next, and what's being explored, with a built-together timeline mined from the commit log.

Live preview

Drive the cockpit yourself

The preview is recorded on a DGX Spark and runs sidecar-less — no GPU, no backend, nothing phones home. Chat and Compare replay real sessions token-by-token; the telemetry rail, efficiency frontier, and leaderboard are the genuine cuts.

Walk the cockpit, open a model card, duel two lanes in Compare, then run it for real on your own Spark with one command.

The build

How it came together

Orionfold Arena was built over one day and one overnight session (about 15.4 hours): 12,733 lines of code, 125 tests, and 12 sessions of agentic coding with Claude Code. The launch story walks through every feature, the build metrics, and the workflow that produced it.

Orionfold Arena · ships inside fieldkit

Run it on your own Spark.

Install fieldkit with the arena extra, start the sidecar, and open the cockpit over your own models, artifacts, and benches.

Install the cockpit

Terminal
$ pip install "fieldkit[arena]"