GPU Util % utilisation
GPU Temp °C die
Unified GB of 128 · 8 GB guard
Throughput tok / second
TTFT ms · first token
Active Lane idle no warm brain
OpenRouter $0.00 spend · since start
Unified memory · last 60 s · 8 GB guard band shown at top

Orionfold Arena · v0.2

Cockpit

resident brain idle · waiting for the sidecar

Chat with any model (the resident brain, an on-demand local fine-tune, or an OpenRouter frontier model) and select it directly in the chat tab.

Artifacts 18 manifests in roster
Articles 55 published deep-dives
Benches 3 cached evidence sources
Runs scored 16 bench + live
Envelope 128 GB unified · 8 GB guard
Top runs · last cut M6 mirror
#1 · frontier-only · hermes-cost-routing-local-and-openrouter:cost_router · 1 run · 100.0%
#2 · cyber · hermes-vertical-router-on-spark:vertical_router · 1 run · 100.0%
#3 · finance · hermes-vertical-router-on-spark:vertical_router · 1 run · 100.0%
#4 · medical · hermes-vertical-router-on-spark:vertical_router · 1 run · 100.0%
#5 · cost-routed · hermes-cost-routing-local-and-openrouter:cost_router · 1 run · 91.7%
#6 · qwen3-30b-moe-llamacpp-q4km · picking-the-hermes-brain-on-spark:hermes_brain · 1 run · 90.0% · 84 t/s
#7 · qwen3-30b-moe-vllm-fp8 · picking-the-hermes-brain-on-spark:hermes_brain · 1 run · 87.5% · 55 t/s
#8 · brain · hermes-vertical-router-on-spark:vertical_router · 1 run · 80.0%
Active lane resident
This run vs the bar live
What is Spark Arena?

Spark Arena is the operator-driven alternative to public cloud model arenas. It offers private eval leaderboards; efficiency as a metric (quality alongside tok/s, unified-memory peak, TTFT, and $/M); a closed eval → fine-tune → re-rank loop; tool-call replay; custom rubrics; and a cost-per-quality Pareto frontier anchored to the hardware the votes ran on. Here, the operator is the hardware.
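
The cost-per-quality Pareto frontier mentioned above can be illustrated with a minimal sketch. This is not Spark Arena's implementation: the `Run` type, the `pareto_frontier` function, the run names, and every number below are hypothetical and chosen only for illustration. A run is kept when no other run is both cheaper (or equally cheap) and at least as good.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    # Hypothetical record shape; the real Arena schema is not shown in the source.
    name: str
    quality: float        # score in percent, higher is better
    cost_per_mtok: float  # USD per million tokens, lower is better


def pareto_frontier(runs):
    """Return the runs that are not dominated on (higher quality, lower cost)."""
    frontier = []
    best_quality = float("-inf")
    # Walk from cheapest to most expensive; at equal cost, best quality first.
    for run in sorted(runs, key=lambda r: (r.cost_per_mtok, -r.quality)):
        # A run joins the frontier only if it beats every cheaper run on quality.
        if run.quality > best_quality:
            frontier.append(run)
            best_quality = run.quality
    return frontier


# Illustrative values only, not measurements from the leaderboard.
runs = [
    Run("local-a", 90.0, 0.10),
    Run("local-b", 87.5, 0.15),
    Run("routed", 91.7, 0.40),
    Run("frontier", 100.0, 2.00),
]
frontier_names = [r.name for r in pareto_frontier(runs)]
print(frontier_names)
```

With these made-up inputs, `local-b` drops out because `local-a` is both cheaper and higher scoring, while the other three each buy extra quality at extra cost. The same sweep works for any pair of metrics (for example quality against TTFT) once "better" is defined for each axis.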