Orionfold Arena

Now shipping 6

Models browser

The dead "Models" tab is now a filterable capability catalog — positioning, quant economics, bounded drift, deep-links into chat & compare.

A Pareto frontier anchored to real Spark hardware — quality × throughput scatter with a deterministic gold skyline, on the leaderboard marquee.
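As a rough illustration of the skyline described above, here is a minimal sketch of a deterministic Pareto frontier over quality and throughput. The `Point` type, its field names, and the tie-breaking order are assumptions made for this example, not the Arena's actual implementation.

```python
from typing import NamedTuple


class Point(NamedTuple):
    # Hypothetical record; field names are assumptions for this sketch.
    name: str
    quality: float      # higher is better
    throughput: float   # e.g. tokens per second, higher is better


def pareto_skyline(points):
    """Return the non-dominated points, fastest first."""
    # Sort by throughput (desc), then quality (desc), then name, so that
    # ties resolve the same way on every run regardless of input order.
    ordered = sorted(points, key=lambda p: (-p.throughput, -p.quality, p.name))
    skyline = []
    best_quality = float("-inf")
    for p in ordered:
        # A point survives only if it beats the quality of every point that
        # is at least as fast. Exact duplicates collapse to the first name.
        if p.quality > best_quality:
            skyline.append(p)
            best_quality = p.quality
    return skyline


# Made-up sample values, purely for illustration.
sample = [
    Point("a", 0.80, 40.0),
    Point("b", 0.70, 90.0),
    Point("c", 0.65, 60.0),
    Point("d", 0.90, 15.0),
]
skyline_names = [p.name for p in pareto_skyline(sample)]
```

Point `c` drops out because `b` is both faster and higher quality. The fixed sort key is what makes the result independent of input order.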

Side-by-side now renders markdown + syntax highlighting at chat parity, with a rubric-derived winner banner and a head-to-head delta strip.
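One way a rubric-derived winner and a per-criterion delta strip could be computed is sketched below. The criterion names, the integer scores, and the unweighted-sum aggregation are all assumptions for illustration; the page does not say how the real scorer weighs its rubric.

```python
def head_to_head(left, right, tie_margin=0):
    """Compare two rubric score dicts; return (winner, per-criterion deltas)."""
    # Only criteria scored on both sides are compared.
    criteria = sorted(left.keys() & right.keys())
    # Positive delta means the left answer scored higher on that criterion.
    deltas = {c: left[c] - right[c] for c in criteria}
    total = sum(deltas.values())
    if abs(total) <= tie_margin:
        winner = "tie"
    elif total > 0:
        winner = "left"
    else:
        winner = "right"
    return winner, deltas


# Hypothetical rubric scores on a 1-5 scale.
left_scores = {"accuracy": 4, "clarity": 3, "format": 5}
right_scores = {"accuracy": 5, "clarity": 3, "format": 3}
winner, deltas = head_to_head(left_scores, right_scores)
```

In this sketch the banner would read from `winner`, and the strip would render one cell per entry in `deltas`.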

A global fuzzy palette over every model, article, lane, and page — keyboard-first, offline-safe, with ask/compare quick actions.
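A fuzzy palette of this kind is often built on in-order subsequence matching with small bonuses for consecutive and word-start hits. The sketch below assumes that approach; the scoring weights and the function names `fuzzy_score` and `search` are hypothetical, and the real palette may rank differently.

```python
def fuzzy_score(query, candidate):
    """Score candidate if query's characters appear in it in order, else None."""
    q = query.lower()
    c = candidate.lower()
    score = 0
    pos = -1
    for ch in q:
        nxt = c.find(ch, pos + 1)
        if nxt == -1:
            return None  # a query character is missing: no match
        score += 1
        if nxt == pos + 1:
            score += 2  # bonus: directly follows the previous match
        if nxt == 0 or c[nxt - 1] in " -_/":
            score += 3  # bonus: match sits at the start of a word
        pos = nxt
    return score


def search(query, items, limit=5):
    """Return up to `limit` matching items, best score first."""
    scored = []
    for item in items:
        s = fuzzy_score(query, item)
        if s is not None:
            # Negative score sorts best first; length and text break ties.
            scored.append((-s, len(item), item))
    scored.sort()
    return [item for _, _, item in scored[:limit]]


# Illustrative entries only; a real index would hold every model, article, lane, and page.
entries = ["Models browser", "Leaderboard", "Compare", "Chat"]
hits = search("ar", entries)
```

Everything here runs in memory over a local list, which fits the offline-safe goal stated above.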

in this build

The cockpit now contrasts live resident-brain throughput against the published baselines that 49 deep-dives measured.

A public window into the operator+AI build loop, with an operator-private margin (pin a note on any card when the sidecar is live).

in this build

Next queued 6

pip install fieldkit[arena] → fieldkit arena up → the full cockpit at 127.0.0.1:7866/arena/demo/ — no clone, no npm. The app rides the package.

Mutate ~/.hermes/config.yaml from the LanePill to hot-swap the resident brain without leaving chat.

Today, compare pits the resident brain against the OpenRouter frontier; v0.2 lets you duel two local quants head-to-head.

Regenerate a single assistant turn into a side-by-side variant instead of a hard replace.

Promote the session-switcher popover into a persistent left rail of prior chats.

Proposed deep-dive (placeholder).

Exploring open questions 4

Whether a 120B MoE can be repinned as the resident brain, provided it clears the multi-step capacity wall that the 30B MoE hit on H6 (assumption A3).

A real background job surface for long eval sweeps, promoted from BackgroundTasks once the queue is actually wired up (assumption A1).

A static Arena preview as an Orionfold Space, matching where the models and datasets already live.

Wire the second-brain MCP into ⌘K so "search the blog" returns ranked article chunks inline.

Built together

9 Arena commits · newest first

2026-05-28 v0.2 leap pt.1 — rebrand to Orionfold Arena, models browser, efficiency frontier, compare depth, ⌘K palette 716875d0
2026-05-28 v0.1.1 cockpit density + chat overhaul aa86d92d
2026-05-28 standalone web app shell + cockpit chrome polish f3de82bd
2026-05-28 M6 leaderboard + mirror exporter b22f2b4b
2026-05-28 M5 compare + rubric scorer 76993d57
2026-05-28 M4 chat against the resident brain 1a4e7000
2026-05-28 M3 telemetry SSE + cockpit gauge live d016d9bf
2026-05-28 M2 retroactive import (40 lanes + 17 bench rows + 55 articles) ee186de7
2026-05-28 M1 spec + skeleton (Cockpit series · spark-arena-v1) f6a6734e