Models browser
The dead "Models" tab is now a filterable capability catalog — positioning, quant economics, bounded drift, deep-links into chat & compare.
Run fieldkit arena serve on the Spark to feed this rail.
Co-iteration · built in tandem
Orionfold Arena is built by the operator and Claude, in the open. This is the workbench: what we’re shipping now, what’s queued next, and what we’re still exploring — plus a timeline of every commit the loop has produced.
A Pareto frontier anchored to real Spark hardware — quality × throughput scatter with a deterministic gold skyline, on the leaderboard marquee.
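As a rough illustration of what a deterministic skyline over a quality × throughput scatter could look like, here is a minimal sketch. The Point type, the field names, and the sample values are all hypothetical and are not taken from the Arena codebase; the only assumptions are that both axes are "higher is better" and that ties are broken by name so the same input always yields the same skyline.

```python
from typing import List, NamedTuple


class Point(NamedTuple):
    # Hypothetical record for one model/quant on the scatter.
    name: str
    quality: float      # higher is better
    throughput: float   # e.g. tokens per second, higher is better


def pareto_skyline(points: List[Point]) -> List[Point]:
    """Return the non-dominated points, in descending throughput order."""
    # Sort by throughput (desc), then quality (desc), then name, so the
    # result does not depend on the order of the input list.
    ordered = sorted(points, key=lambda p: (-p.throughput, -p.quality, p.name))
    skyline: List[Point] = []
    best_quality = float("-inf")
    for p in ordered:
        # Everything already seen is at least as fast as p, so p survives
        # only if it is strictly better on quality. Exact duplicates keep
        # the first point by name.
        if p.quality > best_quality:
            skyline.append(p)
            best_quality = p.quality
    return skyline


if __name__ == "__main__":
    sample = [
        Point("a-q4", 0.70, 60.0),
        Point("b-q8", 0.80, 35.0),
        Point("c-q4", 0.65, 40.0),   # slower and worse than a-q4: dominated
        Point("d-fp16", 0.85, 12.0),
    ]
    for p in pareto_skyline(sample):
        print(p.name, p.quality, p.throughput)
```

Sorting once and sweeping keeps the pass at O(n log n), and the explicit name tie-break is what makes the output reproducible across runs.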
Side-by-side now renders markdown + syntax highlighting at chat parity, with a rubric-derived winner banner and a head-to-head delta strip.
A global fuzzy palette over every model, article, lane, and page — keyboard-first, offline-safe, with ask/compare quick actions.
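To make the idea of a fuzzy palette concrete, here is a small sketch of one common approach, subsequence matching with simple bonuses. It is not the Arena implementation: the scoring weights, the function names, and the sample entries are assumptions chosen for illustration, and it runs entirely in memory, which is one way such a palette can stay offline-safe.

```python
from typing import List, Optional


def fuzzy_score(query: str, candidate: str) -> Optional[int]:
    """Score candidate against query, or None if query is not a subsequence."""
    q = query.lower()
    c = candidate.lower()
    score = 0
    pos = -1  # index of the previous matched character
    for ch in q:
        nxt = c.find(ch, pos + 1)
        if nxt == -1:
            return None  # a query character is missing: no match
        # 2 points if the match directly follows the previous one
        # (or sits at the very start), otherwise 1 point.
        score += 2 if nxt == pos + 1 else 1
        # 1 extra point when the match starts a word.
        if nxt == 0 or c[nxt - 1] in " -_/":
            score += 1
        pos = nxt
    return score


def search(query: str, items: List[str], limit: int = 5) -> List[str]:
    """Return up to `limit` matching items, best score first."""
    hits = []
    for item in items:
        s = fuzzy_score(query, item)
        if s is not None:
            hits.append((s, item))
    # Higher score first; alphabetical order breaks ties deterministically.
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return [item for _, item in hits[:limit]]


if __name__ == "__main__":
    # Hypothetical palette entries.
    entries = ["Compare", "Chat mode", "Models browser"]
    print(search("mo", entries))
    print(search("cmp", entries))
```

A real palette would likely index titles ahead of time and attach actions (ask, compare) to each hit; this sketch covers only the ranking step.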
The cockpit now contrasts live resident-brain throughput against the published baselines that 49 deep-dives measured.
A public window into the operator+AI build loop, with an operator-private margin (pin a note on any card when the sidecar is live).
pip install fieldkit[arena] → fieldkit arena up → the full cockpit at 127.0.0.1:7866/arena/demo/ — no clone, no npm. The app rides the package.
Mutate ~/.hermes/config.yaml from the LanePill to hot-swap the resident brain without leaving chat.
Today, compare pits the resident brain against the OpenRouter frontier; v0.2 will let you duel two local quants head-to-head.
Regenerate a single assistant turn into a side-by-side variant instead of a hard replace.
Promote the session-switcher popover into a persistent left rail of prior chats.
Proposed deep-dive (placeholder).
Whether a 120B MoE can repin as the resident brain if it clears the multi-step capacity wall the 30B-MoE hit on H6 (assumption A3).
A real background job surface for long eval sweeps, promoted from BackgroundTasks once the queue actually wires up (assumption A1).
A static Arena preview as an Orionfold Space, matching where the models and datasets already live.
Wire the second-brain MCP into ⌘K so "search the blog" returns ranked article chunks inline.
716875d0 aa86d92d f3de82bd b22f2b4b 76993d57 1a4e7000 d016d9bf ee186de7 f6a6734e