harness

One always-on brain, five specialists, zero LLM-classifier overhead.

A Spark holds one strong model warm at a time. The pinned MoE is excellent at general agentic work but is not your domain expert. The five Orionfold vertical GGUFs are domain experts but compete for the same 128 GB envelope. A router picks per prompt: keyword-matched prompts get the right specialist (warm on demand, ~5–10 s), everything else stays with the brain.
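The routing rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the shipped router: the keyword lists, the route labels, and the `route` function are hypothetical, since this card does not publish the actual Orionfold keyword sets.

```python
import re

# Hypothetical keyword sets, for illustration only; the real tuned
# keyword lists are not given in this card.
VERTICAL_KEYWORDS = {
    "patent": ("patent", "prior art", "office action"),
    "legal": ("contract", "statute", "liability"),
    "finance": ("balance sheet", "ebitda", "cash flow"),
    "cyber": ("malware", "cve", "intrusion"),
    "clinical": ("diagnosis", "dosage", "symptom"),
}
DEFAULT_ROUTE = "brain"  # the always-on default model


def route(prompt: str) -> str:
    """Return the first vertical with a whole-word keyword hit, else the brain."""
    text = prompt.lower()
    for vertical, keywords in VERTICAL_KEYWORDS.items():
        for keyword in keywords:
            # \b keeps "cve" from matching inside an unrelated longer word.
            if re.search(r"\b" + re.escape(keyword) + r"\b", text):
                return vertical
    return DEFAULT_ROUTE
```

Because the decision is a plain dictionary scan with no model call, the same prompt always yields the same route, and the matching keyword can be logged for audit.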

base Hermes Agent v0.14.0 · license MIT · recommended Default brain (MoE)


What it's for

Route a Hermes agent prompt to a vertical specialist by keyword
Reproduce the 30-prompt router-accuracy + per-vertical quality bench
Embed a deterministic, auditable router into a Hermes config

Audience — DGX Spark power users running a local, no-API-key agent harness across multiple domains.

Quant economics · quality × speed per build

Variant
Patent prosecution
Legal reasoning
Financial analysis
Defensive cyber
Clinical reasoning
Default brain (MoE) · sweet spot

Known drift · bounded, honest

Router-accuracy sample size: router classification was measured over 30 prompts (5 per vertical + 5 default-brain); this is not a large-N guarantee.
Keyword-set tuning: vertical keywords were tuned against the same 30 bench prompts, so out-of-distribution prompts may misroute.
Per-vertical pass-rate basis: 5 prompts per vertical, scored with deterministic substring/regex rubrics; open-ended answers (haiku, drafted claims) are marked "vibe".
One-at-a-time vertical serving: verticals are served on demand on :8090 (~5–10 s to warm); the default brain stays warm on :8080 (always-on, ~32 GB).
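As a rough illustration of the caveats above, the sketch below shows how a route could map to the two ports and how router accuracy over a labeled prompt set could be recomputed. The `localhost` host, the `"brain"` label, and the function names `endpoint_for` and `router_accuracy` are assumptions made for this example; only the port numbers come from this card.

```python
from collections import defaultdict

# Ports are from the card; the localhost host is an assumption.
BRAIN_URL = "http://localhost:8080"     # default brain, always warm
VERTICAL_URL = "http://localhost:8090"  # one vertical at a time, on demand


def endpoint_for(route_label: str) -> str:
    """Map a route label to the server expected to answer it."""
    return BRAIN_URL if route_label == "brain" else VERTICAL_URL


def router_accuracy(labeled_prompts, route_fn):
    """Score route_fn on (prompt, expected_label) pairs.

    Returns (overall_accuracy, {label: accuracy}).
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for prompt, expected in labeled_prompts:
        totals[expected] += 1
        if route_fn(prompt) == expected:
            hits[expected] += 1
    per_label = {label: hits[label] / totals[label] for label in totals}
    overall = sum(hits.values()) / sum(totals.values())
    return overall, per_label
```

With only 5 prompts per label, one misroute moves that label's accuracy by 20 percentage points, which is why the figures are best read as a smoke test.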

Get it

Open on Hugging Face ↗ · Read the build article

Run it local

Yours, offline, on the Spark:

pip install "fieldkit[arena]"
fieldkit arena up

Then drive this model from the cockpit; prompts and telemetry never leave the box.