

What it's for
  • Route a Hermes agent prompt to a vertical specialist by keyword
  • Reproduce the 30-prompt router-accuracy + per-vertical quality bench
  • Embed a deterministic, auditable router into a Hermes config
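A minimal sketch of what a deterministic, auditable keyword router could look like. The keyword sets, lane names, and the `route` function are hypothetical placeholders for illustration; the tuned keyword sets used in the bench are not reproduced here.

```python
import re
from typing import NamedTuple

# Hypothetical keyword sets, for illustration only (not the tuned bench sets).
VERTICAL_KEYWORDS: dict[str, tuple[str, ...]] = {
    "patent": ("patent", "prior art", "office action"),
    "legal": ("contract", "statute", "liability"),
    "financial": ("balance sheet", "cash flow", "valuation"),
    "cyber": ("malware", "cve", "intrusion"),
    "clinical": ("diagnosis", "dosage", "symptom"),
}
DEFAULT_LANE = "default"  # assumed name for the default-brain lane


class Route(NamedTuple):
    lane: str                 # chosen vertical, or DEFAULT_LANE
    matched: tuple[str, ...]  # keywords that fired, kept as an audit trail


def route(prompt: str) -> Route:
    """Pick the lane with the most whole-word keyword hits.

    Ties go to the lane declared first; no hits falls back to the default lane.
    The same prompt always yields the same Route, so decisions can be replayed.
    """
    text = prompt.lower()
    best_lane, best_hits = DEFAULT_LANE, ()
    for lane, keywords in VERTICAL_KEYWORDS.items():  # dicts keep insertion order
        hits = tuple(
            k for k in keywords if re.search(rf"\b{re.escape(k)}\b", text)
        )
        if len(hits) > len(best_hits):  # strict ">" keeps the earlier lane on ties
            best_lane, best_hits = lane, hits
    return Route(best_lane, best_hits)
```

Returning the matched keywords alongside the lane is one way to make each routing decision explainable in logs; a prompt with no keyword hits falls through to the default lane.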

Audience: DGX Spark power users running a local, no-API-key agent harness across multiple domains.

Quant economics (quality × speed per build)
Variants:
  • Patent prosecution
  • Legal reasoning
  • Financial analysis
  • Defensive cyber
  • Clinical reasoning
  • Default brain (MoE) · sweet spot
Known drift (bounded · honest)
  • Router-accuracy sample size: router classification was measured over 30 prompts (5 per vertical + 5 default-brain); this is not a large-N guarantee.
  • Keyword-set tuning: vertical keywords were tuned against the 30 bench prompts (5 per vertical); out-of-distribution prompts may misroute.
  • Per-vertical pass-rate basis: 5 prompts per vertical, scored with deterministic substring/regex rubrics; open-ended answers (haiku, drafted claims) are marked "vibe".
  • One-at-a-time vertical serving: verticals are served on demand on :8090 (~5–10 s warm-up); the default brain stays warm on :8080 (always-on, ~32 GB).
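The serving split and the accuracy measurement above can be sketched as follows. This is an illustration under stated assumptions: the `localhost` host, the `"default"` lane name, and the `endpoint_for` and `router_accuracy` helpers are hypothetical; only the ports :8080 and :8090 come from the text.

```python
from typing import Callable, Iterable

# Ports are from the text; the host is an assumption.
DEFAULT_URL = "http://localhost:8080"   # always-on default brain
VERTICAL_URL = "http://localhost:8090"  # one vertical at a time, served on demand


def endpoint_for(lane: str) -> str:
    """Map a routed lane to the server expected to answer it."""
    return DEFAULT_URL if lane == "default" else VERTICAL_URL


def router_accuracy(
    cases: Iterable[tuple[str, str]],
    router: Callable[[str], str],
) -> tuple[int, int, float]:
    """Score a router against (prompt, expected_lane) pairs.

    Returns (correct, total, accuracy); accuracy is 0.0 for an empty bench.
    """
    cases = list(cases)
    correct = sum(1 for prompt, expected in cases if router(prompt) == expected)
    total = len(cases)
    return correct, total, (correct / total if total else 0.0)
```

With a 30-prompt bench, each misrouted prompt moves the reported accuracy by 1/30, roughly 3.3 percentage points, which is why the figure should be read as a spot check and not a large-sample estimate.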