GPU Util % utilisation
GPU Temp °C die
Unified GB of 128 · 8 GB guard
Throughput tok / second
TTFT ms · first token
Active Lane idle no warm brain
OpenRouter $0.00 spend · since start
Unified · 60 s 8 GB guard band shown at top

← Models

What it's for
  • Builder: reproduce the release — feasibility envelope, quantize sweep, the Spark-tested quad + variants table, publish — as fieldkit calls
  • User: differential diagnosis, management-and-contraindication reasoning, drug-interaction analysis, and documented second opinions with the <think> chain surfaced
  • User: ground answers in guideline text with fieldkit.rag and gate the MCQ shape with fieldkit.eval.mcq_letter
  • Both: run offline on a DGX Spark or on a free Colab / Kaggle GPU (dual-path, runtime-detected)

Audience — AI researchers and engineers who want to reproduce the quant, and clinicians, medical educators, and health-app developers who want a private, on-device reasoning assistant — on Spark-class hardware (GB10, 128 GB unified memory) or a free cloud GPU.

Quant economics quality × speed per build
Variant
builder sweet spot
user
Known drift bounded · honest
  • Cloud (Colab / Kaggle) path serves the Q4_K_M quant; the Spark path serves Q5_K_M One quant level apart, and the medical bench is the wider gap — Q4_K_M scores 42% on the MedMCQA n=50 mini-eval vs Q5_K_M's 52% (10 points, ~1.4× the binomial noise floor); both run the identical code path. See the sibling GGUF card.
  • The builder notebook's quantize + publish steps render the recorded Spark run, not a live re-execution 2 recorded Spark-only cells (the quantize sweep and the publish dry-run); the remaining cells — feasibility envelope, the spark_quad panel, and the variants table — run live on any runtime from the manifest.
  • The user notebook's live model-chat cells are not captured in the published marketing snapshot 4 use-case cells call the model live on any runtime; the snapshot defers their capture pending a serving-detok check, so the reasoning-chain output is described, not screenshotted (the deterministic charts + banners are captured).