GPU Util % utilisation
GPU Temp °C die
Unified GB of 128 · 8 GB guard
Throughput tok / second
TTFT ms · first token
Active Lane idle no warm brain
OpenRouter $0.00 spend · since start
Unified · 60 s 8 GB guard band shown at top
Models 6 quant · lora · adapter
Benches 2 eval datasets
Notebooks 5 runnable on-ramps
Tooling 5 harnesses · skills
Free tier 18 run offline · yours
Kind License
lora patent free

Offline patent-prosecution reasoning on Spark-class hardware

base deepseek-ai/DeepSeek-R1-0528-Qwen3-8B

nemo
recommended BF16
quant patent free

Offline patent-prosecution reasoning on Spark-class hardware

base deepseek-ai/DeepSeek-R1-0528-Qwen3-8B

nemo
recommended Q5_K_M · 35 tok/s
quant medical free

ii-medical-8b-gguf

base Intelligent-Internet/II-Medical-8B

recommended Q5_K_M · 36 tok/s
quant cyber free

securityllm-gguf

base ZySec-AI/SecurityLLM

recommended Q4_K_M · 48 tok/s
quant legal free

saul-7b-instruct-v1-gguf

base Equall/Saul-7B-Instruct-v1

recommended Q5_K_M · 20 tok/s
quant finance free

finance-chat-gguf

base AdaptLLM/finance-chat

recommended F16 · 12 tok/s
bench free

hermes-brain-bench-v0.1

base n/a

0 variants
bench patent free

patent-strategist-bench-v0.1

base n/a

0 variants
notebook finance free

Build the finance-chat quant — and call the model — on a Spark or a free cloud GPU

base AdaptLLM/finance-chat

recommended builder
notebook legal free

Build the Saul-7B quant — and call the legal model — on a Spark or a free cloud GPU

base Equall/Saul-7B-Instruct-v1

recommended builder
notebook cyber free

Build the SecurityLLM quant — and call the model — on a Spark or a free cloud GPU

base ZySec-AI/SecurityLLM

recommended builder
notebook medical free

Build the II-Medical-8B quant — and call the reasoner — on a Spark or a free cloud GPU

base Intelligent-Internet/II-Medical-8B

recommended builder
notebook patent free

Run the patent-strategist build — and use the model — on a Spark or a free cloud GPU

base deepseek-ai/DeepSeek-R1-0528-Qwen3-8B

nemo
recommended builder
arena run free

An operator cockpit you run on your own DGX Spark

base fieldkit[arena] · Astro + FastAPI sidecar

0 variants
harness free

When does local stop being enough? Measure first, then route.

base Hermes Agent v0.14.0

recommended Local Spark — Qwen3-30B-A3B MoE Q4_K_M
skill free

The skills you write for Claude Code load into Hermes unchanged.

base agentskills.io SKILL.md (Hermes / Claude Code compatible)

recommended spark-serve
harness free

Which local lane should drive your always-on Spark agent?

base Hermes Agent v0.14.0

recommended llama.cpp · Qwen3-30B-A3B (MoE, Q4_K_M) · 88 tok/s
harness free

One always-on brain, five specialists, zero LLM-classifier overhead.

base Hermes Agent v0.14.0

recommended Default brain (MoE)