Tag

#hermes

Articles tagged "hermes" — 8 entries.

Article №51 agentic Foundation ~4 hours including the OpenRouter bakeoff + harness publish
Harnesses

Cost-Routing the Hermes Harness — When Local Stops Being Enough on a DGX Spark

The local 30B-MoE on a Spark is at $0 marginal cost — until it isn't. H6 measures the failure-mode curve: where does local stop being enough, and what does the dollar curve look like when you escalate to OpenRouter only when you have to?

uses fieldkit.harness, fieldkit.eval

Article №50 agentic Foundation ~3 hours including bakeoff + harness publish
Harnesses

The Hermes Vertical Router on a DGX Spark — One Brain Always Warm, Five Specialists Summoned on Demand

Five published Orionfold verticals plus the pinned MoE brain become a router on one Spark — not by parallel inference (the unified-memory envelope forbids that), but by a deterministic keyword classifier that dispatches the prompt and serves the right specialist one-at-a-time.

uses fieldkit.harness

Article №49 agentic NIM ~6 hours across three serving lanes, N=5 attempts per prompt
Harnesses

Picking the Hermes Brain on a DGX Spark — When Throughput Stops Being the Answer

The Hermes serving-lane bakeoff couldn't pick a winner: all five lanes cleared the tool-call format bar. A graded brain-quality rubric breaks the tie — and shows the fastest serving lane is also the better agent, by a margin throughput could never have measured.

uses fieldkit.eval, fieldkit.harness

Article №48 agentic Foundation ~3 hours, including the live tool-call gate against a local NIM
Harnesses

Hermes Drives the Spark via fieldkit-as-MCP — The Agent That Operates Its Own Machine

The keystone of the Harnesses series: expose a curated slice of fieldkit as MCP tools and the local Hermes agent can measure, quantize, publish, and retrieve on the box itself. The gate is a real llama-bench run the agent drove end-to-end — 0% tool-call format error, no API key.

uses fieldkit.harness, fieldkit.capabilities, fieldkit.quant, fieldkit.publish, fieldkit.rag

Article №47 agentic Foundation ~2 hours, most of it the hostile-tool-call containment battery
Harnesses

Hardening the Hermes Harness on a DGX Spark — The Box Contains It, You Don't Trust the Model

Before you leave a tool-wielding agent running on your desk, harden it. One pure function turns Hermes' permissive defaults into a desk-grade posture, then a scripted hostile-tool-call test proves it: egress denied at the sandbox, secrets in .env only, the config surviving a restart.

uses fieldkit.harness

Article №46 deployment NIM ~3 hours, most of it model pulls and four cold-starts
Harnesses

The Hermes Serving Lane on a DGX Spark — MoE vs Dense, and the Number That Actually Picks the Lane

Five Hermes serving lanes on one DGX Spark: Qwen3-30B-A3B MoE vs Qwen3-32B dense across vLLM, llama.cpp, and NIM. The MoE runs ~8.5× faster for the same memory, but the lane is picked by tool-call reliability, which took two config fights to bring to a 0% tool-call format error rate on every lane.

uses fieldkit.capabilities, fieldkit.harness, fieldkit.nim

Article №45 agentic NIM ~1 hour, most of it the NIM's first cold-start
Harnesses

The Hermes Harness on a DGX Spark — A Local Cockpit That Holds Tools, With No API Key

Installing the Hermes agent harness on a DGX Spark and running the first local agent turn against the cached Nemotron-Nano-9B-v2 NIM — reliable tool calls, no API key, no cloud hop. The defensible angle is NIM-first; everyone else's Spark Hermes write-up leads with Ollama.

uses fieldkit.nim, fieldkit.capabilities, fieldkit.harness

Upcoming agentic Foundation planned ~2 hours
Harnesses

Field-Fixing the Hermes Harness on a DGX Spark — When the NIM Won't Stream Tool Calls, and Other Rough Edges

Fifth in the Harnesses series: the field fixes that take a fresh Hermes agent on a local NIM from 'mostly works' to 'just works.' Leads with the one that bit hardest — the Spark NIM ships a non-streaming tool parser, fixed by bind-mounting NVIDIA's own streaming parser.

uses fieldkit.harness