hermes-brain-bench-v0.1

A reproducible benchmark of agent-brain quality for scoring local language models on a DGX Spark, published openly with provenance and a link back to the companion field note.

free · cc-by-4.0 · 28 May 2026

Bench scaffold

The repository is published as the canonical home for this bench's rows. Shape composition, modes, and results will be added to this page once the companion field note exercises the bench end-to-end.

How to load

License: free · cc-by-4.0. Released as a Hugging Face dataset and loadable with the standard datasets library.

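# Requires the Hugging Face `datasets` package (pip install datasets).
from datasets import load_dataset

# Download the train split, or reuse the local cache if it is already present.
ds = load_dataset("Orionfold/hermes-brain-bench-v0.1", split="train")
print(ds)  # N rows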
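
Since the row schema is not documented on this page yet, a quick inspection pass after loading can help. The sketch below is one possible approach, not part of the bench itself: `describe_rows` is a hypothetical helper, and the sample rows and their column names (`id`, `text`, `label`) are placeholders invented for illustration, not the dataset's real fields.

```python
from collections import Counter
from typing import Any, Dict, Iterable


def describe_rows(rows: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Count rows and record which Python types appear in each column."""
    count = 0
    types_by_column: Dict[str, Counter] = {}
    for row in rows:
        count += 1
        for column, value in row.items():
            # Tally the type name of every value seen under this column.
            types_by_column.setdefault(column, Counter())[type(value).__name__] += 1
    return {
        "num_rows": count,
        "columns": {name: dict(seen) for name, seen in types_by_column.items()},
    }


# Hypothetical stand-in rows; the real column names are not documented here.
sample_rows = [
    {"id": 0, "text": "example prompt", "label": "a"},
    {"id": 1, "text": "another prompt", "label": None},
]
summary = describe_rows(sample_rows)
print(summary)
```

Iterating over a loaded split yields one dict per row, so the same helper should accept the loaded split directly, as in `describe_rows(ds)`. Mixed types in a column (for example `str` alongside `NoneType`) are a cheap signal that some rows carry missing values.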

Companion methodology

This bench is the methodology artifact for the field note picking-the-hermes-brain-on-spark. The paired article walks through how the seven shapes were designed, how the three-mode bracket was scored, and what the headline finding means for the next fine-tuning cycle.

Read the methodology article