Tag

#machine-that-builds-machines

Articles tagged "machine-that-builds-machines" — 6 entries.

Article №56 fine-tuning NeMo 10 Jun 2026 ~16 min read — synthesis of a two-day advisor build on one Spark

The Refusal Floor Is Trainable — What a Frozen Curveball Proved About Prompts vs Weights

A 30B model with a hand-tuned prompt contract refused 3 of 9 adversarial pretexts and fabricated private-looking state 3 times. A 4B model trained for 21 minutes refused 9 of 9. The bench that saw the difference was frozen before training, and that discipline is the whole method.

uses fieldkit.arena, fieldkit.eval

Article №55 agentic Foundation 02 Jun 2026 ~15 min read — no setup; a synthesis of work already shipped on this box

Machine that Builds Machines

The Meta-Program on a DGX Spark — When the Tool You Build With Is an Instance of the Thing You Build

The opener for the Machine-that-Builds-Machines arc. The book describes a meta-program on a SaaS platform; this is the same pattern on one personal box — a pane → hands → engine loop where the spec is the application and the skills are configuration over code.

Article №53 fine-tuning Foundation 03 Jun 2026 ~16 min read — a synthesis of a proven run plus the engine it became

Machine that Builds Machines

The Machine Improves Itself — Closed-Loop RLVR on a DGX Spark, Where the Eval Harness Is the Reward

Closed-loop RLVR on one box: an eval→reward→fine-tune loop where the Spark's own verifiers ARE the reward — no learned reward model. The hero finding is defensive: pick the checkpoint on a frozen held-out split, never the training pool, or the loop reports success while it regresses.

uses fieldkit.rl, fieldkit.reward, fieldkit.eval, fieldkit.lineage

Article №52 fine-tuning NeMo 05 Jun 2026 ~18 min read — synthesis of a multi-day greenfield-vertical build on one Spark

Machine that Builds Machines

The Gate Before the GPU — Deciding SFT vs RL vs RLVR Before You Spend the Run

Building Kepler — a numeric astrodynamics reasoner — from scratch on one Spark. The method choice (SFT vs RL vs RLVR) is decided by cheap gates before any GPU run: a base preflight, an SFT gate, and a Goldilocks headroom gate. A flawless RLVR run that changed nothing is the proof.

uses fieldkit.rl, fieldkit.reward, fieldkit.eval

Article №36 fine-tuning NeMo 11 May 2026 ~30 min read

Machine that Builds Machines

Adaptive Turn Clipping on a Single Spark — A²TGPO, Studied from Source

A²TGPO redesigns how Information Gain feeds GRPO: turn-group normalization, variance-rescaled accumulation, and adaptive turn-level clipping. The paper's release is the code; the Spark's contribution is the lineage primitive that records what each trial learned.

uses fieldkit.capabilities, fieldkit.training, fieldkit.lineage

Article №35 agentic NeMo 10 May 2026 ~28 min read

Machine that Builds Machines

Reading the Lineage Primitive — cxcscmu Auto-Research, Studied from release_artifacts

cxcscmu's own lineage_on vs lineage_off ablation closes the case: same agent, same trial budget, same prompt template — only the rendered lineage block differs, and the run with lineage produces 5.3× more keeps and 3.2× less wall-time waste. This piece extracts that primitive into fieldkit.lineage.

uses fieldkit.capabilities, fieldkit.training, fieldkit.lineage