Tag

#reinforcement-learning

Articles tagged "reinforcement-learning" — 2 entries.

Article №36 fine-tuning NeMo ~30 min read
Machine that Builds Machines

Adaptive Turn Clipping on a Single Spark — A²TGPO, Studied from Source

A²TGPO redesigns how Information Gain feeds GRPO: turn-group normalization, variance-rescaled accumulation, and adaptive turn-level clipping. The paper's release is the code; the Spark's contribution is the lineage primitive that records what each trial learned.

uses fieldkit.capabilitiesfieldkit.trainingfieldkit.lineage

Upcoming agentic NemoClaw ~30 min read
Machine that Builds Machines

SkillOS: Learning Skill Curation for Self-Evolving Agents — Spark reproduction notes

Reproducing the SkillOS curator/executor split on a DGX Spark — both Qwen3-8B (frozen executor + LoRA-trained curator) over a markdown SkillRepo with BM25 retrieval, then extracting the pattern into `fieldkit.skills`.