Tag

#lora

Articles tagged "lora" — 12 entries.

Article №44 fine-tuning NeMo 21 May 2026 ~16 hours wall (7h 34m Unsloth + 5h 38m NeMo + conversion + merge + probe)

Two Trainers, One LoRA: NeMo Framework Beats Unsloth by 26% on a Patent-Strategist Fine-Tune

Same recipe, same R1-distilled base, same 5000-row patent corpus — once via Unsloth, once via NeMo Framework + Megatron-Bridge. NeMo finishes 26% faster and produces 44% longer patent-strategic chains. The cost is one YARN-defaults landmine and a stdout that lied for four hours.

Article №43 fine-tuning Foundation 19 May 2026 ~1 hour (one container, six gates, two GGUFs)

Machine that Builds Machines

Unsloth on the Spark — When the Train-Time Peak Equals the Base-Load Peak

Six gates clear in one container against the v1 reset: pip install --no-deps preserves the s40 stack, FastLanguageModel loads at 16.94 GB peak, a 100-step LoRA train holds the same envelope, save_pretrained_gguf() emits both quants in 207 seconds end-to-end.

Article №42 fine-tuning Foundation 19 May 2026 ~12 hours (2× 131-min trains + diagnosis)

Machine that Builds Machines

The Trainer Was Fine, the Corpus Wasn't: Three Misdiagnoses on a Patent-Specialist Fine-Tune

Five thousand rows of synthetic patent reasoning, two clean 131-minute LoRA trains, three rounds of confident diagnosis — and none of them found the bug. The bug was the corpus all along. A field report on the cheapest mistake to make on the Spark.

Article №34 fine-tuning NeMo 09 May 2026 ~18.5 hours wall (50 T²PO steps + three evals)

Frontier Scout

T²PO on Spark — When the Training Pool Says 28/32 and Held-out Says 9/158

T²PO's two deltas on the Phase 6 ClawGym harness: mean turns 5.00 → 4.61, task_complete 154/158, but the per-assertion ceiling stays flat at 47.7%. The strongest training-side step (45) is the worst held-out checkpoint — pool saturation lies on a single Spark.

uses fieldkit.capabilitiesfieldkit.evalfieldkit.training

Article №33 fine-tuning NeMo 05 May 2026 ~9 hours wall (34 GRPO steps + two evals)

Frontier Scout

ClawGym GRPO on Spark — Closing the Loop the SFT Adapter Couldn't

Phase 5 SFT taught the agent to keep working but never to stop. 34 GRPO steps with a shaped reward unlearn the failure mode — same model, same base, same LoRA-init, but task_complete climbs 0/158 → 154/158, mean turns drop 12 → 5, and per-assertion still inches up +3.1 pp.

Article №32 fine-tuning NeMo 05 May 2026 ~3 days end-to-end (mostly waiting on rollouts)

Frontier Scout

ClawGym on Spark — A 7B Base, A LoRA Adapter, and the +15 pp the Adapter Earned

ClawGym shipped only a .github profile, so we built the substrate ourselves — persona task synth, sandbox harness, 200-task corpus, LoRA SFT, matched-base eval. The adapter earns +3.8 pp task pass and +15.0 pp per-assertion against its own base. The diagnostic is the lift.

uses fieldkit.nim

Article №25 fine-tuning NeMo Customizer 01 May 2026 ~2 hours wall — 4 min LoRA training, 4 min race, the rest writing

Machine that Builds Machines

Distilling the Architect — A 3B LoRA Trained on the Agent's Own Trajectory

A4's 50-iter trajectory becomes training data for a Qwen2.5-3B LoRA proposer. Holding out 8 iters, the 3B mode-collapses onto d_model=768 (the trajectory's most-frequent keep) and matches 0 / 8 exact; the 8B at T=0.5 matches 4 / 8 of its own past picks.

Article №23 foundations Foundation 25 Apr 2026 ~15 minute read · no GPU required

Looking Beyond Spark

What the Agent Actually Built — Five Articles in Plain English, and Why You Probably Don't Want to Train From Scratch

Five technical articles in one day built an unattended AI research loop on a desk for $0.02 of electricity. The plain-English readout: what the agent built (not a usable model), what it changes for one person, and a four-tier roadmap from LoRA in minutes to from-scratch in weeks.

Article №16 foundations Foundation 23 Apr 2026 ~25 minute read

Looking Beyond Spark

Looking Beyond Spark — Fine-Tuning a 100B Nemotron

A working answer to: how many GPUs to fine-tune a 100B Nemotron? Three methods, three memory footprints — full FT ≈ 1.6 TB needs 24× H100; LoRA ≈ 250 GB fits 8× H100; QLoRA ≈ 65 GB fits 1× H200. The Spark's 3B LoRA teaches the math.

uses fieldkit.capabilities

Article №14 fine-tuning Hugging Face PEFT + Qwen2.5-3B-Instruct 23 Apr 2026 ~45 minutes end-to-end — 5 min corpus via NIM 8B, 69 s training, 3 min benchmark, plus a 6 GB base-model download

Second Brain

LoRA on Your Own Q&A — What 231 Pairs Actually Teach a 3B Model

231 own-voice Q&A pairs, a rank-16 LoRA, 69 s of training on a GB10 Spark. The adapter won't memorize your exact numbers, but it will take a model that refuses 61% of questions about your work and turn it into one that answers all of them in your voice. For facts you still need RAG.

uses fieldkit.eval

Upcoming fine-tuning NeMo Customizer + Nemotron Nano 9B v2 planned ~4 hours per sweep

LLM Wiki

LoRA on Nemotron Nano — Fine-tuning a 9B Without Blowing Unified Memory

A planned walk through LoRA fine-tuning on Nemotron Nano 9B with NeMo Customizer: rank and alpha sweeps, a tiny domain corpus, and the memory accounting that keeps a PEFT run from tripping the Spark's 128 GB unified-memory wall.

Upcoming agentic NemoClaw ~30 min read

Machine that Builds Machines

SkillOS: Learning Skill Curation for Self-Evolving Agents — Spark reproduction notes

Reproducing the SkillOS curator/executor split on a DGX Spark — both Qwen3-8B (frozen executor + LoRA-trained curator) over a markdown SkillRepo with BM25 retrieval, then extracting the pattern into `fieldkit.skills`.