Tag

#deepseek-r1

Articles tagged "deepseek-r1" — 2 entries.

Article №44 fine-tuning NeMo 21 May 2026 ~16 hours wall (7h 34m Unsloth + 5h 38m NeMo + conversion + merge + probe)

Two Trainers, One LoRA: NeMo Framework Beats Unsloth by 26% on a Patent-Strategist Fine-Tune

Same recipe, same R1-distilled base, same 5000-row patent corpus — once via Unsloth, once via NeMo Framework + Megatron-Bridge. NeMo finishes 26% faster and produces 44% longer patent-strategic chains. The cost is one YARN-defaults landmine and a stdout that lied for four hours.

Article №41 fine-tuning Foundation 17 May 2026 ~10 hours (mostly automated overnight sweeps)

Three-Mode Bracket: Baselining a Reasoning Model Before Fine-Tuning, On One Spark

Before you fine-tune a small reasoning model on a domain bench, you need to know where it stands. Three context modes (closed, retrieval, oracle) triangulate the model's ceiling on one Spark, with no Judge backend or cluster required.