Tag
#continued-pretraining
Articles tagged "continued-pretraining" — 1 entry.
Machine that Builds Machines
Continued Pre-training on a DGX Spark — NeMo Framework Without a Cluster
When does it make sense to continue pre-training on a single GB10 box, and when is it a category error? A planned run that uses a small domain corpus to push NeMo Framework, Megatron-LM parallelism, and BF16 mixed precision against the 128 GB unified-memory wall.