Tag

#reasoning

Articles tagged "reasoning" — 2 entries.

Article №40 · deployment · llama.cpp · 16 May 2026 · ~5 hours end-to-end on a DGX Spark

Orionfold/II-Medical-8B-GGUF on Spark — five medical-reasoning variants, MedMCQA mini-eval, ChatML reasoning format

Five GGUF variants of Intelligent-Internet/II-Medical-8B (Qwen3-8B + DAPO reasoning recipe) measured on a DGX Spark. Q5_K_M lands at 36.4 tok/s, 5.45 GB, and 52% on a MedMCQA n=50 mini-eval, a score above F16's. First reasoning recipe in the series.

uses fieldkit.quant, fieldkit.publish, fieldkit.eval, fieldkit.lineage

Article №29 · inference · Foundation · 02 May 2026 · ~2 hours, most of it watching vLLM 0.20 build inside an NGC PyTorch container; the runtime+drift diagnosis that follows is the short, sharp half

Frontier Scout

Test-Time Distilling on Spark — Same Compute Envelope, Wider Semantic Reach

ESamp adds a tiny test-time-trained probe to vLLM that converts decoding from lexical resampling into semantic exploration. The runtime is vLLM-native — and that is a Spark catalog-gap story before it is a benchmark.

uses fieldkit.eval, fieldkit.capabilities