Tag

#literature-search

Articles tagged "literature-search" — 1 entry.

Article №28 · observability · NIM · ~3 hours: 30 min plumbing, ~20 min for the runs themselves, the rest is reading what they show
Frontier Scout

AutoResearchBench on Spark — Two NIMs, One Bench, Two Failure Modes

Two Spark-tuned NIMs run AutoResearchBench's three Deep-Research example questions. Llama-3.1-8B crashes by turn 5-6 on its 8K context; Nemotron-Nano-9B-v2 finishes cleanly at 128K. Both score 0% Accuracy@1 — for completely different reasons.

uses fieldkit.nim, fieldkit.eval, fieldkit.capabilities