Quantizations.
Open GGUF quants of domain-specific (vertical) fine-tuned models, each shipped with a Spark-measured four-axis card so a downstream reader can pick the right variant without re-running the eval. 6 published.
kepler-gguf
Quantization of Qwen/Qwen3-8B. Best on astro-bench v0.1 held-out (n=44, \boxed ±2%): Q8_0 (0.89).
patent-strategist-v3-nemo-gguf
Offline patent-prosecution reasoning on Spark-class hardware
Quantization of deepseek-ai/DeepSeek-R1-0528-Qwen3-8B.
ii-medical-8b-gguf
An 8B medical-reasoning model with a visible think-chain, quantized for offline clinical Q&A
Quantization of Intelligent-Internet/II-Medical-8B. Best on MedMCQA (n=50, mcq_letter): Q5_K_M (0.52).
securityllm-gguf
A 7B cybersecurity chat model, quantized to run offline on a consumer GPU
Quantization of ZySec-AI/SecurityLLM. Best on CyberMetric (n=50, mcq_letter): Q4_K_M (0.40).
saul-7b-instruct-v1-gguf
A 7B legal-domain chat model, quantized to run offline on a consumer GPU
Quantization of Equall/Saul-7B-Instruct-v1. Best on LegalBench (n=50, contains): Q5_K_M (0.72).
finance-chat-gguf
A finance-specialized 7B chat model, quantized to run offline on a 4 GB consumer GPU
Quantization of AdaptLLM/finance-chat. Best on FinanceBench (n=50, numeric_match): Q6_K (0.16).
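The eval sets quoted above are small (n=44 to n=50), so it can help to see roughly how much sampling noise sits behind each "best" score when comparing variants. The following is a minimal sketch, not part of the published cards: it assumes each reported score is a plain fraction of items answered correctly, and it uses the normal-approximation standard error of a proportion. The helper name `std_error` and the `REPORTED` list are hypothetical; the numbers are copied from the entries above.

```python
import math

# (repo, best variant, reported score, n) copied from the listing above.
# Assumption: each score is the fraction of eval items answered correctly.
REPORTED = [
    ("kepler-gguf", "Q8_0", 0.89, 44),
    ("ii-medical-8b-gguf", "Q5_K_M", 0.52, 50),
    ("securityllm-gguf", "Q4_K_M", 0.40, 50),
    ("saul-7b-instruct-v1-gguf", "Q5_K_M", 0.72, 50),
    ("finance-chat-gguf", "Q6_K", 0.16, 50),
]


def std_error(score: float, n: int) -> float:
    """Normal-approximation standard error of a proportion (hypothetical helper)."""
    return math.sqrt(score * (1.0 - score) / n)


for repo, variant, score, n in REPORTED:
    se = std_error(score, n)
    print(f"{repo:26s} {variant:7s} {score:.2f} +/- {se:.3f} (n={n})")
```

Under these assumptions, a gap between two variants that is smaller than about one standard error is hard to distinguish from noise at this sample size.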