Quantizations.

Open GGUF quants of vertical-finetuned base models, each shipped with a Spark-measured four-axis card so a downstream reader can pick the right variant without re-running the eval. 6 published.

Quant GGUF 4 variants

kepler-gguf

Quantization of Qwen/Qwen3-8B. Best on astro-bench v0.1 held-out (n=44, \boxed ±2%): Q8_0 (0.89).

free apache-2.0
Quant trained with nemo GGUF 4 variants

patent-strategist-v3-nemo-gguf

Offline patent-prosecution reasoning on Spark-class hardware

Quantization of deepseek-ai/DeepSeek-R1-0528-Qwen3-8B.

free apache-2.0
Quant GGUF 5 variants

ii-medical-8b-gguf

An 8B medical-reasoning model with a visible think-chain, quantized for offline clinical Q&A

Quantization of Intelligent-Internet/II-Medical-8B. Best on MedMCQA (n=50, mcq_letter): Q5_K_M (0.52).

free apache-2.0
Quant GGUF 5 variants

securityllm-gguf

A 7B cybersecurity chat model, quantized to run offline on a consumer GPU

Quantization of ZySec-AI/SecurityLLM. Best on CyberMetric (n=50, mcq_letter): Q4_K_M (0.40).

free apache-2.0
Quant GGUF 5 variants

saul-7b-instruct-v1-gguf

A 7B legal-domain chat model, quantized to run offline on a consumer GPU

Quantization of Equall/Saul-7B-Instruct-v1. Best on LegalBench (n=50, contains): Q5_K_M (0.72).

free mit
Quant GGUF 5 variants

finance-chat-gguf

A finance-specialized 7B chat model, quantized to run offline on a 4 GB consumer GPU

Quantization of AdaptLLM/finance-chat. Best on FinanceBench (n=50, numeric_match): Q6_K (0.16).

free
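
The "Best on" scores on the cards above come from small held-out sets (n=44 or n=50), so it can help to see how wide the sampling uncertainty is before treating one variant as the winner. The following is a rough sketch, not part of the published cards: it assumes each benchmark item is an independent pass/fail trial and uses the Wilson score interval, a standard choice for small samples that the cards themselves do not report. The function name `wilson_interval` and the `cards` list are illustrative; the numbers are copied from the cards.

```python
import math
from typing import Tuple


def wilson_interval(accuracy: float, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion; z=1.96 gives roughly 95% coverage."""
    denom = 1 + z * z / n
    centre = (accuracy + z * z / (2 * n)) / denom
    half = z * math.sqrt(accuracy * (1 - accuracy) / n + z * z / (4 * n * n)) / denom
    return centre - half, centre + half


# (model, best variant, reported accuracy, n) as listed on the cards.
# Assumption: scores are plain accuracies over n independent items.
cards = [
    ("kepler-gguf", "Q8_0", 0.89, 44),
    ("ii-medical-8b-gguf", "Q5_K_M", 0.52, 50),
    ("securityllm-gguf", "Q4_K_M", 0.40, 50),
    ("saul-7b-instruct-v1-gguf", "Q5_K_M", 0.72, 50),
    ("finance-chat-gguf", "Q6_K", 0.16, 50),
]

for name, variant, acc, n in cards:
    low, high = wilson_interval(acc, n)
    print(f"{name:26s} {variant:7s} acc={acc:.2f}  95% CI=({low:.2f}, {high:.2f})")
```

At n=50 the interval spans roughly a quarter of the accuracy scale around mid-range scores, so a gap of a few points between two variants of the same model may not be meaningful on its own.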
More artifacts in preparation (end of Jun 2026).