Quantizations — Artifacts

Quant GGUF 2 variants 10 Jun 2026

advisor-gguf

A governed 4B advisor over your corpus — exact source-id citations, trusted refusals, running locally on a DGX Spark

Quantization of nvidia/NVIDIA-Nemotron-3-Nano-4B-BF16. Best on advisor curveball-v0.2, a frozen out-of-distribution (OOD) bench (n=21, scored==strict; refusals 9/9, 0 private-state risk): Q4_K_M (0.86).

free other

Quant GGUF 4 variants 05 Jun 2026

kepler-gguf

A numeric astrodynamics reasoner — one verifiable boxed number out, served locally on a DGX Spark for $0 a query

Quantization of Qwen/Qwen3-8B. Best on astro-bench v0.1 held-out (n=44, \boxed ±2%): Q8_0 (0.89).

free apache-2.0

Quant trained with nemo GGUF 4 variants 22 May 2026

patent-strategist-v3-nemo-gguf

Offline patent-prosecution reasoning on Spark-class hardware

Quantization of deepseek-ai/DeepSeek-R1-0528-Qwen3-8B.

free apache-2.0

Quant GGUF 5 variants 16 May 2026

ii-medical-8b-gguf

An 8B medical-reasoning model with a visible think-chain, quantized for offline clinical Q&A

Quantization of Intelligent-Internet/II-Medical-8B. Best on MedMCQA (n=50, mcq_letter): Q5_K_M (0.52).

free apache-2.0

Quant GGUF 5 variants 15 May 2026

securityllm-gguf

A 7B cybersecurity chat model, quantized to run offline on a consumer GPU

Quantization of ZySec-AI/SecurityLLM. Best on CyberMetric (n=50, mcq_letter): Q4_K_M (0.40).

free apache-2.0

Quant GGUF 5 variants 14 May 2026

saul-7b-instruct-v1-gguf

A 7B legal-domain chat model, quantized to run offline on a consumer GPU

Quantization of Equall/Saul-7B-Instruct-v1. Best on LegalBench (n=50, contains): Q5_K_M (0.72).

free mit

Quant GGUF 5 variants 14 May 2026

finance-chat-gguf

A finance-specialized 7B chat model, quantized to run offline on a 4 GB consumer GPU

Quantization of AdaptLLM/finance-chat. Best on FinanceBench (n=50, numeric_match): Q6_K (0.16).

free