Quant · GGUF · 5 variants
ii-medical-8b-gguf
Quantization of Intelligent-Internet/II-Medical-8B.
Spec matrix
Ranks within each column drive the heatmap. Lower perplexity, higher throughput, and a higher vertical eval score are better; the sweet-spot row balances all three.
| Variant | Perplexity ↓ | Spark tok/s ↑ | Vertical eval ↑ |
|---|---|---|---|
| Q4_K_M | 16.5500 | 43.57 | 0.42 |
| Q5_K_M (sweet spot) | 16.2418 | 36.36 | 0.52 |
| Q6_K | 16.0139 | 32.80 | 0.46 |
| Q8_0 | 16.2957 | 28.42 | 0.48 |
| F16 | 16.2676 | 15.94 | 0.48 |
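
One way to read "balances all three" is a simple rank sum: rank each variant within each column (1 is best) and add the ranks. The sketch below is only an illustration of that idea. The page does not say how the sweet spot is actually chosen, so the equal-weight rank sum, the tie handling, and the helper names `rank` and `rank_sums` are all assumptions; only the numbers come from the table.

```python
# Values copied from the spec matrix: (perplexity, tok/s, vertical eval).
VARIANTS = {
    "Q4_K_M": (16.5500, 43.57, 0.42),
    "Q5_K_M": (16.2418, 36.36, 0.52),
    "Q6_K":   (16.0139, 32.80, 0.46),
    "Q8_0":   (16.2957, 28.42, 0.48),
    "F16":    (16.2676, 15.94, 0.48),
}


def rank(values, higher_is_better):
    """Competition ranking: 1 is best, tied values share the same rank."""
    ordered = sorted(values, reverse=higher_is_better)
    return [ordered.index(v) + 1 for v in values]


def rank_sums(variants):
    """Sum each variant's per-column ranks (assumed equal weighting)."""
    names = list(variants)
    perplexity, throughput, vertical = zip(*variants.values())
    columns = [
        rank(perplexity, higher_is_better=False),  # lower perplexity is better
        rank(throughput, higher_is_better=True),   # higher tok/s is better
        rank(vertical, higher_is_better=True),     # higher eval score is better
    ]
    return {name: sum(col[i] for col in columns) for i, name in enumerate(names)}


sums = rank_sums(VARIANTS)
best = min(sums, key=sums.get)  # lowest rank sum = best overall balance
print(sums)
print("lowest rank sum:", best)
```

Under these assumptions the rank sums are 11 (Q4_K_M), 5 (Q5_K_M), 8 (Q6_K), 10 (Q8_0), and 10 (F16), so the lowest sum falls on Q5_K_M, the same row the table marks as the sweet spot. A different weighting of the three columns could change that ordering.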