quant · medical
ii-medical-8b-gguf
| Variant | Perplexity | tok/s | MedMCQA accuracy (n=50, mcq_letter) |
|---|---|---|---|
| Q4_K_M | 16.550 | 43.6 | 0.42 |
| Q5_K_M (sweet spot) | 16.242 | 36.4 | 0.52 |
| Q6_K | 16.014 | 32.8 | 0.46 |
| Q8_0 | 16.296 | 28.4 | 0.48 |
| F16 | 16.268 | 15.9 | 0.48 |
Lower perplexity is better; tok/s was measured on the DGX Spark (GB10, 128 GB unified memory).
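To make the trade-off in the table easier to read, the rows can be compared against the F16 row. The snippet below is a minimal sketch, not part of the original benchmark: the function name `compare_to_baseline` and the choice of F16 as baseline are assumptions made for illustration, and the MedMCQA column is assumed to be a fraction correct out of 50 questions. The numbers are copied from the table.

```python
# Rows copied from the table above: (variant, perplexity, tok/s, MedMCQA score).
ROWS = [
    ("Q4_K_M", 16.550, 43.6, 0.42),
    ("Q5_K_M", 16.242, 36.4, 0.52),
    ("Q6_K", 16.014, 32.8, 0.46),
    ("Q8_0", 16.296, 28.4, 0.48),
    ("F16", 16.268, 15.9, 0.48),
]


def compare_to_baseline(rows, baseline="F16"):
    """Return per-variant speedup and perplexity delta relative to the baseline row.

    Hypothetical helper for illustration; F16 as baseline is an assumption.
    """
    _, base_ppl, base_tps, _ = next(r for r in rows if r[0] == baseline)
    result = {}
    for name, ppl, tps, score in rows:
        result[name] = {
            "speedup": tps / base_tps,          # >1 means faster than baseline
            "ppl_delta": ppl - base_ppl,        # <0 means lower (better) perplexity
            "correct_of_50": round(score * 50),  # assumes score is a fraction of n=50
        }
    return result


if __name__ == "__main__":
    for name, stats in compare_to_baseline(ROWS).items():
        print(
            f"{name:8s} {stats['speedup']:.2f}x  "
            f"ppl {stats['ppl_delta']:+.3f}  "
            f"{stats['correct_of_50']}/50"
        )
```

With only 50 questions, each MedMCQA question is worth 0.02, so neighbouring variants differ by just a few answers.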