Quant · GGUF · 5 variants
finance-chat-gguf
GGUF quantizations of AdaptLLM/finance-chat.
Spec matrix
Heatmap shading follows each value's rank within its column. Lower perplexity, higher throughput, and a higher vertical eval score are better; the sweet-spot row balances all three.
| Variant | Perplexity ↓ | Spark tok/s ↑ | Vertical eval ↑ |
|---|---|---|---|
| Q4_K_M | 6.2215 | 31.09 | 0.14 |
| Q5_K_M | 6.1641 | 26.95 | 0.16 |
| Q6_K (sweet spot) | 6.1468 | 23.86 | 0.16 |
| Q8_0 | 6.1373 | 8.87 | 0.18 |
| F16 | 6.1373 | 11.51 | 0.18 |
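
The per-column ranking described above can be reproduced with a short Python sketch. The numbers are copied from the table; the tie handling (tied values share the best rank) and the helper name `column_ranks` are assumptions for illustration, and the page does not state how the sweet-spot row itself is chosen.

```python
# Values copied from the spec matrix: (perplexity, tok/s, vertical eval).
ROWS = {
    "Q4_K_M": (6.2215, 31.09, 0.14),
    "Q5_K_M": (6.1641, 26.95, 0.16),
    "Q6_K": (6.1468, 23.86, 0.16),
    "Q8_0": (6.1373, 8.87, 0.18),
    "F16": (6.1373, 11.51, 0.18),
}

# Direction of each column: perplexity is lower-is-better, the other two higher-is-better.
HIGHER_IS_BETTER = (False, True, True)


def column_ranks(rows, higher_is_better):
    """Rank every variant within each column (1 = best).

    Assumption: ties share the best rank (competition ranking), so a
    value's rank is 1 plus the number of strictly better values.
    """
    names = list(rows)
    ranks = {name: [] for name in names}
    for col, higher in enumerate(higher_is_better):
        values = [rows[name][col] for name in names]
        for name in names:
            v = rows[name][col]
            better = sum(
                1 for other in values if (other > v if higher else other < v)
            )
            ranks[name].append(better + 1)
    return ranks


RANKS = column_ranks(ROWS, HIGHER_IS_BETTER)

for name, (ppl, tps, ev) in RANKS.items():
    print(f"{name:8s} perplexity={ppl} throughput={tps} eval={ev}")
```

Under these assumptions Q6_K ranks third in every column, with no weak column, which is consistent with its sweet-spot label. A plain sum of ranks would not single it out, so the label likely reflects criteria not shown in this table.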