quant · finance
finance-chat-gguf
| Variant | Perplexity | tok/s | FinanceBench (n=50, numeric_match) |
|---|---|---|---|
| Q4_K_M | 6.221 | 31.1 | 0.14 |
| Q5_K_M | 6.164 | 26.9 | 0.16 |
| Q6_K | 6.147 | 23.9 | 0.16 |
| Q8_0 | 6.137 | 8.9 | 0.18 |
| F16 sweet spot | 6.137 | 11.5 | 0.18 |
Lower perplexity is better. Throughput (tok/s) was measured on the DGX Spark (GB10, 128 GB unified memory).
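To make the trade-off in the table easier to read, the sketch below expresses each variant relative to the F16 row: perplexity increase in percent, throughput ratio, and the number of correct FinanceBench answers implied by the score at n=50. The numbers are copied from the table; the names `RESULTS` and `compare_to_baseline` are illustrative and not part of any tooling shipped with this repo, and choosing F16 as the baseline is an assumption.

```python
# Values copied from the table: (perplexity, tok/s, FinanceBench numeric_match score).
RESULTS = {
    "Q4_K_M": (6.221, 31.1, 0.14),
    "Q5_K_M": (6.164, 26.9, 0.16),
    "Q6_K": (6.147, 23.9, 0.16),
    "Q8_0": (6.137, 8.9, 0.18),
    "F16": (6.137, 11.5, 0.18),
}
N_QUESTIONS = 50  # FinanceBench sample size stated in the table header


def compare_to_baseline(results, baseline="F16", n=N_QUESTIONS):
    """Return per-variant deltas relative to the baseline row (hypothetical helper)."""
    base_ppl, base_tps, _ = results[baseline]
    rows = {}
    for name, (ppl, tps, score) in results.items():
        rows[name] = {
            # Relative perplexity increase over the baseline, in percent.
            "ppl_increase_pct": round(100 * (ppl - base_ppl) / base_ppl, 2),
            # Throughput ratio versus the baseline (>1 means faster).
            "speedup": round(tps / base_tps, 2),
            # Number of correct answers implied by the score at n questions.
            "correct": round(score * n),
        }
    return rows


if __name__ == "__main__":
    for name, row in compare_to_baseline(RESULTS).items():
        print(name, row)
```

On these numbers, Q4_K_M sits about 1.37% above F16 in perplexity at roughly 2.7x the throughput, and the FinanceBench scores of 0.14, 0.16, and 0.18 correspond to 7, 8, and 9 correct answers out of 50, so adjacent variants differ by at most one question.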