quant · legal
saul-7b-instruct-v1-gguf
| Variant | Perplexity | tok/s | LegalBench (n=50, contains) |
|---|---|---|---|
| Q4_K_M | 5.986 | 29.4 | 0.62 |
| Q5_K_M (sweet spot) | 5.938 | 20.2 | 0.72 |
| Q6_K | 5.925 | 22.4 | 0.68 |
| Q8_0 | 5.914 | 7.3 | 0.66 |
| F16 | 5.917 | 10.9 | 0.68 |
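Lower perplexity is better. tok/s was measured on the DGX Spark (GB10, 128 GB unified memory). The LegalBench column is the fraction of 50 prompts whose output contains the expected answer, so higher is better.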
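
For readers weighing the LegalBench column, here is a minimal sketch of how a "contains" score and its sampling noise at n=50 could be computed. It assumes "contains" means a case-insensitive substring match of the gold answer in the model output; the function names are hypothetical and this is not the harness that produced the table.

```python
import math

def contains_match(prediction: str, gold: str) -> bool:
    # Assumption: "contains" = case-insensitive substring match.
    return gold.strip().lower() in prediction.strip().lower()

def contains_accuracy(predictions, golds):
    # Fraction of examples whose prediction contains the gold answer.
    hits = sum(contains_match(p, g) for p, g in zip(predictions, golds))
    return hits / len(golds)

def binomial_stderr(accuracy: float, n: int) -> float:
    # Standard error of a proportion estimated from n independent examples.
    return math.sqrt(accuracy * (1.0 - accuracy) / n)

# LegalBench scores copied from the table above (n=50 each).
scores = {"Q4_K_M": 0.62, "Q5_K_M": 0.72, "Q6_K": 0.68, "Q8_0": 0.66, "F16": 0.68}
stderrs = {name: binomial_stderr(acc, 50) for name, acc in scores.items()}

for name, acc in scores.items():
    print(f"{name}: {acc:.2f} +/- {stderrs[name]:.3f}")
```

With 50 examples, the standard error on each row comes out at roughly 0.06 to 0.07, so gaps of a few points between variants sit within sampling noise.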