kepler-gguf
Quantization of Qwen/Qwen3-8B.
What this model does
Orbital-mechanics questions (for example, "A satellite orbits at 7,000 km altitude. What is its period?") call for a single checkable number, and a frontier API is an expensive way to get one. Kepler is a greenfield Qwen3-8B vertical built gate-first on a single Spark. The recommended Q8_0 variant scores 88.6% on the frozen astro-bench held-out set (n=44, boxed answers scored within ±2%) and 84.1% on the no-hint curveball set. The SFT-only method was chosen by cheap inference-time gates before any RL spend; that selection process is the build story behind the weights.
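As a reference point for the example question above, the period can be checked by hand with Kepler's third law, T = 2π·sqrt(a³/μ). The sketch below is not part of the Kepler build. It assumes a circular orbit, reads "altitude" as height above Earth's equatorial radius (6378.137 km), and uses the standard gravitational parameter μ = 398600.4418 km³/s². The function name `circular_period_s` is hypothetical.

```python
import math

# Assumed constants (standard values, not taken from the model card).
MU_EARTH_KM3_S2 = 398600.4418   # Earth's gravitational parameter
R_EARTH_KM = 6378.137           # Earth's equatorial radius


def circular_period_s(altitude_km: float) -> float:
    """Period of a circular Earth orbit at the given altitude, in seconds."""
    a = R_EARTH_KM + altitude_km            # semi-major axis = orbital radius
    return 2.0 * math.pi * math.sqrt(a ** 3 / MU_EARTH_KM3_S2)


if __name__ == "__main__":
    period = circular_period_s(7000.0)
    print(f"period: {period:.0f} s ({period / 3600.0:.2f} h)")
```

Under these assumptions the period comes out at a little over four hours. Reading "7,000 km" as an orbital radius instead of an altitude gives a very different number, which is the kind of ambiguity a checkable single-number answer exposes.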
Use cases
- Local numeric Q&A over astrodynamics and quantitative astrophysics (periods, transfers, orbital elements) with boxed, verifiable answers
- A worked example of gate-driven method selection (base preflight → SFT gate → RL-headroom gate) for greenfield verticals
- A measured four-variant quant ladder (Q4_K_M→Q8_0) for quality-vs-throughput tradeoffs on unified-memory hardware
Audience: Spark operators who want a $0 local astro reasoner with bench receipts, and builders studying how to decide between SFT and RLVR before committing to a run.
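The use cases above lean on boxed, verifiable answers. The astro-bench grader itself is not shown in this card, so the following is only a minimal sketch of how such a check could work. It assumes the answer appears as a LaTeX `\boxed{...}` with a plain number inside, and it treats ±2% as a relative tolerance against the reference value. The names `extract_boxed` and `is_correct` are hypothetical.

```python
import re

# Matches \boxed{...} with no nested braces (a simplifying assumption).
BOXED = re.compile(r"\\boxed\{([^{}]*)\}")
NUMBER = re.compile(r"[-+]?\d*\.?\d+(?:[eE][-+]?\d+)?")


def extract_boxed(text):
    # Return the number inside the last boxed answer, or None if absent.
    matches = BOXED.findall(text)
    if not matches:
        return None
    found = NUMBER.search(matches[-1].replace(",", ""))
    return float(found.group()) if found else None


def is_correct(response, reference, rel_tol=0.02):
    # True when the boxed value is within rel_tol of the reference.
    value = extract_boxed(response)
    if value is None:
        return False
    return abs(value - reference) <= rel_tol * abs(reference)


if __name__ == "__main__":
    reply = r"The period is about \boxed{15400} s."
    print(is_correct(reply, 15399.4))
```

The simple pattern does not handle nested braces such as `\boxed{4.28\,\text{h}}`, and it ignores units; a real grader would need to settle both.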
Spec matrix
Ranks within each column drive the heatmap. Arrows mark the better direction: lower perplexity, higher throughput, higher vertical eval. The sweet-spot row marks the recommended balance. A dash means no value is reported.
| Variant | Perplexity ↓ | Spark tok/s ↑ | Vertical eval ↑ |
|---|---|---|---|
| Q4_K_M | — | 33.06 | 0.75 |
| Q5_K_M | — | 28.06 | 0.75 |
| Q6_K | — | 24.60 | 0.84 |
| Q8_0 (sweet spot) | — | 21.07 | 0.89 |
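To make the quality-versus-throughput tradeoff in the table concrete, the sketch below recomputes each variant's throughput relative to Q8_0 and converts the eval score into an approximate count of correct answers. It assumes the vertical eval column refers to the n=44 held-out set, which the card does not state explicitly; the counts are therefore estimates from rounded scores. The function name `summarize` is hypothetical.

```python
# (tok/s, vertical eval) copied from the spec matrix above.
LADDER = {
    "Q4_K_M": (33.06, 0.75),
    "Q5_K_M": (28.06, 0.75),
    "Q6_K": (24.60, 0.84),
    "Q8_0": (21.07, 0.89),
}
N_ITEMS = 44  # assumed: size of the held-out set behind the eval column


def summarize(ladder, n_items, baseline="Q8_0"):
    """Return per-variant speedup vs. the baseline and estimated correct count."""
    base_tps, _ = ladder[baseline]
    summary = {}
    for name, (tps, score) in ladder.items():
        summary[name] = {
            "speedup_vs_q8": tps / base_tps,
            "approx_correct": round(score * n_items),
        }
    return summary


if __name__ == "__main__":
    for name, row in summarize(LADDER, N_ITEMS).items():
        print(f"{name:8s} {row['speedup_vs_q8']:.2f}x "
              f"~{row['approx_correct']}/{N_ITEMS} correct")
```

With a set this small, each question is worth about 2.3 percentage points, so differences of one or two items between adjacent variants should be read with care.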