Quant · GGUF · 4 variants

kepler-gguf

Quantization of Qwen/Qwen3-8B.

HF: Orionfold/Kepler-GGUF · License: free, apache-2.0 · Published

What this model does

Orbital-mechanics questions ("A satellite orbits at 7,000 km altitude. What's its period?") want a single checkable number, and a frontier API is an expensive way to get one. Kepler is a greenfield Qwen3-8B vertical built gate-first on one Spark. The recommended Q8_0 variant scores 88.6% on the frozen astro-bench held-out set (n=44, boxed answers, ±2% tolerance) and 84.1% on the no-hint curveball set. The SFT-only method was chosen by cheap inference-time gates before any RL spend, and that choice is the build story behind the weights.
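The example question can be worked directly with Kepler's third law for a circular orbit, T = 2π·sqrt(a³/μ), which gives a feel for what "a single checkable number" means here. The sketch below is illustrative only: it assumes a circular Earth orbit with standard values for Earth's radius and gravitational parameter, and the helpers `extract_boxed` and `within_tolerance` are hypothetical names that read "boxed ±2%" as a 2% relative tolerance on the last \boxed{...} number. The actual astro-bench scorer may work differently.

```python
import math
import re

MU_EARTH_KM3_S2 = 398600.4418  # Earth's gravitational parameter, km^3/s^2
R_EARTH_KM = 6378.137          # Earth's equatorial radius, km


def circular_period_s(altitude_km: float) -> float:
    """Period of a circular Earth orbit from Kepler's third law."""
    a = R_EARTH_KM + altitude_km  # orbital radius = semi-major axis
    return 2 * math.pi * math.sqrt(a**3 / MU_EARTH_KM3_S2)


def extract_boxed(text: str):
    """Return the last \\boxed{...} number in a response, or None."""
    pattern = r"\\boxed\{\s*([-+]?\d[\d,]*\.?\d*(?:[eE][-+]?\d+)?)\s*\}"
    matches = re.findall(pattern, text)
    return float(matches[-1].replace(",", "")) if matches else None


def within_tolerance(answer, reference: float, rel_tol: float = 0.02) -> bool:
    """Assumed scoring rule: relative error of at most rel_tol."""
    if answer is None:
        return False
    return abs(answer - reference) <= rel_tol * abs(reference)


period_s = circular_period_s(7000.0)
print(f"Period: {period_s:.0f} s ({period_s / 60:.1f} min)")

# Score a made-up model response against the reference value.
response = r"The orbital period is \boxed{15400} seconds."
print(within_tolerance(extract_boxed(response), period_s))
```

A relative tolerance, rather than exact string matching, lets answers that differ only in rounding or in the constants used still count as correct.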

Use cases

  • Local numeric Q&A over astrodynamics and quantitative astrophysics — periods, transfers, orbital elements — with boxed, verifiable answers
  • A worked example of gate-driven method selection (base preflight → SFT gate → RL-headroom gate) for greenfield verticals
  • A measured four-variant quant ladder (Q4_K_M→Q8_0) for quality-vs-throughput tradeoffs on unified-memory hardware

Audience — Spark operators who want a $0 local astro reasoner with bench receipts, and builders studying how to decide SFT vs RLVR before spending the run.

Spec matrix

Ranks within each column drive the heatmap. Lower perplexity, higher throughput, and higher vertical eval are better; the sweet-spot row balances all three.

Vertical bench: astro-bench v0.1 held-out (n=44, \boxed ±2%)
Variant Perplexity Spark tok/s Vertical eval
Q4_K_M 33.06 0.75
Q5_K_M 28.06 0.75
Q6_K 24.60 0.84
Q8_0 (sweet spot) 21.07 0.89
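With 44 held-out items, each question moves the score by about 2.3 points, so differences between variants are worth reading alongside their sampling uncertainty. The sketch below computes a 95% Wilson score interval for a held-out accuracy. It assumes that 88.6% of n=44 corresponds to 39 correct answers, and `wilson_interval` is an illustrative helper, not part of any Kepler tooling.

```python
import math


def wilson_interval(correct: int, n: int, z: float = 1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    p = correct / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half


# Assumption: 88.6% on n=44 means 39 of 44 correct.
low, high = wilson_interval(39, 44)
print(f"accuracy {39 / 44:.3f}, 95% interval [{low:.3f}, {high:.3f}]")
```

At this sample size the interval is wide, so single-question gaps between variants are best treated as suggestive rather than conclusive.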

Methods

Read the field note: The Gate Before the GPU — Deciding SFT vs RL vs RLVR Before You Spend the Run

Building Kepler, a numeric astrodynamics reasoner, from scratch on one Spark. The method choice (SFT vs RL vs RLVR) is decided by cheap gates before any GPU run: a base preflight, an SFT gate, and a Goldilocks headroom gate. A flawless RLVR run that changed nothing is the proof.
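The three gates can be pictured as a short decision function. The sketch below is one guess at how such gating might be wired, not the procedure from the field note. Every threshold, field name, and return label is a hypothetical placeholder, and the "Goldilocks" check is interpreted here as the fraction of prompts whose sampled answers are neither all correct nor all wrong, since those are the prompts that give a verifiable-reward RL run something to learn from.

```python
from dataclasses import dataclass

# Hypothetical thresholds for illustration; not values from the Kepler build.
TARGET_ACC = 0.85       # accuracy at which no further training is needed
MIN_SFT_GAIN = 0.05     # smallest SFT improvement worth keeping
MIN_MIXED_FRAC = 0.30   # minimum share of "mixed outcome" prompts for RL


@dataclass
class GateInputs:
    base_acc: float    # base-model accuracy on the bench (preflight)
    sft_acc: float     # accuracy after a small SFT run
    mixed_frac: float  # share of prompts with some, but not all, samples correct


def mixed_fraction(pass_counts, k: int) -> float:
    """Share of prompts whose k samples are neither all right nor all wrong."""
    return sum(1 for c in pass_counts if 0 < c < k) / len(pass_counts)


def choose_method(g: GateInputs) -> str:
    if g.base_acc >= TARGET_ACC:
        return "base"           # preflight: the base model is already enough
    if g.sft_acc - g.base_acc < MIN_SFT_GAIN:
        return "revisit-data"   # SFT gate: fine-tuning is not moving the score
    if g.mixed_frac >= MIN_MIXED_FRAC:
        return "sft+rlvr"       # headroom gate: RL has signal to work with
    return "sft-only"           # little headroom: RL would likely change nothing


# Made-up numbers: SFT helps a lot, but few prompts have mixed outcomes.
print(choose_method(GateInputs(base_acc=0.50, sft_acc=0.88, mixed_frac=0.10)))
```

Each gate needs only inference or a small fine-tune, so under these assumptions the expensive RL run is attempted only when the last check finds headroom.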