Notebook · IPYNB · builder · user

saul-7b-instruct-v1-notebooks

Build the Saul-7B quant — and call the legal model — on a Spark or a free cloud GPU


What this notebook does

The artifact → card → article loop sells the outcome but offers no runnable on-ramp: a researcher who wants to reproduce the five-variant quant, or a developer who wants to call the model, has to reconstruct the journey from prose. These two notebooks close that gap. The builder notebook walks the feasibility → quantize → measure → publish journey as typed fieldkit API calls; the user notebook calls Saul on real legal-classification tasks. Both are one-click via Open in Colab / Open in Kaggle and run offline on a DGX Spark — privileged legal text never leaves the box.

Use cases

Audience: AI researchers and engineers who want to reproduce the quant, and legal-tech developers, compliance teams, and litigators who want a private offline legal assistant. Both groups can run on Spark-class hardware (GB10, 128 GB unified memory) or a free cloud GPU.

Choosing the variant

Two facets of the same notebook — pick by your goal.

builder
Walks the build journey on Spark — fieldkit API calls replacing ad-hoc scripts; surfaces speed, feasibility, and viability.
user
Demonstrates the published model on realistic domain tasks — runtime-detected, runs on Spark or on a free Colab/Kaggle GPU.
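
As a rough illustration of what "runtime-detected" could look like, the sketch below picks a runtime label from environment hints and maps it to the quant variant each path serves (Q4_K_M on Colab/Kaggle, Q5_K_M on Spark). This is not the notebook's actual code: the function names are hypothetical, the `COLAB_RELEASE_TAG` and `KAGGLE_KERNEL_RUN_TYPE` environment variables are assumed to be set on those platforms, and any other machine is assumed to be the Spark.

```python
import os
import sys


def detect_runtime(environ=None, modules=None):
    """Return 'colab', 'kaggle', or 'local' from environment hints.

    Hypothetical helper: the hints checked here are assumptions about
    the hosted platforms, not something the notebooks document.
    """
    environ = os.environ if environ is None else environ
    modules = sys.modules if modules is None else modules
    if "google.colab" in modules or "COLAB_RELEASE_TAG" in environ:
        return "colab"
    if "KAGGLE_KERNEL_RUN_TYPE" in environ:
        return "kaggle"
    # Assumption: anything that is not a hosted notebook is the Spark box.
    return "local"


# Variant served on each path: Q4_K_M in the cloud, Q5_K_M on Spark.
VARIANT_BY_RUNTIME = {
    "colab": "Q4_K_M",
    "kaggle": "Q4_K_M",
    "local": "Q5_K_M",
}


def pick_variant(runtime):
    """Map a runtime label to the quant variant that path serves."""
    return VARIANT_BY_RUNTIME[runtime]


if __name__ == "__main__":
    runtime = detect_runtime()
    print(f"runtime={runtime} variant={pick_variant(runtime)}")
```

Passing `environ` and `modules` explicitly keeps the detection testable without touching the real process environment.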

Methods

Read the field note
Orionfold/Saul-7B-Instruct-v1-GGUF on Spark: five legal variants, LegalBench mini-eval, four-axis measurement card
Five GGUF variants of Equall/Saul-7B-Instruct-v1 measured on a DGX Spark. Q5_K_M scores 72% on LegalBench (n=50, contains) at 20 tok/s and 4.8 GB. Each card carries perplexity, sustained tok/s, thermal envelope, and a 5-task LegalBench subset score.
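
The "contains" label on the LegalBench score suggests a substring-match metric: an answer counts as correct when the expected label appears in the model output. A minimal sketch of such a scorer follows, assuming case-insensitive matching and whitespace trimming (neither is confirmed by the field note). The function names and sample strings are illustrative only and are not drawn from the eval.

```python
def contains_match(output, label):
    """True when the expected label appears anywhere in the model output.

    Assumption: matching is case-insensitive and ignores outer whitespace.
    """
    return label.strip().lower() in output.strip().lower()


def contains_accuracy(outputs, labels):
    """Fraction of outputs that contain their expected label."""
    if len(outputs) != len(labels):
        raise ValueError("outputs and labels must have the same length")
    if not labels:
        raise ValueError("need at least one example to score")
    hits = sum(contains_match(o, l) for o, l in zip(outputs, labels))
    return hits / len(labels)


if __name__ == "__main__":
    # Made-up examples, not real eval data.
    sample_outputs = [
        "Yes, the clause is a non-compete.",
        "No.",
        "The answer is unclear.",
    ]
    sample_labels = ["yes", "no", "yes"]
    print(f"{contains_accuracy(sample_outputs, sample_labels):.2f}")
```

Substring matching is lenient: a short label such as "no" also matches inside longer words like "not", so scores from a scorer like this are best read alongside the raw outputs.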

Known drift

Bounded limitations — Colab/Kaggle runs use the published quant; reasoning quality may differ from the BF16 weights on Spark. Each entry carries an explicit bound.

Cloud (Colab / Kaggle) path serves the Q4_K_M quant; the Spark path serves Q5_K_M
The two paths are one quant level apart, and the legal benchmark is where the gap is widest: Q4_K_M scores 62% on the LegalBench n=50 mini-eval versus 72% for Q5_K_M (a 10-point gap, recoverable in a review-in-the-loop flow). Both run the identical code path. See the sibling GGUF card.
The builder notebook's quantize + publish steps render the recorded Spark run, not a live re-execution
Two recorded Spark-only cells (the quantize sweep and the publish dry-run). The remaining cells (the feasibility envelope, the spark_quad panel, and the variants table) run live on any runtime from the manifest.
The user notebook's live model-chat cells are not captured in the published marketing snapshot
Four use-case cells call the model live on any runtime; the snapshot captures the deterministic charts and banners and describes the chat output rather than screenshotting it.
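
To make the first bound above concrete, the sketch below converts the two reported scores into item counts. It assumes n=50 is the total number of scored items across the five tasks, which the page does not state explicitly; the helper name is hypothetical.

```python
def correct_count(score_pct, n):
    """Number of correct items implied by an integer percentage score."""
    return score_pct * n // 100


N_ITEMS = 50  # assumption: n=50 counts all scored items, not items per task

q4_correct = correct_count(62, N_ITEMS)  # Q4_K_M, cloud path
q5_correct = correct_count(72, N_ITEMS)  # Q5_K_M, Spark path

gap_items = q5_correct - q4_correct
gap_points = 72 - 62

if __name__ == "__main__":
    print(f"Q4_K_M: {q4_correct}/{N_ITEMS}, Q5_K_M: {q5_correct}/{N_ITEMS}")
    print(f"gap: {gap_items} items ({gap_points} points)")
```

Under that assumption the 10-point gap corresponds to five items out of fifty, which is a small enough sample that a handful of borderline answers could move the comparison.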

Sibling artifacts

The model this notebook targets, plus other variants in the same family.