Datasets.
Curated and synthesized training and eval datasets, each one bound to its source corpus, shape-counted, and license-tagged. 0 published.
No standalone dataset artifacts yet.
The benchmarks double as eval datasets; see the per-benchmark detail pages for shape, samples, and loading instructions.