Open Pangram Collection Open models and datasets based on Pangram's ICLR 2026 EditLens paper, licensed for noncommercial use ONLY under CC BY-NC-SA 4.0 • 4 items • Updated 8 days ago • 13
The Smol Training Playbook 📚 The secrets to building world-class LLMs • 3.14k
Suri Collection Models and dataset for Suri: Multi-Constraint Instruction Following for Long-form Generation • 4 items • Updated Oct 3, 2025 • 1
CLIPPER Collection Models and datasets for CLIPPER: Compression enables long-context synthetic data generation • 7 items • Updated Oct 3, 2025 • 5
Frankentext: Stitching random text fragments into long-form narratives Paper • 2505.18128 • Published May 23, 2025 • 4
From Hours to Minutes: Lossless Acceleration of Ultra Long Sequence Generation up to 100K Tokens Paper • 2502.18890 • Published Feb 26, 2025 • 30
CLIPPER: Compression enables long-context synthetic data generation Paper • 2502.14854 • Published Feb 20, 2025 • 11