The Reward Was in Your Data All Along: Correcting Flow Matching with Discriminator-Guided RL Paper • 2606.19162 • Published 6 days ago • 20
electricsheepafrica/africa-owid-eat-lancet-diet-comparison Viewer • Updated 19 days ago • 180 • 35 • 1
On the Scaling of PEFT: Towards Million Personal Models of Trillion Parameters Paper • 2606.02437 • Published 22 days ago • 232
Measuring the Depth of LLM Unlearning via Activation Patching Paper • 2605.24614 • Published May 23 • 8
OpenComputer: Verifiable Software Worlds for Computer-Use Agents Paper • 2605.19769 • Published May 19 • 85
SOD: Step-wise On-policy Distillation for Small Language Model Agents Paper • 2605.07725 • Published May 8 • 25
LINGESH-7/tinyllama-bnb-4bit-FT-on-yahma-alpaca-cleaned Text Generation • Updated about 1 month ago • 5 • 1
OSCAR: Offline Spectral Covariance-Aware Rotation for 2-bit KV Cache Quantization Paper • 2605.17757 • Published May 18 • 65
EvolveMem: Self-Evolving Memory Architecture via AutoResearch for LLM Agents Paper • 2605.13941 • Published May 13 • 24
Rethinking Reasoning-Intensive Retrieval: Evaluating and Advancing Retrievers in Agentic Search Systems Paper • 2605.04018 • Published May 5 • 41