sashaboguraev/pythia-1b-ppt-control_nca_steps250_1b-seed324 • Text Generation • 1B • Updated 11 days ago • 27 • 1
How LoRA Remembers? A Parametric Memory Law for LLM Finetuning • Paper • 2605.30260 • Published 18 days ago • 42
OR-Space: A Full-Lifecycle Workspace Benchmark for Industrial Optimization Agents • Paper • 2605.28158 • Published 19 days ago • 6
DelTA: Discriminative Token Credit Assignment for Reinforcement Learning from Verifiable Rewards • Paper • 2605.21467 • Published 26 days ago • 204
Omni-DuplexEval: Evaluating Real-time Duplex Omni-modal Interaction • Paper • 2605.17360 • Published 29 days ago • 4
kairawal/Gemma-3-1B-IT-EL-SynthDolly-r16alpha128-E5-S73 • Text Generation • 1.0B • Updated 24 days ago • 34 • 1
Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information • Paper • 2605.11609 • Published May 12 • 195
IndusAgent: Reinforcing Open-Vocabulary Industrial Anomaly Detection with Agentic Tools • Paper • 2605.20682 • Published 26 days ago • 83
CiteVQA: Benchmarking Evidence Attribution for Trustworthy Document Intelligence • Paper • 2605.12882 • Published May 13 • 271
HERMES++: Toward a Unified Driving World Model for 3D Scene Understanding and Generation • Paper • 2604.28196 • Published Apr 30 • 72
Online Self-Calibration Against Hallucination in Vision-Language Models • Paper • 2605.00323 • Published May 1 • 3
RationalRewards: Reasoning Rewards Scale Visual Generation Both Training and Test Time • Paper • 2604.11626 • Published Apr 13 • 102