k-steering Collection Collecting datasets used for our paper on multi-attribute steering using gradient descent. • 7 items • Updated Nov 3, 2025 • 2
Beyond Length Scaling: Synergizing Breadth and Depth for Generative Reward Models Paper • 2603.01571 • Published Mar 2 • 34
RubricBench: Aligning Model-Generated Rubrics with Human Standards Paper • 2603.01562 • Published Mar 2 • 64
Retrospective Harness Optimization: Improving LLM Agents via Self-Preference over Trajectory Rollouts Paper • 2606.05922 • Published 8 days ago • 52
OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents Paper • 2606.02031 • Published 11 days ago • 20
RUBRIC-ARROW: Alternating Pointwise Rubric Reward Modeling for LLM Post-training in Non-verifiable Domains Paper • 2605.29156 • Published 16 days ago • 14
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories Paper • 2504.08942 • Published Apr 11, 2025 • 29