ScreenSpot-Pro: GUI Grounding for Professional High-Resolution Computer Use Paper • 2504.07981 • Published Apr 4, 2025 • 5
Article Did GPT 5.2 make a breakthrough discovery in theoretical physics? 28 days ago • 61
Article The First Healthcare Robotics Dataset and Foundational Physical AI Models for Healthcare Robotics 3 days ago • 20
Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections Paper • 2603.12180 • Published 7 days ago • 61
NVIDIA Nemotron v3 Collection Open, Production-ready Enterprise Models • 15 items • Updated 3 days ago • 220
SWE-rebench V2: Language-Agnostic SWE Task Collection at Scale Paper • 2602.23866 • Published 20 days ago • 87
Article GGML and llama.cpp join HF to ensure the long-term progress of Local AI +4 28 days ago • 488
TermiGen: High-Fidelity Environment and Robust Trajectory Synthesis for Terminal Agents Paper • 2602.07274 • Published Feb 6 • 208
Article Community Evals: Because we're done trusting black-box leaderboards over the community +5 Feb 4 • 88
Article Open ASR Leaderboard: Trends and Insights with New Multilingual & Long-Form Tracks +2 Nov 21, 2025 • 26
DeepPlanning: Benchmarking Long-Horizon Agentic Planning with Verifiable Constraints Paper • 2601.18137 • Published Jan 26 • 35