Automating Safety Enhancement for LLM-based Agents with Synthetic Risk Scenarios • Paper 2505.17735 • Published May 23, 2025
MMMR: Benchmarking Massive Multi-Modal Reasoning Tasks • Paper 2505.16459 • Published May 22, 2025