SPICE: Self-Play In Corpus Environments Improves Reasoning • Paper • 2510.24684 • Published Oct 28 • 15
The Era of Real-World Human Interaction: RL from User Conversations • Paper • 2509.25137 • Published Sep 29 • 18
SoMi-ToM: Evaluating Multi-Perspective Theory of Mind in Embodied Social Interactions • Paper • 2506.23046 • Published Jun 29 • 1
Do Vision-Language Models Have Internal World Models? Towards an Atomic Evaluation • Paper • 2506.21876 • Published Jun 27 • 28
Beyond the Binary: Capturing Diverse Preferences With Reward Regularization • Paper • 2412.03822 • Published Dec 5, 2024
AutoToM: Automated Bayesian Inverse Planning and Model Discovery for Open-ended Theory of Mind • Paper • 2502.15676 • Published Feb 21 • 3
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis • Paper • 2412.19723 • Published Dec 27, 2024 • 87
Neural Amortized Inference for Nested Multi-agent Reasoning • Paper • 2308.11071 • Published Aug 21, 2023 • 3
MMToM-QA: Multimodal Theory of Mind Question Answering • Paper • 2401.08743 • Published Jan 16, 2024 • 1