QuantiPhy: A Quantitative Benchmark Evaluating Physical Reasoning Abilities of Vision-Language Models Paper • 2512.19526 • Published Dec 22, 2025 • 12
MatSpray: Fusing 2D Material World Knowledge on 3D Geometry Paper • 2512.18314 • Published Dec 20, 2025 • 9
Physics of Language Models: Part 4.1, Architecture Design and the Magic of Canon Layers Paper • 2512.17351 • Published Dec 19, 2025 • 28
MobileWorld: Benchmarking Autonomous Mobile Agents in Agent-User Interactive, and MCP-Augmented Environments Paper • 2512.19432 • Published Dec 22, 2025 • 13
Probing Scientific General Intelligence of LLMs with Scientist-Aligned Workflows Paper • 2512.16969 • Published Dec 18, 2025 • 119
LongVideoAgent: Multi-Agent Reasoning with Long Videos Paper • 2512.20618 • Published Dec 23, 2025 • 54