Improving Token-Based World Models with Parallel Observation Prediction Paper • 2402.05643 • Published Feb 8, 2024 • 1
DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution Paper • 2411.02359 • Published Nov 4, 2024 • 13
Classification Done Right for Vision-Language Pre-Training Paper • 2411.03313 • Published Nov 5, 2024
Towards Generalist Robot Policies: What Matters in Building Vision-Language-Action Models Paper • 2412.14058 • Published Dec 18, 2024 • 1
Image Understanding Makes for A Good Tokenizer for Image Generation Paper • 2411.04406 • Published Nov 7, 2024
$\text{M}^{\text{3}}$: A Modular World Model over Streams of Tokens Paper • 2502.11537 • Published Feb 17
Improving and Benchmarking Offline Reinforcement Learning Algorithms Paper • 2306.00972 • Published Jun 1, 2023
Decoupling Representation and Classifier for Long-Tailed Recognition Paper • 1910.09217 • Published Oct 21, 2019
Trace Anything: Representing Any Video in 4D via Trajectory Fields Paper • 2510.13802 • Published Oct 15 • 30
Depth Anything 3: Recovering the Visual Space from Any Views Paper • 2511.10647 • Published Nov 13 • 95
Manipulation as in Simulation: Enabling Accurate Geometry Perception in Robots Paper • 2509.02530 • Published Sep 2 • 10
GoT-R1: Unleashing Reasoning Capability of MLLM for Visual Generation with Reinforcement Learning Paper • 2505.17022 • Published May 22 • 27
Absolute Zero: Reinforced Self-play Reasoning with Zero Data Paper • 2505.03335 • Published May 6 • 188
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model? Paper • 2504.13837 • Published Apr 18 • 139
Seaweed-7B: Cost-Effective Training of Video Generation Foundation Model Paper • 2504.08685 • Published Apr 11 • 130
Bridging Continuous and Discrete Tokens for Autoregressive Visual Generation Paper • 2503.16430 • Published Mar 20 • 34