Collections
Collections including paper arxiv:2507.14683
-
WorldVLA: Towards Autoregressive Action World Model
Paper • 2506.21539 • Published • 40 -
Fast and Simplex: 2-Simplicial Attention in Triton
Paper • 2507.02754 • Published • 26 -
IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction
Paper • 2507.02025 • Published • 35 -
Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact
Paper • 2507.00951 • Published • 24
-
Magistral
Paper • 2506.10910 • Published • 65 -
Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute
Paper • 2506.15882 • Published • 2 -
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization
Paper • 2507.14683 • Published • 134 -
The Invisible Leash: Why RLVR May Not Escape Its Origin
Paper • 2507.14843 • Published • 85
-
microsoft/bitnet-b1.58-2B-4T
Text Generation • 0.8B • Updated • 7.7k • 1.22k -
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
Paper • 2504.10449 • Published • 15 -
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct
Text Generation • 8B • Updated • 140 • 15 -
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
Paper • 2504.11536 • Published • 63
-
SSRL: Self-Search Reinforcement Learning
Paper • 2508.10874 • Published • 97 -
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens
Paper • 2508.01191 • Published • 238 -
Thinking with Nothinking Calibration: A New In-Context Learning Paradigm in Reasoning Large Language Models
Paper • 2508.03363 • Published • 1 -
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization
Paper • 2507.14683 • Published • 134
-
DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents
Paper • 2506.11763 • Published • 72 -
A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications
Paper • 2506.12594 • Published • 2 -
Towards an AI co-scientist
Paper • 2502.18864 • Published • 51 -
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization
Paper • 2507.14683 • Published • 134
-
GR00T N1: An Open Foundation Model for Generalist Humanoid Robots
Paper • 2503.14734 • Published • 5 -
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation
Paper • 2401.02117 • Published • 33 -
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics
Paper • 2506.01844 • Published • 143 -
Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding
Paper • 2506.16035 • Published • 88
-
R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization
Paper • 2503.12937 • Published • 30 -
Reinforcement Pre-Training
Paper • 2506.08007 • Published • 262 -
Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs
Paper • 2507.07996 • Published • 34 -
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities
Paper • 2507.13158 • Published • 23