Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2507.14683

MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19 • 134

Streaming 4D Visual Geometry Transformer

Paper • 2507.11539 • Published Jul 15 • 14
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19 • 134

WorldVLA: Towards Autoregressive Action World Model

Paper • 2506.21539 • Published Jun 26 • 40
Fast and Simplex: 2-Simplicial Attention in Triton

Paper • 2507.02754 • Published Jul 3 • 26
IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction

Paper • 2507.02025 • Published Jul 2 • 35
Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact

Paper • 2507.00951 • Published Jul 1 • 24

Magistral

Paper • 2506.10910 • Published Jun 12 • 65
Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute

Paper • 2506.15882 • Published Jun 18 • 2
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19 • 134
The Invisible Leash: Why RLVR May Not Escape Its Origin

Paper • 2507.14843 • Published Jul 20 • 85

microsoft/bitnet-b1.58-2B-4T

Text Generation • 0.8B • Updated May 1 • 7.7k • 1.22k
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

Paper • 2504.10449 • Published Apr 14 • 15
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct

Text Generation • 8B • Updated Apr 17 • 140 • 15
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

Paper • 2504.11536 • Published Apr 15 • 63

SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published Aug 14 • 97
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens

Paper • 2508.01191 • Published Aug 2 • 238
Thinking with Nothinking Calibration: A New In-Context Learning Paradigm in Reasoning Large Language Models

Paper • 2508.03363 • Published Aug 5 • 1
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19 • 134

A fully open-source series of mathematical reasoning language models

miromind-ai/MiroMind-M1-SFT-719K

Viewer • Updated Jul 22 • 719k • 563 • 14
miromind-ai/MiroMind-M1-RL-62K

Viewer • Updated Jul 22 • 62.1k • 484 • 11
miromind-ai/MiroMind-M1-RL-32B

Text Generation • 33B • Updated Aug 11 • 14 • 7
miromind-ai/MiroMind-M1-RL-7B

8B • Updated Jul 22 • 4 • 13

DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents

Paper • 2506.11763 • Published Jun 13 • 72
A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications

Paper • 2506.12594 • Published Jun 14 • 2
Towards an AI co-scientist

Paper • 2502.18864 • Published Feb 26 • 51
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19 • 134

GR00T N1: An Open Foundation Model for Generalist Humanoid Robots

Paper • 2503.14734 • Published Mar 18 • 5
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation

Paper • 2401.02117 • Published Jan 4, 2024 • 33
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

Paper • 2506.01844 • Published Jun 2 • 143
Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding

Paper • 2506.16035 • Published Jun 19 • 88

R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

Paper • 2503.12937 • Published Mar 17 • 30
Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9 • 262
Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs

Paper • 2507.07996 • Published Jul 10 • 34
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities

Paper • 2507.13158 • Published Jul 17 • 23

MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19 • 134

SSRL: Self-Search Reinforcement Learning

Paper • 2508.10874 • Published Aug 14 • 97
Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens

Paper • 2508.01191 • Published Aug 2 • 238
Thinking with Nothinking Calibration: A New In-Context Learning Paradigm in Reasoning Large Language Models

Paper • 2508.03363 • Published Aug 5 • 1
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19 • 134

Streaming 4D Visual Geometry Transformer

Paper • 2507.11539 • Published Jul 15 • 14
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19 • 134

A fully open-source series of mathematical reasoning language models

miromind-ai/MiroMind-M1-SFT-719K

Viewer • Updated Jul 22 • 719k • 563 • 14
miromind-ai/MiroMind-M1-RL-62K

Viewer • Updated Jul 22 • 62.1k • 484 • 11
miromind-ai/MiroMind-M1-RL-32B

Text Generation • 33B • Updated Aug 11 • 14 • 7
miromind-ai/MiroMind-M1-RL-7B

8B • Updated Jul 22 • 4 • 13

WorldVLA: Towards Autoregressive Action World Model

Paper • 2506.21539 • Published Jun 26 • 40
Fast and Simplex: 2-Simplicial Attention in Triton

Paper • 2507.02754 • Published Jul 3 • 26
IntFold: A Controllable Foundation Model for General and Specialized Biomolecular Structure Prediction

Paper • 2507.02025 • Published Jul 2 • 35
Thinking Beyond Tokens: From Brain-Inspired Intelligence to Cognitive Foundations for Artificial General Intelligence and its Societal Impact

Paper • 2507.00951 • Published Jul 1 • 24

DeepResearch Bench: A Comprehensive Benchmark for Deep Research Agents

Paper • 2506.11763 • Published Jun 13 • 72
A Comprehensive Survey of Deep Research: Systems, Methodologies, and Applications

Paper • 2506.12594 • Published Jun 14 • 2
Towards an AI co-scientist

Paper • 2502.18864 • Published Feb 26 • 51
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19 • 134

Magistral

Paper • 2506.10910 • Published Jun 12 • 65
Fractional Reasoning via Latent Steering Vectors Improves Inference Time Compute

Paper • 2506.15882 • Published Jun 18 • 2
MiroMind-M1: An Open-Source Advancement in Mathematical Reasoning via Context-Aware Multi-Stage Policy Optimization

Paper • 2507.14683 • Published Jul 19 • 134
The Invisible Leash: Why RLVR May Not Escape Its Origin

Paper • 2507.14843 • Published Jul 20 • 85

GR00T N1: An Open Foundation Model for Generalist Humanoid Robots

Paper • 2503.14734 • Published Mar 18 • 5
Mobile ALOHA: Learning Bimanual Mobile Manipulation with Low-Cost Whole-Body Teleoperation

Paper • 2401.02117 • Published Jan 4, 2024 • 33
SmolVLA: A Vision-Language-Action Model for Affordable and Efficient Robotics

Paper • 2506.01844 • Published Jun 2 • 143
Vision-Guided Chunking Is All You Need: Enhancing RAG with Multimodal Document Understanding

Paper • 2506.16035 • Published Jun 19 • 88

microsoft/bitnet-b1.58-2B-4T

Text Generation • 0.8B • Updated May 1 • 7.7k • 1.22k
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models

Paper • 2504.10449 • Published Apr 14 • 15
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct

Text Generation • 8B • Updated Apr 17 • 140 • 15
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs

Paper • 2504.11536 • Published Apr 15 • 63

R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

Paper • 2503.12937 • Published Mar 17 • 30
Reinforcement Pre-Training

Paper • 2506.08007 • Published Jun 9 • 262
Skip a Layer or Loop it? Test-Time Depth Adaptation of Pretrained LLMs

Paper • 2507.07996 • Published Jul 10 • 34
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities

Paper • 2507.13158 • Published Jul 17 • 23

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs