Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2510.04871

Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27 • 30
DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 142
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published Apr 18 • 138
Learning to Reason under Off-Policy Guidance

Paper • 2504.14945 • Published Apr 21 • 88

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 491

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Paper • 2509.26507 • Published Sep 30 • 535
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 491
Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 266
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models

Paper • 2510.04618 • Published Oct 6 • 123

ATLAS: Adaptive Transfer Scaling Laws for Multilingual Pretraining, Finetuning, and Decoding the Curse of Multilinguality

Paper • 2510.22037 • Published Oct 24 • 19
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 491
The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Paper • 2509.26507 • Published Sep 30 • 535
Scaling Language-Centric Omnimodal Representation Learning

Paper • 2510.11693 • Published Oct 13 • 100

Reasoning with Sampling: Your Base Model is Smarter Than You Think

Paper • 2510.14901 • Published Oct 16 • 47
VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?

Paper • 2505.23359 • Published May 29 • 39
OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation

Paper • 2506.02397 • Published Jun 3 • 35
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30 • 142

Interested paper

DoPE: Denoising Rotary Position Embedding

Paper • 2511.09146 • Published 25 days ago • 92
DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation

Paper • 2511.19365 • Published 13 days ago • 63
Latent Collaboration in Multi-Agent Systems

Paper • 2511.20639 • Published 12 days ago • 111
Video Generation Models Are Good Latent Reward Models

Paper • 2511.21541 • Published 11 days ago • 44

Test Collection

Only used in E2E tests

huggingfacejs/tasks

Preview • Updated 3 days ago • 54.2k • 4
Test Collection

Collection

Only used in E2E tests • 5 items • Updated 30 days ago
Running

1

Push Model From Web

🌎

1

Upload ML models to Hugging Face Hub from your browser
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 491

ArvutiMinisteerium

Llama 3.1

Collection

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 696
DeepSeek-R1

Collection

10 items • Updated 10 days ago • 820
Meta Llama 3

Collection

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 872
Qwen3

Collection

84 items • Updated Aug 6 • 1.47k

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Paper • 2509.26507 • Published Sep 30 • 535
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 491
A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code

Paper • 2508.18106 • Published Aug 25 • 345
Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21 • 256

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 491

Towards General-Purpose Model-Free Reinforcement Learning

Paper • 2501.16142 • Published Jan 27 • 30
DAPO: An Open-Source LLM Reinforcement Learning System at Scale

Paper • 2503.14476 • Published Mar 18 • 142
Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?

Paper • 2504.13837 • Published Apr 18 • 138
Learning to Reason under Off-Policy Guidance

Paper • 2504.14945 • Published Apr 21 • 88

Interested paper

DoPE: Denoising Rotary Position Embedding

Paper • 2511.09146 • Published 25 days ago • 92
DeCo: Frequency-Decoupled Pixel Diffusion for End-to-End Image Generation

Paper • 2511.19365 • Published 13 days ago • 63
Latent Collaboration in Multi-Agent Systems

Paper • 2511.20639 • Published 12 days ago • 111
Video Generation Models Are Good Latent Reward Models

Paper • 2511.21541 • Published 11 days ago • 44

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 491

Test Collection

Only used in E2E tests

huggingfacejs/tasks

Preview • Updated 3 days ago • 54.2k • 4
Test Collection

Collection

Only used in E2E tests • 5 items • Updated 30 days ago
Running

1

Push Model From Web

🌎

1

Upload ML models to Hugging Face Hub from your browser
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 491

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Paper • 2509.26507 • Published Sep 30 • 535
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 491
Agent Learning via Early Experience

Paper • 2510.08558 • Published Oct 9 • 266
Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models

Paper • 2510.04618 • Published Oct 6 • 123

ArvutiMinisteerium

Llama 3.1

Collection

This collection hosts the transformers and original repos of the Llama 3.1, Llama Guard 3 and Prompt Guard models • 11 items • Updated Dec 6, 2024 • 696
DeepSeek-R1

Collection

10 items • Updated 10 days ago • 820
Meta Llama 3

Collection

This collection hosts the transformers and original repos of the Meta Llama 3 and Llama Guard 2 releases • 5 items • Updated Dec 6, 2024 • 872
Qwen3

Collection

84 items • Updated Aug 6 • 1.47k

ATLAS: Adaptive Transfer Scaling Laws for Multilingual Pretraining, Finetuning, and Decoding the Curse of Multilinguality

Paper • 2510.22037 • Published Oct 24 • 19
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 491
The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Paper • 2509.26507 • Published Sep 30 • 535
Scaling Language-Centric Omnimodal Representation Learning

Paper • 2510.11693 • Published Oct 13 • 100

The Dragon Hatchling: The Missing Link between the Transformer and Models of the Brain

Paper • 2509.26507 • Published Sep 30 • 535
Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 491
A.S.E: A Repository-Level Benchmark for Evaluating Security in AI-Generated Code

Paper • 2508.18106 • Published Aug 25 • 345
Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21 • 256

Reasoning with Sampling: Your Base Model is Smarter Than You Think

Paper • 2510.14901 • Published Oct 16 • 47
VideoReasonBench: Can MLLMs Perform Vision-Centric Complex Video Reasoning?

Paper • 2505.23359 • Published May 29 • 39
OThink-R1: Intrinsic Fast/Slow Thinking Mode Switching for Over-Reasoning Mitigation

Paper • 2506.02397 • Published Jun 3 • 35
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models

Paper • 2505.24864 • Published May 30 • 142

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published Oct 6 • 491

Previous
1
2
3
...
9
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs