Evangelinejy/adaptive_difficulty_prediction_deepscaler

Tags: data-efficient-llm-rl, model_hub_mixin, pytorch_model_hub_mixin


This model has been pushed to the Hub using the PyTorchModelHubMixin integration:

Library: https://github.com/ASTRAL-Group/data-efficient-llm-rl
Paper: Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay


Safetensors
Model size: 5.98M params
Tensor type: F32



Paper ID (arXiv): 2506.05316, published Jun 5, 2025