KABI

dongguanting

https://dongguanting.github.io/

AI & ML interests

Reasoning and Alignment for Large Language Models

Recent Activity

upvoted a paper 3 days ago

From Code Foundation Models to Agents and Applications: A Practical Guide to Code Intelligence

upvoted a paper 5 days ago

Latent Collaboration in Multi-Agent Systems

upvoted a paper 11 days ago

DR Tulu: Reinforcement Learning with Evolving Rubrics for Deep Research

Collections 4

Papers 37

arxiv:2510.21618

arxiv:2510.17354

arxiv:2510.14545

arxiv:2509.23285

Models 14

dongguanting/aepo_light

8B • Updated Nov 3 • 3

dongguanting/Qwen2.5-7B-AEPO

Text Generation • 8B • Updated Oct 27 • 18 • 1

dongguanting/Qwen3-8B-AEPO-DeepSearch

Text Generation • 8B • Updated Oct 27 • 9 • 1

dongguanting/Qwen3-14B-AEPO-DeepSearch

Text Generation • 15B • Updated Oct 21 • 8 • 1

dongguanting/Qwen2.5-7B-ARPO

Text Generation • 8B • Updated Aug 19 • 918 • 2

dongguanting/Llama3.1-8B-ARPO

Text Generation • 8B • Updated Aug 12 • 16 • 1

dongguanting/Qwen2.5-3B-ARPO

Text Generation • 3B • Updated Aug 12 • 13 • 3

dongguanting/Qwen3-14B-ARPO-DeepSearch

Text Generation • 15B • Updated Aug 12 • 15 • 5

dongguanting/Qwen3-8B-ARPO-DeepSearch

8B • Updated Jul 29 • 9 • 2

dongguanting/Tool-Star-Qwen-7B

Text Generation • 8B • Updated Jun 30 • 70 • 2

Datasets 11

dongguanting/ARPO-RL-DeepSearch-1K

Viewer • Updated Oct 17 • 1.07k • 112 • 5

dongguanting/ARPO-RL-Reasoning-10K

Viewer • Updated Oct 17 • 10k • 119 • 3

dongguanting/ARPO-SFT-54K

Viewer • Updated Oct 17 • 54.6k • 210 • 14

dongguanting/RAG-Error-Critic-100K

Viewer • Updated Jun 28 • 100k • 49 • 2

dongguanting/Tool-Star-SFT-54K

Viewer • Updated May 29 • 54k • 125 • 10

dongguanting/Multi-Tool-RL-10K

Viewer • Updated May 25 • 10k • 91 • 4

dongguanting/RAG-QA-40K

Viewer • Updated Dec 27, 2024 • 32.8k • 39 • 2

dongguanting/ShareGPT-12K

Viewer • Updated Dec 27, 2024 • 12.9k • 26 • 1

dongguanting/VIF-RAG-QA-110K

Viewer • Updated Dec 27, 2024 • 111k • 36 • 7

dongguanting/DotamathQA

Viewer • Updated Dec 26, 2024 • 574k • 17 • 2
