Zikun Li

zikun-li

AI & ML interests

None yet

Organizations

None yet

upvoted 3 papers 1 day ago

Beyond Static Tools: Test-Time Tool Evolution for Scientific Reasoning

Paper • 2601.07641 • Published 7 days ago • 42

Collaborative Multi-Agent Test-Time Reinforcement Learning for Reasoning

Paper • 2601.09667 • Published 5 days ago • 75

Rewarding the Rare: Uniqueness-Aware RL for Creative Problem Solving in LLMs

Paper • 2601.08763 • Published 6 days ago • 128

upvoted 2 papers 3 days ago

ArenaRL: Scaling RL for Open-Ended Agents via Tournament-based Relative Ranking

Paper • 2601.06487 • Published 9 days ago • 48

Distribution-Aligned Sequence Distillation for Superior Long-CoT Reasoning

Paper • 2601.09088 • Published 5 days ago • 54

upvoted 6 papers 5 days ago

OpenTinker: Separating Concerns in Agentic Reinforcement Learning

Paper • 2601.07376 • Published 7 days ago • 5

Dr. Zero: Self-Evolving Search Agents without Training Data

Paper • 2601.07055 • Published 7 days ago • 15

GlimpRouter: Efficient Collaborative Inference by Glimpsing One Token of Thoughts

Paper • 2601.05110 • Published 11 days ago • 27

MegaFlow: Large-Scale Distributed Orchestration System for the Agentic Era

Paper • 2601.07526 • Published 7 days ago • 19

BabyVision: Visual Reasoning Beyond Language

Paper • 2601.06521 • Published 9 days ago • 184

PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

Paper • 2601.05593 • Published 10 days ago • 77

upvoted a paper 9 days ago

GDPO: Group reward-Decoupled Normalization Policy Optimization for Multi-reward RL Optimization

Paper • 2601.05242 • Published 11 days ago • 194

upvoted a paper 11 days ago

Entropy-Adaptive Fine-Tuning: Resolving Confident Conflicts to Mitigate Forgetting

Paper • 2601.02151 • Published 14 days ago • 98

upvoted 2 papers 13 days ago

NextFlow: Unified Sequential Modeling Activates Multimodal Understanding and Generation

Paper • 2601.02204 • Published 14 days ago • 56

VAR RL Done Right: Tackling Asynchronous Policy Conflicts in Visual Autoregressive Generation

Paper • 2601.02256 • Published 14 days ago • 32

upvoted 5 papers 3 months ago

The Art of Scaling Reinforcement Learning Compute for LLMs

Paper • 2510.13786 • Published Oct 15, 2025 • 31

Every Attention Matters: An Efficient Hybrid Architecture for Long-Context Reasoning

Paper • 2510.19338 • Published Oct 22, 2025 • 114

Efficient Long-context Language Model Training by Core Attention Disaggregation

Paper • 2510.18121 • Published Oct 20, 2025 • 122

Seedream 4.0: Toward Next-generation Multimodal Image Generation

Paper • 2509.20427 • Published Sep 24, 2025 • 82

Self-Forcing++: Towards Minute-Scale High-Quality Video Generation

Paper • 2510.02283 • Published Oct 2, 2025 • 96