Hongru Wang

Merlin-Hongru

https://hrwise-nlp.github.io/

AI & ML interests

None yet

Recent Activity

updated a model 9 days ago

Merlin-Hongru/webshop-qwen2.5-1.5b-no-sum-latest3

published a model 9 days ago

Merlin-Hongru/webshop-qwen2.5-1.5b-no-sum-latest3

submitted a paper 11 days ago

From Word to World: Can Large Language Models be Implicit Text-based World Models?

upvoted a paper 12 days ago

From Word to World: Can Large Language Models be Implicit Text-based World Models?

Paper • 2512.18832 • Published 15 days ago • 11

upvoted a paper 3 months ago

Explore to Evolve: Scaling Evolved Aggregation Logic via Proactive Online Exploration for Deep Research Agents

Paper • 2510.14438 • Published Oct 16, 2025 • 13

upvoted 4 papers 4 months ago

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2, 2025 • 228

Sample-efficient Integration of New Modalities into Large Language Models

Paper • 2509.04606 • Published Sep 4, 2025 • 8

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

Paper • 2509.02544 • Published Sep 2, 2025 • 124

CODA: Coordinating the Cerebrum and Cerebellum for a Dual-Brain Computer Use Agent with Decoupled Reinforcement Learning

Paper • 2508.20096 • Published Aug 27, 2025 • 36

upvoted 3 papers 5 months ago

The Law of Knowledge Overshadowing: Towards Understanding, Predicting, and Preventing LLM Hallucination

Paper • 2502.16143 • Published Feb 22, 2025 • 6

Cognitive Kernel-Pro: A Framework for Deep Research Agents and Agent Foundation Models Training

Paper • 2508.00414 • Published Aug 1, 2025 • 93

A Survey of Self-Evolving Agents: On Path to Artificial Super Intelligence

Paper • 2507.21046 • Published Jul 28, 2025 • 82

upvoted a paper 6 months ago

Thinking with Images for Multimodal Reasoning: Foundations, Methods, and Future Frontiers

Paper • 2506.23918 • Published Jun 30, 2025 • 89

upvoted 4 papers 7 months ago

MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning

Paper • 2505.24846 • Published May 30, 2025 • 15

ToMAP: Training Opponent-Aware LLM Persuaders with Theory of Mind

Paper • 2505.22961 • Published May 29, 2025 • 8

Alita: Generalist Agent Enabling Scalable Agentic Reasoning with Minimal Predefinition and Maximal Self-Evolution

Paper • 2505.20286 • Published May 26, 2025 • 8

AdaCtrl: Towards Adaptive and Controllable Reasoning via Difficulty-Aware Budgeting

Paper • 2505.18822 • Published May 24, 2025 • 15

upvoted 4 papers 8 months ago

Time-R1: Towards Comprehensive Temporal Reasoning in LLMs

Paper • 2505.13508 • Published May 16, 2025 • 15

Seed1.5-VL Technical Report

Paper • 2505.07062 • Published May 11, 2025 • 154

Optimizing Chain-of-Thought Reasoners via Gradient Variance Minimization in Rejection Sampling and RL

Paper • 2505.02391 • Published May 5, 2025 • 25

RM-R1: Reward Modeling as Reasoning

Paper • 2505.02387 • Published May 5, 2025 • 79

upvoted 2 papers 9 months ago

OTC: Optimal Tool Calls via Reinforcement Learning

Paper • 2504.14870 • Published Apr 21, 2025 • 35

ToolRL: Reward is All Tool Learning Needs

Paper • 2504.13958 • Published Apr 16, 2025 • 48