high-git-star
Packing Input Frame Context in Next-Frame Prediction Models for Video Generation
Paper
•
2504.12626
•
Published
•
51
Paper
•
2505.09388
•
Published
•
321
Qwen-Image Technical Report
Paper
•
2508.02324
•
Published
•
267
Paper
•
2508.10104
•
Published
•
291
InternVL3.5: Advancing Open-Source Multimodal Models in Versatility, Reasoning, and Efficiency
Paper
•
2508.18265
•
Published
•
211
WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent
Paper
•
2508.05748
•
Published
•
141
VibeVoice Technical Report
Paper
•
2508.19205
•
Published
•
139
Mobile-Agent-v3: Foundamental Agents for GUI Automation
Paper
•
2508.15144
•
Published
•
64
Prompt Orchestration Markup Language
Paper
•
2508.13948
•
Published
•
48
WebSailor: Navigating Super-human Reasoning for Web Agent
Paper
•
2507.02592
•
Published
•
123
Easy Dataset: A Unified and Extensible Framework for Synthesizing LLM Fine-Tuning Data from Unstructured Documents
Paper
•
2507.04009
•
Published
•
51
MiniCPM4: Ultra-Efficient LLMs on End Devices
Paper
•
2506.07900
•
Published
•
93
OmniGen2: Exploration to Advanced Multimodal Generation
Paper
•
2506.18871
•
Published
•
78
SpatialLM: Training Large Language Models for Structured Indoor Modeling
Paper
•
2506.07491
•
Published
•
50
InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models
Paper
•
2504.10479
•
Published
•
306
Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning
Paper
•
2504.17192
•
Published
•
120
Skywork R1V: Pioneering Multimodal Reasoning with Chain-of-Thought
Paper
•
2504.05599
•
Published
•
85
Qwen2.5-Omni Technical Report
Paper
•
2503.20215
•
Published
•
168
YuE: Scaling Open Foundation Models for Long-Form Music Generation
Paper
•
2503.08638
•
Published
•
71
VisualPRM: An Effective Process Reward Model for Multimodal Reasoning
Paper
•
2503.10291
•
Published
•
36
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper
•
2503.09516
•
Published
•
36
SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild
Paper
•
2503.18892
•
Published
•
31
REINFORCE++: A Simple and Efficient Approach for Aligning Large Language Models
Paper
•
2501.03262
•
Published
•
103
UI-TARS: Pioneering Automated GUI Interaction with Native Agents
Paper
•
2501.12326
•
Published
•
64