ZeroGPU Explorers

community

AI & ML interests

None defined yet.

Recent Activity

Doubiiu authored a paper 7 days ago

DreamVideo-Omni: Omni-Motion Controlled Multi-Subject Video Customization with Latent Identity Reinforcement Learning

Zhengyi authored a paper 25 days ago

NANO3D: A Training-Free Approach for Efficient 3D Editing Without Masks

MaziyarPanahi authored a paper 28 days ago

Arcee Trinity Large Technical Report

authored a paper 2 days ago

SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models

Paper • 2603.16859 • Published 3 days ago • 102

submitted a paper to Daily Papers 2 days ago

SocialOmni: Benchmarking Audio-Visual Social Interactivity in Omni Models

Paper • 2603.16859 • Published 3 days ago • 102

submitted a paper to Daily Papers 11 days ago

Dynamic Model Routing and Cascading for Efficient LLM Inference: A Survey

Paper • 2603.04445 • Published 24 days ago • 4

authored a paper 11 days ago

Dynamic Model Routing and Cascading for Efficient LLM Inference: A Survey

Paper • 2603.04445 • Published 24 days ago • 4

authored a paper 15 days ago

Thoth: Mid-Training Bridges LLMs to Time Series Understanding

Paper • 2603.01042 • Published 19 days ago

authored a paper about 1 month ago

AfriNLLB: Efficient Translation Models for African Languages

Paper • 2602.09373 • Published Feb 10 • 2

authored 2 papers about 1 month ago

Vision-DeepResearch Benchmark: Rethinking Visual and Textual Search for Multimodal Large Language Models

Paper • 2602.02185 • Published Feb 2 • 117

Vision-DeepResearch: Incentivizing DeepResearch Capability in Multimodal Large Language Models

Paper • 2601.22060 • Published Jan 29 • 155

authored 2 papers about 2 months ago

Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models

Paper • 2601.19834 • Published Jan 27 • 25

Audio-Visual World Models: Towards Multisensory Imagination in Sight and Sound

Paper • 2512.00883 • Published Nov 30, 2025

submitted a paper to Daily Papers about 2 months ago

Visual Generation Unlocks Human-Like Reasoning through Multimodal World Models

Paper • 2601.19834 • Published Jan 27 • 25

submitted a paper to Daily Papers about 2 months ago

iFSQ: Improving FSQ for Image Generation with 1 Line of Code

Paper • 2601.17124 • Published Jan 23 • 33

LXT submitted a paper to Daily Papers about 2 months ago

SAMTok: Representing Any Mask with Two Words

Paper • 2601.16093 • Published Jan 22 • 43

authored a paper 2 months ago

UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision

Paper • 2601.03193 • Published Jan 6 • 49

submitted a paper to Daily Papers 2 months ago

UniCorn: Towards Self-Improving Unified Multimodal Models through Self-Generated Supervision

Paper • 2601.03193 • Published Jan 6 • 49

LXT authored 5 papers 3 months ago

DenseWorld-1M: Towards Detailed Dense Grounded Caption in the Real World

Paper • 2506.24102 • Published Jun 30, 2025

One Flight Over the Gap: A Survey from Perspective to Panoramic Vision

Paper • 2509.04444 • Published Sep 4, 2025

VimoRAG: Video-based Retrieval-augmented 3D Motion Generation for Motion Language Models

Paper • 2508.12081 • Published Aug 16, 2025

DiT360: High-Fidelity Panoramic Image Generation via Hybrid Training

Paper • 2510.11712 • Published Oct 13, 2025 • 31

Grasp Any Region: Towards Precise, Contextual Pixel Understanding for Multimodal LLMs

Paper • 2510.18876 • Published Oct 21, 2025 • 37