luokai

iamluokai

AI & ML interests

None yet

Recent Activity

liked a model 12 days ago

AIImageStudio/ReversalFilmGravure_z_Image_turbo

liked a model 14 days ago

Tongyi-MAI/Z-Image-Turbo

upvoted a paper about 22 hours ago

WonderZoom: Multi-Scale 3D World Generation

Paper • 2512.09164 • Published 4 days ago • 10

upvoted a paper about 1 month ago

BindWeave: Subject-Consistent Video Generation via Cross-Modal Integration

Paper • 2510.00438 • Published Oct 1 • 8

upvoted a collection 3 months ago

MobileCLIP2

MobileCLIP2: Mobile-friendly image-text models with SOTA zero-shot capabilities trained on DFNDR-2B • 37 items • Updated Sep 18 • 56

upvoted a collection 4 months ago

FastVLM

Efficient Vision Encoding for Vision Language Models • 9 items • Updated Sep 2 • 105

upvoted 3 papers 4 months ago

ToonComposer: Streamlining Cartoon Production with Generative Post-Keyframing

Paper • 2508.10881 • Published Aug 14 • 52

Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens

Paper • 2508.01191 • Published Aug 2 • 238

EarthCrafter: Scalable 3D Earth Generation via Dual-Sparse Latent Diffusion

Paper • 2507.16535 • Published Jul 22 • 20

upvoted a collection 5 months ago

Seed-X

A powerful open-source multilingual translation language model series, including instruction and reasoning models. • 8 items • Updated Aug 22 • 65

upvoted a paper 5 months ago

RoboBrain 2.0 Technical Report

Paper • 2507.02029 • Published Jul 2 • 33

upvoted a paper 6 months ago

XVerse: Consistent Multi-Subject Control of Identity and Semantic Attributes via DiT Modulation

Paper • 2506.21416 • Published Jun 26 • 28

upvoted a collection 6 months ago

ERNIE 4.5

Collection of ERNIE 4.5 models. • 27 items • Updated Nov 11 • 180

upvoted an article 6 months ago

🤔👀🎬🖥️📖 Kimi-VL-A3B-Thinking-2506: A Quick Navigation

Article • Jun 21 • 74

upvoted a collection 6 months ago

MedGemma Release

Collection of Gemma 3 variants for performance on medical text and image comprehension to accelerate building healthcare-based AI applications. • 7 items • Updated Jul 11 • 358

upvoted a collection 7 months ago

Qwen2.5-Omni

End-to-End Omni (text, audio, image, video, and natural speech interaction) model based on Qwen2.5 • 7 items • Updated Jul 21 • 160

upvoted 2 collections 8 months ago

Qwen3

84 items • Updated Aug 6 • 1.49k

InternVL3

34 items • Updated Sep 28 • 83

upvoted a paper 8 months ago

SkyReels-A2: Compose Anything in Video Diffusion Transformers

Paper • 2504.02436 • Published Apr 3 • 39

upvoted a paper 9 months ago

ReCamMaster: Camera-Controlled Generative Rendering from A Single Video

Paper • 2503.11647 • Published Mar 14 • 145

upvoted a collection 9 months ago

Wan2.1 14B 480p I2V LoRAs

A collection of Remade's Wan2.1 14B 480p I2V LoRAs • 49 items • Updated May 24 • 208

upvoted a collection 10 months ago

olmOCR

olmOCR is a document recognition pipeline for efficiently converting documents into plain text. olmocr.allenai.org • 12 items • Updated 4 days ago • 140