hjkim

hojie11

AI & ML interests

Computer Vision, 3D Vision, Anomaly Detection

Organizations

None yet

Recent Activity

upvoted 5 papers 4 days ago

Qwen3-VL Technical Report

Paper • 2511.21631 • Published 11 days ago • 108

Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation

Paper • 2512.03534 • Published 5 days ago • 18

PixelDiT: Pixel Diffusion Transformers for Image Generation

Paper • 2511.20645 • Published 12 days ago • 25

MultiShotMaster: A Controllable Multi-Shot Video Generation Framework

Paper • 2512.03041 • Published 5 days ago • 57

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published 11 days ago • 96

upvoted a paper 5 days ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published 6 days ago • 172

upvoted 3 papers 6 days ago

DiP: Taming Diffusion Models in Pixel Space

Paper • 2511.18822 • Published 14 days ago • 25

Vision Bridge Transformer at Scale

Paper • 2511.23199 • Published 9 days ago • 41

Infinity-RoPE: Action-Controllable Infinite Video Generation Emerges From Autoregressive Self-Rollout

Paper • 2511.20649 • Published 12 days ago • 43

upvoted 2 papers 17 days ago

SAM 3D: 3Dfy Anything in Images

Paper • 2511.16624 • Published 17 days ago • 106

First Frame Is the Place to Go for Video Content Customization

Paper • 2511.15700 • Published 18 days ago • 52

upvoted a paper 18 days ago

SAM 2: Segment Anything in Images and Videos

Paper • 2408.00714 • Published Aug 1, 2024 • 119

upvoted a paper 19 days ago

A Style is Worth One Code: Unlocking Code-to-Style Image Generation with Discrete Style Space

Paper • 2511.10555 • Published 24 days ago • 60

upvoted a paper 20 days ago

MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation

Paper • 2511.09611 • Published 25 days ago • 68

upvoted 2 papers 21 days ago

One Small Step in Latent, One Giant Leap for Pixels: Fast Latent Upscale Adapter for Your Diffusion Models

Paper • 2511.10629 • Published 24 days ago • 122

Depth Anything 3: Recovering the Visual Space from Any Views

Paper • 2511.10647 • Published 24 days ago • 92

upvoted 2 papers 25 days ago

Time-to-Move: Training-Free Motion Controlled Video Generation via Dual-Clock Denoising

Paper • 2511.08633 • Published 28 days ago • 53

TiDAR: Think in Diffusion, Talk in Autoregression

Paper • 2511.08923 • Published 26 days ago • 111

upvoted a paper 26 days ago

Tiny Model, Big Logic: Diversity-Driven Optimization Elicits Large-Model Reasoning Ability in VibeThinker-1.5B

Paper • 2511.06221 • Published 29 days ago • 128

upvoted a paper 27 days ago

Visual Spatial Tuning

Paper • 2511.05491 • Published about 1 month ago • 49
