hjkim

hojie11

AI & ML interests

Computer Vision, 3D Vision, Anomaly Detection

Organizations

None yet

upvoted 6 papers 1 day ago

NeuralRemaster: Phase-Preserving Diffusion for Structure-Aligned Generation

Paper • 2512.05106 • Published 5 days ago • 13

Reward Forcing: Efficient Streaming Video Generation with Rewarded Distribution Matching Distillation

Paper • 2512.04678 • Published 5 days ago • 38

DynamicVerse: A Physically-Aware Multimodal Framework for 4D World Modeling

Paper • 2512.03000 • Published 7 days ago • 33

DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle

Paper • 2512.04324 • Published 6 days ago • 142

Live Avatar: Streaming Real-time Audio-Driven Avatar Generation with Infinite Length

Paper • 2512.04677 • Published 5 days ago • 157

ProPhy: Progressive Physical Alignment for Dynamic World Simulation

Paper • 2512.05564 • Published 4 days ago • 3

upvoted a paper 5 days ago

Qwen3-VL Technical Report

Paper • 2511.21631 • Published 13 days ago • 116

upvoted 4 papers 6 days ago

Rethinking Prompt Design for Inference-time Scaling in Text-to-Visual Generation

Paper • 2512.03534 • Published 6 days ago • 18

PixelDiT: Pixel Diffusion Transformers for Image Generation

Paper • 2511.20645 • Published 14 days ago • 25

MultiShotMaster: A Controllable Multi-Shot Video Generation Framework

Paper • 2512.03041 • Published 7 days ago • 59

ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Paper • 2511.21689 • Published 13 days ago • 97

upvoted 4 papers 7 days ago

DeepSeek-V3.2: Pushing the Frontier of Open Large Language Models

Paper • 2512.02556 • Published 7 days ago • 186

DiP: Taming Diffusion Models in Pixel Space

Paper • 2511.18822 • Published 15 days ago • 25

Vision Bridge Transformer at Scale

Paper • 2511.23199 • Published 11 days ago • 43

Infinity-RoPE: Action-Controllable Infinite Video Generation Emerges From Autoregressive Self-Rollout

Paper • 2511.20649 • Published 14 days ago • 44

upvoted 2 papers 18 days ago

SAM 3D: 3Dfy Anything in Images

Paper • 2511.16624 • Published 19 days ago • 106

First Frame Is the Place to Go for Video Content Customization

Paper • 2511.15700 • Published 20 days ago • 52

upvoted 2 papers 20 days ago

SAM 2: Segment Anything in Images and Videos

Paper • 2408.00714 • Published Aug 1, 2024 • 119

A Style is Worth One Code: Unlocking Code-to-Style Image Generation with Discrete Style Space

Paper • 2511.10555 • Published 26 days ago • 60

upvoted a paper 21 days ago

MMaDA-Parallel: Multimodal Large Diffusion Language Models for Thinking-Aware Editing and Generation

Paper • 2511.09611 • Published 27 days ago • 68