Tianheng Cheng

wondervictor

https://github.com/wondervictor

AI & ML interests

Computer vision, visual perception, multimodal models

authored a paper 7 months ago

Seed1.5-VL Technical Report

Paper • 2505.07062 • Published May 11, 2025 • 153

authored a paper 9 months ago

GroundingSuite: Measuring Complex Multi-Granular Pixel Grounding

Paper • 2503.10596 • Published Mar 13, 2025 • 18

authored 3 papers 10 months ago

Knowledge Mining with Scene Text for Fine-Grained Recognition

Paper • 2203.14215 • Published Mar 27, 2022

GaussTR: Foundation Model-Aligned Gaussian Transformer for Self-Supervised 3D Spatial Understanding

Paper • 2412.13193 • Published Dec 17, 2024 • 1

Multimodal Mamba: Decoder-only Multimodal State Space Model via Quadratic to Linear Distillation

Paper • 2502.13145 • Published Feb 18, 2025 • 37

authored a paper about 1 year ago

ControlAR: Controllable Image Generation with Autoregressive Models

Paper • 2410.02705 • Published Oct 3, 2024 • 11

authored 2 papers over 1 year ago

Deep High-Resolution Representation Learning for Visual Recognition

Paper • 1908.07919 • Published Aug 20, 2019 • 2

EVF-SAM: Early Vision-Language Fusion for Text-Prompted Segment Anything Model

Paper • 2406.20076 • Published Jun 28, 2024 • 10

authored a paper almost 2 years ago

YOLO-World: Real-Time Open-Vocabulary Object Detection

Paper • 2401.17270 • Published Jan 30, 2024 • 42