Egbe Chidiebere
EAustino
AI & ML interests
None yet
Organizations
None yet
LLM
- Understanding LLMs: A Comprehensive Overview from Training to Inference
  Paper • 2401.02038 • Published • 65
- DocLLM: A layout-aware generative language model for multimodal document understanding
  Paper • 2401.00908 • Published • 189
- LLaMA Beyond English: An Empirical Study on Language Capability Transfer
  Paper • 2401.01055 • Published • 55
- LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
  Paper • 2401.01325 • Published • 27
LMM
- COSMO: COntrastive Streamlined MultimOdal Model with Interleaved Pre-Training
  Paper • 2401.00849 • Published • 17
- Learning Vision from Models Rivals Learning Vision from Data
  Paper • 2312.17742 • Published • 16
- Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models
  Paper • 2312.17661 • Published • 15
- A Vision Check-up for Language Models
  Paper • 2401.01862 • Published • 11
Speech generation
Open source LLM
Medical AI paper
3D modeling
Video creation
- SIGNeRF: Scene Integrated Generation for Neural Radiance Fields
  Paper • 2401.01647 • Published • 13
- Moonshot: Towards Controllable Video Generation and Editing with Multimodal Conditions
  Paper • 2401.01827 • Published • 18
- VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM
  Paper • 2401.01256 • Published • 21
- TrailBlazer: Trajectory Control for Diffusion-Based Video Generation
  Paper • 2401.00896 • Published • 15
Image generation
- Boundary Attention: Learning to Find Faint Boundaries at Any Resolution
  Paper • 2401.00935 • Published • 18
- Taming Mode Collapse in Score Distillation for Text-to-3D Generation
  Paper • 2401.00909 • Published • 10
- Q-Refine: A Perceptual Quality Refiner for AI-Generated Image
  Paper • 2401.01117 • Published • 10
- En3D: An Enhanced Generative Model for Sculpting 3D Humans from 2D Synthetic Data
  Paper • 2401.01173 • Published • 12
Agents
Robotic agents
Vision Transformer
AI safety
Conversational Avatar (photorealistic)