melsiddieg's Collections

Audiovisual

updated Oct 22, 2025

  • microsoft/VibeVoice-1.5B

    Text-to-Speech • 3B • Updated Sep 1, 2025 • 605k • 2.13k

  • ibm-granite/granite-docling-258M

    Image-Text-to-Text • 0.3B • Updated Sep 23, 2025 • 195k • 1.07k

  • deepseek-ai/DeepSeek-OCR

    Image-Text-to-Text • 3B • Updated Nov 4, 2025 • 3.47M • 3.03k

  • Qwen/Qwen3-VL-2B-Thinking

    Image-Text-to-Text • 2B • Updated Oct 20, 2025 • 36.6k • 97

  • datalab-to/chandra

    Image-to-Text • 9B • Updated Oct 21, 2025 • 155k • 445

  • Qwen/Qwen3-VL-2B-Instruct

    Image-Text-to-Text • 2B • Updated Oct 23, 2025 • 1.22M • 251

  • PokeeAI/pokee_research_7b

    Text Generation • 8B • Updated Oct 23, 2025 • 306 • 100