Audiovisual - a melsiddieg Collection

Audiovisual

updated Oct 22, 2025

microsoft/VibeVoice-1.5B

Text-to-Speech • 3B • Updated Sep 1, 2025 • 605k downloads • 2.13k likes

ibm-granite/granite-docling-258M

Image-Text-to-Text • 0.3B • Updated Sep 23, 2025 • 195k downloads • 1.07k likes

deepseek-ai/DeepSeek-OCR

Image-Text-to-Text • 3B • Updated Nov 4, 2025 • 3.47M downloads • 3.03k likes

Qwen/Qwen3-VL-2B-Thinking

Image-Text-to-Text • 2B • Updated Oct 20, 2025 • 36.6k downloads • 97 likes

datalab-to/chandra

Image-to-Text • 9B • Updated Oct 21, 2025 • 155k downloads • 445 likes

Qwen/Qwen3-VL-2B-Instruct

Image-Text-to-Text • 2B • Updated Oct 23, 2025 • 1.22M downloads • 251 likes

PokeeAI/pokee_research_7b

Text Generation • 8B • Updated Oct 23, 2025 • 306 downloads • 100 likes