Parallel Sentences Datasets Collection These datasets all have "english" and "non_english" columns. They can be used to make embedding models multilingual. • 14 items • Updated Feb 25 • 20
Ministral 3 Collection A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated 4 days ago • 112
Article Transformers v5: Simple model definitions powering the AI ecosystem • 6 days ago • 223
Multimodal Implementations Collection Comprehensive Demo of Multimodal VLMs on the Hub • 20 items • Updated about 3 hours ago • 8
Article Open ASR Leaderboard: Trends and Insights with New Multilingual & Long-Form Tracks • 16 days ago • 19
Open ASR Leaderboard: Towards Reproducible and Transparent Multilingual and Long-Form Speech Recognition Evaluation Paper • 2510.06961 • Published Oct 8 • 9
Article Introducing AnyLanguageModel: One API for Local and Remote LLMs on Apple Platforms • 17 days ago • 29
Meta CLIP 1 Collection Scaling CLIP data with transparent training distribution from an end-to-end pipeline. • 7 items • Updated 12 days ago • 21
Article Introducing AuraFace: Open-Source Face Recognition and Identity Preservation Models • Aug 26, 2024 • 52
OlmoEarth Collection OlmoEarth pre-trained and fine-tuned foundation models for remote sensing • 10 items • Updated 7 days ago • 14