AI & ML interests

The Hugging Face Inference Endpoints Images repository lets AI builders collaborate on building and sharing inference deployments.

Recent Activity

alvarobartt 
posted an update 7 days ago
Learn how to deploy Microsoft Research VibeVoice ASR on Microsoft Azure Foundry with Hugging Face to generate rich audio transcriptions with Who, When, and What! 💥

> 🕒 60-minute single-pass processing, no chunking or stitching
> 👤 Customized hotwords to guide recognition on domain-specific content
> 📝 Rich transcription: joint ASR + diarization + timestamping in one pass
> 🌍 50+ languages with automatic detection and code-switching support
> 🤗 Deployed on Microsoft Foundry via an OpenAI-compatible Chat Completions API

https://huggingface.co/docs/microsoft-azure/foundry/examples/deploy-vibevoice-asr
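Since the deployment exposes an OpenAI-compatible Chat Completions API, a request can be built as an ordinary chat payload carrying base64-encoded audio. The sketch below is illustrative only: the field names (`"type": "audio"`, the hotword text) and the model name are assumptions, not the exact VibeVoice ASR schema, which is documented at the link above.

```python
import base64
import json

def build_transcription_request(audio_bytes: bytes, model: str, hotwords=None) -> dict:
    """Build an OpenAI-style Chat Completions payload carrying audio.

    NOTE: the content-part field names and hotword convention here are
    illustrative assumptions, not the exact VibeVoice ASR API schema.
    """
    content = [
        # Audio is sent inline as base64 so the payload is plain JSON.
        {"type": "audio", "audio": base64.b64encode(audio_bytes).decode("ascii")},
        {"type": "text", "text": "Transcribe with speaker labels and timestamps."},
    ]
    if hotwords:
        # Hotwords bias recognition toward domain-specific terms.
        content.append({"type": "text", "text": "Hotwords: " + ", ".join(hotwords)})
    return {"model": model, "messages": [{"role": "user", "content": content}]}

payload = build_transcription_request(b"\x00\x01", "vibevoice-asr", hotwords=["Foundry"])
print(json.dumps(payload, indent=2))
```

The same payload could then be POSTed to the endpoint's `/chat/completions` route with any HTTP or OpenAI-compatible client.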
qgallouedec 
posted an update 22 days ago
@CohereLabs just released 🌿 Tiny Aya: a fully open-source 3B parameter model that speaks 70+ languages 🌍! But there’s a catch:

Tiny Aya is just a language model. It doesn’t support tool calling, the key capability that turns frontier models into powerful *agents*.
So the real question is:

How hard is it to turn Tiny Aya into an agent?

Turns out… it’s simple, thanks to Hugging Face TRL.
We’re sharing a hands-on example showing how to train Tiny Aya to turn it into a tool-calling agent using TRL, unlocking what could become the first *massively multilingual open agent*.

Small model. Global reach. Agent capabilities.

👉 https://github.com/huggingface/trl/blob/main/examples/notebooks/sft_tool_calling.ipynb
AdinaY 
posted an update 28 days ago
MiniMax M2.5 is now available on the hub 🚀

MiniMaxAI/MiniMax-M2.5

✨ 229B - Modified MIT license
✨ 37% faster than M2.1
✨ ~$1/hour at 100 TPS
AdinaY 
posted an update 29 days ago
Game on 🎮🚀

While Seedance 2.0’s videos are all over the timeline, DeepSeek quietly pushed a new model update in its app.

GLM-5 from Z.ai adds more momentum.

Ming-flash-omni from Ant Group, MiniCPM-SALA from OpenBMB, and the upcoming MiniMax M2.5 keep the heat on 🔥

Spring Festival is around the corner,
no one’s sleeping!

✨ More releases coming, stay tuned
https://huggingface.co/collections/zh-ai-community/2026-february-china-open-source-highlights
AdinaY 
posted an update 30 days ago
Ming-flash-omni 2.0 🚀 New open omni-MLLM released by Ant Group

inclusionAI/Ming-flash-omni-2.0

✨ MIT license
✨ MoE - 100B/6B active
✨ Zero-shot voice cloning + controllable audio
✨ Fine-grained visual knowledge grounding
AdinaY 
posted an update about 1 month ago
LLaDA 2.1 is out 🔥 A new series of MoE diffusion language models released by Ant Group

inclusionAI/LLaDA2.1-mini
inclusionAI/LLaDA2.1-flash

✨ LLaDA2.1-mini: 16B - Apache 2.0
✨ LLaDA2.1-flash: 100B - Apache 2.0
✨ Both deliver editable generation, RL-trained diffusion reasoning, and fast inference
AdinaY 
posted an update about 1 month ago
AI for science is moving fast🚀

Intern-S1-Pro 🔬 a MoE multimodal scientific reasoning model from Shanghai AI Lab

internlm/Intern-S1-Pro

✨ 1T total / 22B active
✨ Apache 2.0
✨ SoTA scientific reasoning performance
✨ FoPE enables scalable modeling of long physical time series (10⁰–10⁶)
AdinaY 
posted an update about 1 month ago
✨ China’s open source AI ecosystem has entered a new phase

https://huggingface.co/blog/huggingface/one-year-since-the-deepseek-moment-blog-3

One year after the “DeepSeek Moment,” open source has become the default. Models, research, infrastructure, and deployment are increasingly shared to support large-scale, system-level integration.

This final blog examines how leading Chinese AI organizations are evolving, and what this implies for the future of open source.
AdinaY 
posted an update about 1 month ago
GLM just entered the OCR field 🔥

zai-org/GLM-OCR

✨ 0.9B
✨ MIT licensed
✨ Multimodal GLM-V architecture
✨ #1 on OmniDocBench v1.5 (94.62)
AdinaY 
posted an update about 1 month ago
What a week 🤯

Following DeepSeek, Kimi, Qwen, Baidu, and Ant Group, Unitree Robotics has now released a VLA model on the hub too!

unitreerobotics/UnifoLM-VLA-Base
alvarobartt 
posted an update about 1 month ago
💥 hf-mem v0.4.1 now also estimates KV cache memory requirements for any context length and batch size with the --experimental flag!

uvx hf-mem --model-id ... --experimental will automatically pull the required information from the Hugging Face Hub to include the KV cache estimation, when applicable.

💡 Alternatively, you can also set the --max-model-len, --batch-size and --kv-cache-dtype arguments (à la vLLM) manually if preferred.
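The three flags map directly onto the standard back-of-the-envelope KV cache formula: two tensors (K and V) per layer, each sized by KV heads, head dimension, context length, batch size, and dtype width. The sketch below is an assumption about the arithmetic involved, not hf-mem's actual implementation:

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   max_model_len: int, batch_size: int, dtype_bytes: int = 2) -> int:
    # 2 tensors (K and V) per layer, each of shape
    # [batch, kv_heads, seq_len, head_dim], times bytes per element.
    return 2 * num_layers * num_kv_heads * head_dim * max_model_len * batch_size * dtype_bytes

# Example: a Llama-3-8B-like config (32 layers, 8 KV heads, head_dim 128)
# at 8k context, batch size 1, fp16 KV cache (2 bytes/element):
gib = kv_cache_bytes(32, 8, 128, 8192, 1) / 2**30
print(f"{gib:.2f} GiB")  # → 1.00 GiB
```

Note how the result scales linearly with both `--max-model-len` and `--batch-size`, which is why those flags matter for sizing deployments.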
AdinaY 
posted an update about 1 month ago
LongCat-Flash-Lite 🔥 A non-thinking MoE model released by the Meituan LongCat team.

meituan-longcat/LongCat-Flash-Lite

✨ Total 68.5B / 3B active - MIT license
✨ 256k context
✨ Faster inference with N-gram embeddings
AdinaY 
posted an update about 1 month ago
Ant Group is going big on robotics 🤖

They just dropped their first VLA and depth perception foundation models on the Hugging Face Hub.

✨ LingBot-VLA:
- Trained on 20k hours of real-world robot data
- 9 robot embodiments
- Clear no-saturation scaling laws
- Apache 2.0

Model: https://huggingface.co/collections/robbyant/lingbot-vla
Paper: A Pragmatic VLA Foundation Model (2601.18692)

✨ LingBot-Depth:
- Metric-accurate 3D from noisy, incomplete depth
- Masked Depth Modeling (self-supervised)
- RGB–depth alignment, works with <5% sparse depth
- Apache 2.0

Model: https://huggingface.co/collections/robbyant/lingbot-depth
Paper: Masked Depth Modeling for Spatial Perception (2601.17895)
AdinaY 
posted an update about 1 month ago
Kimi K2.5 from Moonshot AI is more than just another large model 🤯

https://huggingface.co/collections/moonshotai/kimi-k25

✨ Native multimodality: image + video + language + agents 💥
✨ 1T MoE / 32B active
✨ 256K context
✨ Modified MIT license
✨ Agent Swarm execution
✨ Open weights + open infra mindset