Audio Spaces
-
Seamless M4T
949
-
MusicGen
5.07k | Generate music from text descriptions and optional melodies
-
Audioldm Text To Audio Generation
814 | Generate audio from text descriptions
-
AudioLDM2 Text2Audio Text2Music Generation
307 | Generate audio and waveform video from text
-
AudioSep
222
-
Lp Music Caps
170 | Generate captions for music audio
-
Tortoise Tts
315 | Expressive Text-to-Speech
-
All In One
22
-
XTTS
2.77k | Generate speech from text using a reference voice
-
Coqui Bark Voice Cloning
189
-
VALL E X
365 | Generate audio from text using voice prompts
-
WavJourney
193
-
Music To Image
264
-
MMS
277 | Transform and identify speech with MMS
-
ElevenLabs TTS
620 | Generate voice from text using ElevenLabs
-
AudioGPT
289
-
Bark
2.37k | Generate realistic audio from text
-
SpeechT5 Speech Recognition Demo
36
-
CoquiTTS (Official)
173
-
Whisper
2.68k | Transcribe audio files or YouTube videos into text
-
Moe TTS
663 | Generate and convert voice using text and audio inputs
-
YourTTS
17
-
Talking Face Generation with Multilingual TTS
559 | Generate a talking face video from text in multiple languages
-
OpenAI TTS New
561
-
Mustango
167
-
OWSM Demo
55
-
StyleTTS 2
719 | Efficient, fast, and natural text to speech with StyleTTS 2!
-
HierSpeech++ (Zero-shot TTS)
398 | Generate high-quality speech from text using a prompt audio
-
Video2music
21 | Generate music for a video based on its content and key
-
Whisper Large V2
188
-
Musicgen Prompt Upsampling
64 | Generate music from text prompts
-
Seamless M4T v2
517 | Translate speech and text between languages
-
Seamless Streaming
322 | Translate text between languages
-
Matcha TTS
53 | Generate speech from text with speaker selection
-
MusicGen Streaming
284 | Generate music from text prompts in real-time
-
Resemble Enhance
441 | Enhance and denoise your audio files
-
Singing Voice Conversion
261 | Transform your voice into a singer's
-
NaturalSpeech2
52 | Generate speech with cloned timbre
-
Create Your Own TTS Dataset
22
-
Podcast Transcription
-
OpenVoice
1.12k | Generate customized speech from text using a reference audio
-
M2UGen Demo
94
-
Pheme
68
-
ESPnet2 TTS
6 | Convert text to speech in English, Chinese, or Japanese
-
Whisper-WebUI
39 | Generate subtitles and translate audio files
-
Image2SFX Comparison
176 | Generates audio environment from an image
-
WhisperSpeech
379
-
MetaVoice 1B
144 | A demo of MetaVoice 1B, a new TTS model by MetaVoice.
-
TTS Arena V2
927 | Vote on the latest TTS models!
-
Whisper Speech X DreamTalk
176 | Combine voice cloning and portrait lipsync animation
-
Canary 1b
197 | Transcribe and translate audio into text
-
SALMONN Audio Questioning
83 | Deeply interrogate audio file content
-
MeloTTS
471 | Fast, efficient, & multilingual text-to-speech
-
Audio Editing
318 | Edit audios with text prompts
-
ChatMusician
18
-
xVASynth TTS
73 | CPU powered, low RTF, emotional, multilingual TTS
-
NaturalSpeech3 FACodec
179 | Convert and reconstruct speech files
-
Hey Gemma
25
-
Ratchet + Whisper
70 | Convert audio to text
-
AutoSubs
3 | Automatically add on-screen subs to your videos
-
VoiceCraft
161
-
TangoFlux
324 | Text to Audio (Sound SFX) Generator
-
Parler-TTS
842 | High-fidelity Text-To-Speech
-
Sing an idea ➡️ Music
184 | Bring song ideas to life
-
Musicgen Songstarter Demo
75 | Generate music using descriptions and optional melody audio
-
Whisper JAX
145 | Transcribe or translate audio from microphone, file, or YouTube
-
AudioLCM
23 | Generate audio from text
-
Stable Audio Live Multiplayer
160 | Generate audio from text prompts
-
Stable Audio Open Zero
452 | Generate audio from text prompts
-
Make An Audio 3
14 | Generate audio from text prompts
-
Mars5 Space
60
-
Tango Music AF
5 | Text to Music Generator
-
Jam
16 | Generate a song from lyrics and style reference
-
BigVGAN
113 | Generate high-quality audio from input audio
-
SenseVoice
89 | Transcribe audio with emotions and events
-
PicoAudio
28 | Generate audio from text descriptions with timestamps
-
Audio Flamingo Demo
7
-
MusiConGen
29
-
Mms Zeroshot
20 | Transcribe audio in any language using text data
-
GPT SoVITS V2 Pro Plus
220 | Generate speech from text using reference audio
-
EzAudio
275 | Generate and edit audio from text prompts
-
OpenMusic
214 | Generate music from text descriptions
-
Midi Music Generator
558 | Generate MIDI music from prompts
-
Whisper Turbo
1k | Transcribe audio or YouTube videos into text
-
Realtime Whisper Turbo
346 | Realtime implementation of Whisper large turbo
-
Whisper Large V3 Turbo WebGPU
170 | ML-powered speech recognition directly in your browser
-
OpenAudio S1
691 | Generate speech from text
-
TTS Spaces Arena
459 | Blind vote on HF TTS models!
-
Diva Realtime Chat
19 | Generate text responses from audio input
-
F5-TTS
2.78k | F5-TTS & E2-TTS: Zero-Shot Voice Cloning (Unofficial Demo)
-
MaskGCT TTS Demo
260 | MaskGCT TTS Demo
-
MelodyFlow
150 | Generate music from text descriptions
-
Fish Agent
147 | An end-to-end (e2e) Voice Language Model by Fish Audio.
-
Nexa Omni Demo
65 | Generate text from audio input
-
Kokoro TTS
3.15k | Upgraded to v1.0!
-
Make Custom Voices With KokoroTTS
130 | Make Custom Voices With KokoroTTS
-
Llasa 3b Tts
313 | Zero Shot voice cloning with llasa 3b (Unofficial Demo)
-
Llasa 1b Multilingual TTS
12 | Generate speech from text with or without cloning a voice
-
Kokoro Text-to-Speech (WebGPU)
352 | High-quality speech synthesis powered by Kokoro TTS
-
Hibiki Simple
42 | High-Fidelity Simultaneous Speech-To-Speech Translation
-
Zonos
411 | Generate audio from text with customizable emotions and settings
-
Kokoro Web
78 | ML-powered speech synthesis directly in your browser
-
Di♪♪Rhythm
676 | Blazingly Fast and Embarrassingly Simple Song Generation
-
Audiobox Aesthetics
22 | Demo for audiobox-aesthetics
-
Spark TTS
229 | A text-to-speech model powered by SparkAudio and Mobvoi.
-
Sesame CSM
857 | Conversational speech generation
-
Orpheus TTS
243 | Try Orpheus TTS here
-
Canary 1B Flash
43 | Canary 1B Flash demo
-
IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System
216 | Generate speech from text using a reference audio
-
AudioMorphix
6 | Prepare environment and run Gradio app
-
MegaTTS3 Demo
93
-
AudioX
161 | Generate audio from text and video prompts
-
Vevo for Zero-shot VC, TTS, and More
100 | Controllable Zero-Shot Voice Imitation
-
Dia 1.6B
1.75k | Generate realistic dialogue from a script, using Dia!
-
Aero 1 Audio Demo
43 | Demo for Aero-1-Audio
-
Voila Demo
43 | Chat with a voice-clone AI
-
ACE Step
625 | A Step Towards Music Generation Foundation Model
-
Audio Difficulty Estimator
2 | Estimate piano difficulty from audio
-
TIGER Audio Extractor
110 | Extraction & Reconstruction for Efficient Speech Separation
-
Music2emo
17 | Towards Unified Music Emotion Recognition across Dimensional
-
SonicVerse
13 | Generate detailed music descriptions from audio clips
-
Auffusion
43 | Audio Gen, Audio Style Transfer and Audio InPainting
-
Chatterbox TTS
1.69k | Expressive Zeroshot TTS
-
PlayDiffusion
120 | Generate modified audio from text and voice
-
Voice Clone Arena
2 | Vote on the latest Voice Clone TTS models!
-
Conversational WebGPU
231
-
Song Generation
607 | Generate custom songs from lyrics and prompts
-
NotaGen
63 | Generate classical sheet music in ABC notation
-
Audio Flamingo 3 Demo
93 | Audio Flamingo 3 Demo
-
Audio Flamingo 3 Chat
32 | Audio Flamingo 3 demo for multi-turn multi-audio chat
-
MSR UTMOS
6 | Multiple sampling rate MOS prediction with SFI conv
-
Higgs Audio Demo
396 | Higgs Audio Demo
-
sidon_demo_beta
22 | Speech restoration demo of Sidon.
-
Canary 1b V2
69 | Transcribe and Translate in 25 European Languages
-
SonicMaster - Text-Guided Music Restoration & Mastering
26 | Enhance audio quality using text prompts
-
OLMoASR
6 | Open Models and Data for Training Robust Speech Recognition
-
VibeVoice-Large
85 | Generate a podcast audio from a script and voice samples
-
TaDiCodec TTS AR Qwen2.5 0.5B
10 | Generate speech from text with voice cloning
-
EchoX
8 | An end-to-end speech large language model.
-
VoxCPM 0.5B
43 | Generate expressive speech from text with optional voice cloning
-
FireRedTTS2
35 | Long-form multi-speaker dialogue generation
-
FireRedASR
6 | FireRedASR Demo
-
IndexTTS 2 Demo
719 | Generate expressive voice from text using audio reference
-
SongFormer
17 | State-of-the-art music analysis with multi-scale datasets
-
Voice Acting TTS
24 | TTS for any emotion, now with non-verbal sounds!
-
Omnilingual ASR Media Transcription
229 | Transcribe audio or video into text in any language
-
Music Flamingo
111 | Upload music or YouTube videos and ask detailed questions about them
-
Maya1
117 | Demo of our new open source model maya1
-
Supertonic (TTS)
212 | Lightning-Fast, On-Device TTS
-
Dia2 2B
70 | Streaming conversational audio in realtime
-
VibeVoice-Realtime-0.5B
166 | Generate natural-sounding speech from text
-
Count The Notes
1 | Convert audio to MIDI
-
SpeechJudge GRM
1 | Evaluate naturalness of two audio files
-
Chatterbox Turbo Demo
458 | Chatterbox Turbo Demo
-
Soprano TTS
134 | Now with upgraded v1.1 model!
-
Qwen3-TTS Demo
631 | Convert text to speech with custom voices and cloning