Audio Spaces

hysts's Collections

updated 1 day ago

Runtime error

Agents

Featured

71

Whisper vs Distil-Whisper

📈

Runtime error

Agents

Featured

949

Seamless M4T

📞

Running on A10G

Agents

Featured

5.07k

MusicGen

🎵

Generate music from text and optional melody
Runtime error

Agents

Featured

816

Audioldm Text To Audio Generation

🔊

Generate audio from text descriptions
Runtime error

Agents

Featured

307

AudioLDM2 Text2Audio Text2Music Generation

🔊

Generate audio and waveform video from text
Runtime error

Agents

Featured

221

AudioSep

🐠

Running

Agents

Featured

172

Lp Music Caps

🎵

Generate captions for music audio
Runtime error

Agents

315

Tortoise Tts

🐢

Expressive Text-to-Speech
Runtime error

Agents

22

All In One

📊

Runtime error

Agents

Featured

2.77k

XTTS

🐸

Generate speech from text using a reference voice
Paused

Agents

189

Coqui Bark Voice Cloning

🐸

Build error

Agents

365

VALL E X

🎙

Generate audio from text using voice prompts
Sleeping

Featured

193

WavJourney

🔥

Paused

Agents

Featured

263

Music To Image

🎶

Runtime error

Agents

Featured

277

MMS

🌍

Transform and identify speech with MMS
Running

Agents

Featured

624

ElevenLabs TTS

🗣

Generate spoken audio from text using selectable voices
Build error

Agents

288

AudioGPT

🚀

Build error

Agents

Featured

2.37k

Bark

🐶

Generate realistic audio from text
Runtime error

Agents

37

SpeechT5 Speech Recognition Demo

👩

Runtime error

Agents

172

CoquiTTS (Official)

🐸

Running on Zero

Agents

Featured

2.76k

Whisper

📉

Transcribe audio files into text
Running on CPU Upgrade

Agents

669

Moe TTS

😊

Generate and convert voice using text and audio inputs
Build error

Agents

17

YourTTS

🔥

Running

Agents

Featured

564

Talking Face Generation with Multilingual TTS

👄

Generate multilingual talking-face videos from your text
Runtime error

Agents

560

OpenAI TTS New

📊

Build error

Agents

Featured

166

Mustango

🐢

Runtime error

Agents

Featured

55

OWSM Demo

🔊

Running on L4

Agents

Featured

731

StyleTTS 2

🗣

Efficient, fast, and natural text to speech with StyleTTS 2!
Running on T4

Agents

396

HierSpeech++ (Zero-shot TTS)

⚡

Generate high-quality speech from text using a prompt audio
Runtime error

Agents

21

Video2music

📚

Generate music for a video based on its content and key
Runtime error

Agents

187

Whisper Large V2

🤫

Paused

Agents

63

Musicgen Prompt Upsampling

🌖

Generate music from text prompts 🎶
Runtime error

Featured

517

Seamless M4T v2

📞

Translate speech and text between languages
Paused

326

Seamless Streaming

📞

Translate text between languages
Build error

Agents

53

Matcha TTS

🍵

Generate speech from text with speaker selection
Running on Zero

MCP

Featured

295

MusicGen Streaming

🔥

Generate music from text descriptions in real-time
Running on T4

Agents

470

Resemble Enhance

🚀

Enhance and denoise your audio files
Runtime error

Agents

262

Singing Voice Conversion

🎼

Transform your voice into a singer's
Build error

Agents

52

NaturalSpeech2

🎧

Generate speech with cloned timbre
Paused

Agents

22

Create Your Own TTS Dataset

🔥

Runtime error

Agents

Podcast Transcription

🐢
Running

Agents

Featured

1.13k

OpenVoice

🤗

Generate speech in a cloned voice from a short audio sample
Runtime error

Agents

Featured

94

M2UGen Demo

💻

Runtime error

Agents

Featured

68

Pheme

📊

Build error

Agents

6

ESPnet2 TTS

📈

Convert text to speech in English, Chinese, or Japanese
Running

Agents

44

Whisper-WebUI

🚀

Generate subtitles and translate audio files
Running

MCP

Featured

177

Image2SFX Comparison

👂

Generates audio environment from an image
Runtime error

Agents

Featured

379

WhisperSpeech

🌬

Build error

Featured

144

MetaVoice 1B

🗣

A demo of MetaVoice 1B, a new TTS model by MetaVoice.
Running on CPU Upgrade

Agents

Featured

954

TTS Arena V2

🏆

Vote on the latest TTS models!
Running

Agents

Featured

179

Whisper Speech X DreamTalk

😽

Combine voice cloning and portrait lipsync animation
Runtime error

Agents

Featured

197

Canary 1b

🐤

Transcribe and translate audio into text
Running on Zero

MCP

Featured

83

SALMONN Audio Questioning

⚡

Deeply interrogate audio file content
Running on Zero

Agents

Featured

476

MeloTTS

🗣

Fast, efficient, & multilingual text-to-speech
Runtime error

Agents

Featured

328

Audio Editing

🎧

Edit audio with text prompts
Runtime error

Agents

18

ChatMusician

💻

Runtime error

74

xVASynth TTS

🧝

CPU powered, low RTF, emotional, multilingual TTS
Configuration error

Agents

Featured

178

NaturalSpeech3 FACodec

🏃

Convert and reconstruct speech files
Runtime error

Agents

25

Hey Gemma

☎

Running

70

Ratchet + Whisper

🗣

Convert audio to text
Runtime error

Agents

3

AutoSubs

📜

Automatically add on-screen subs to your videos
Build error

Agents

161

VoiceCraft

📈

Running on Zero

Agents

326

TangoFlux

🚀

Text to Audio (Sound SFX) Generator
Running on Zero

Agents

Featured

848

Parler-TTS

🥖

High-fidelity Text-To-Speech
Runtime error

Agents

Featured

184

Sing an idea ➡️ Music

🔥

Bring song ideas to life
Configuration error

Agents

75

Musicgen Songstarter Demo

👁

Generate music from text and optional melody
Runtime error

Agents

145

Whisper JAX

👀

Transcribe or translate audio from microphone, file, or YouTube
Runtime error

Agents

23

AudioLCM

🏢

Generate audio from text
Running on Zero

Agents

Featured

164

Stable Audio Live Multiplayer

💻

Generate custom audio from text prompts
Running on Zero

Agents

468

Stable Audio Open Zero

🔥

Generate immersive audio from text prompts
Build error

Agents

14

Make An Audio 3

🐠

Generate audio from text prompts
Paused

Agents

60

Mars5 Space

📉

Configuration error

Agents

5

Tango Music AF

🎵

Text to Music Generator
Runtime error

Agents

16

Jam

🐠

Generate a song from lyrics and style reference
Runtime error

Agents

Featured

114

BigVGAN

🔊

Generate high‑quality audio from your input file with BigVGAN
Runtime error

Agents

90

SenseVoice

🐠

Transcribe audio with emotions and events
Runtime error

Agents

27

PicoAudio

📈

Generate audio from text descriptions with timestamps
Build error

Agents

7

Audio Flamingo Demo

📚

Runtime error

Agents

29

MusiConGen

🪩

Runtime error

Agents

20

Mms Zeroshot

🌍

Transcribe audio in any language using text data
Running on Zero

Agents

237

GPT SoVITS V2 Pro Plus

🤗

Generate speech from text using a reference voice
Runtime error

Agents

275

EzAudio

🟣

Generate or edit realistic audio from text prompts
Build error

Agents

214

OpenMusic

🎶

Generate music from text descriptions
Running on Zero

Agents

Featured

582

Midi Music Generator

🎼

Generate MIDI music from prompts
Running on Zero

Agents

1.02k

Whisper Turbo

🤯

Transcribe or translate audio and YouTube videos to text
Runtime error

Agents

Featured

346

Realtime Whisper Turbo

🤯

Realtime implementation of Whisper large turbo
Running

173

Whisper Large V3 Turbo WebGPU

🚀

ML-powered speech recognition directly in your browser
Runtime error

Agents

Featured

697

Fish Audio S1

🏆

Convert text to natural-sounding speech audio
Running

Agents

484

TTS Spaces Arena

🤗

Blind vote on HF TTS models!
Paused

Agents

19

Diva Realtime Chat

🗣

Generate text responses from audio input
Running on Zero

Agents

Featured

2.87k

F5-TTS

🗣

F5-TTS & E2-TTS: Zero-Shot Voice Cloning (Unofficial Demo)
Configuration error

Agents

259

MaskGCT TTS Demo

😻

MaskGCT TTS Demo
Configuration error

Agents

162

MelodyFlow

🎵

Generate or edit music from text and optional audio
Running on L40S

Agents

Featured

148

Fish Agent

💬

An end-to-end (e2e) Voice Language Model by Fish Audio.
Running

Agents

67

Nexa Omni Demo

🎧

Generate text from uploaded or recorded audio
Running on Zero

Agents

Featured

3.32k

Kokoro TTS

❤

Upgraded to v1.0!
Running

Agents

133

Make Custom Voices With KokoroTTS

⚡

Make Custom Voices With KokoroTTS
Running on Zero

Agents

314

Llasa 3b Tts

🔥

Zero Shot voice cloning with llasa 3b (Unofficial Demo)
Runtime error

Agents

12

Llasa 1b Multilingual TTS

🌍

Generate speech from text with or without cloning a voice
Running

Featured

360

Kokoro Text-to-Speech (WebGPU)

🗣

High-quality speech synthesis powered by Kokoro TTS
Running on Zero

MCP

Featured

42

Hibiki Simple

👄

High-Fidelity Simultaneous Speech-To-Speech Translation
Configuration error

Agents

Featured

413

Zonos

🌍

Generate expressive speech audio from text with custom voice
Running

88

Kokoro Web

🗣

ML-powered speech synthesis directly in your browser
Running on Zero

Agents

Featured

688

Di♪♪Rhythm

🎶

Blazingly Fast and Embarrassingly Simple Song Generation
Running

Agents

23

Audiobox Aesthetics

📚

Demo for audiobox-aesthetics
Paused

Agents

Featured

229

Spark TTS

🌖

A text-to-speech model powered by SparkAudio and Mobvoi.
Configuration error

Agents

Featured

862

Sesame CSM

🌱

Conversational speech generation
Running on Zero

Agents

Featured

248

Orpheus TTS

🚀

Try Orpheus TTS here
Running on Zero

Agents

44

Canary 1B Flash

🐤

Canary 1B Flash demo
Runtime error

Agents

216

IndexTTS: An Industrial-Level Controllable and Efficient Zero-Shot Text-To-Speech System

🎙

Generate speech from text using a reference audio
Paused

Agents

6

AudioMorphix

🌊

Prepare environment and run Gradio app
Runtime error

Agents

93

MegaTTS3 Demo

👋

Running on Zero

Agents

Featured

170

AudioX

👀

Generate audio from text, video, or audio prompts
Runtime error

Agents

Featured

100

Vevo for Zero-shot VC, TTS, and More

🐠

Controllable Zero-Shot Voice Imitation
Running on Zero

Agents

Featured

1.78k

Dia 1.6B

👯

Generate realistic dialogue from a script, using Dia!
Runtime error

Agents

43

Aero 1 Audio Demo

💬

Demo for Aero-1-Audio
Runtime error

Agents

44

Voila Demo

💻

Chat with a voice-clone AI
Running on Zero

Agents

Featured

660

ACE Step

😻

A Step Towards Music Generation Foundation Model
Configuration error

Agents

2

Audio Difficulty Estimator

🎹

Estimate piano difficulty from audio
Running on Zero

MCP

Featured

119

TIGER Audio Extractor

✂

Extraction & Reconstruction for Efficient Speech Separation
Configuration error

Agents

18

Music2emo

📊

Towards Unified Music Emotion Recognition across Dimensional
Runtime error

Agents

13

SonicVerse

🖼

Generate detailed music descriptions from audio clips
Running on Zero

MCP

Featured

46

Auffusion

😻

Audio Gen, Audio Style Transfer and Audio InPainting
Running on Zero

MCP

Featured

1.74k

Chatterbox TTS

🍿

Expressive Zeroshot TTS
Paused

Agents

120

PlayDiffusion

🎨

Generate modified audio from text and voice
Paused

Agents

2

Voice Clone Arena

🏆

Vote on the latest Voice Clone TTS models!
Running

Featured

234

Conversational WebGPU

🚀

Running on L40S

Featured

733

Song Generation

🎵

Generate a song from your lyrics and description
Runtime error

Agents

74

NotaGen

📊

Generate classical sheet music in ABC notation
Build error

Agents

Featured

101

Audio Flamingo 3 Demo

🚀

Audio Flamingo 3 Demo
Paused

Agents

Featured

33

Audio Flamingo 3 Chat

🐠

Audio Flamingo 3 demo for multi-turn multi-audio chat
Running on Zero

Agents

6

MSR UTMOS

🐢

Multiple sampling rate MOS prediction with SFI conv
Configuration error

MCP

Featured

399

Higgs Audio Demo

🎤

Higgs Audio Demo
Running on Zero

Agents

31

sidon_demo_beta

🐋

Speech restoration demo of Sidon.
Paused

Agents

Featured

68

Canary 1b V2

🐤

Transcribe and Translate in 25 European Languages
Running on Zero

Agents

30

SonicMaster – Text-Guided Music Restoration & Mastering

🎧

Enhance audio with text‑guided restoration and mastering
Runtime error

Agents

6

OLMoASR

🌍

Open Models and Data for Training Robust Speech Recognition
Runtime error

Agents

Featured

85

VibeVoice-Large

🏃

Generate podcast audio from a script and voice samples
Running on T4

Agents

10

TaDiCodec TTS AR Qwen2.5 0.5B

📚

Generate speech from text with voice cloning
Sleeping

Agents

8

EchoX

🔥

An end-to-end speech large language model.
Running on Zero

Agents

44

VoxCPM 0.5B

🐢

Generate expressive speech from text with optional voice cloning
Sleeping

Agents

34

FireRedTTS2

🔥

Long-form multi-speaker dialogue generation
Running on Zero

Agents

13

FireRedASR

🚀

FireRedASR Demo
Running on Zero

Agents

800

IndexTTS 2 Demo

🏢

Generate expressive speech from text and voice prompts
Configuration error

Agents

22

SongFormer

🎵

State-of-the-art music analysis with multi-scale datasets
Configuration error

Agents

26

Voice Acting TTS

🎭

TTS for any emotion, now with non-verbal sounds!
Paused

240

Omnilingual ASR Media Transcription

🌍

Transcribe audio/video files into text instantly
Running on Zero

Agents

184

Music Flamingo

🎵

Analyze music and answer questions from audio or YouTube links
Paused

MCP

Featured

118

Maya1

📉

Demo of our new open source model maya1
Running

Featured

220

Supertonic (TTS)

⚡

Lightning-Fast, On-Device TTS
Running on Zero

Agents

Featured

76

Dia2 2B

💨

Streaming conversational audio in realtime
Running on Zero

Agents

Featured

185

VibeVoice-Realtime-0.5B

🐨

Generate natural speech from text with selectable voices
Sleeping

Agents

1

Count The Notes

🎵

Convert audio to MIDI
Runtime error

Agents

1

SpeechJudge GRM

📈

Evaluate naturalness of two audio files
Running on Zero

MCP

Featured

498

Chatterbox Turbo Demo

⚡

Chatterbox Turbo Demo
Running on Zero

MCP

Featured

147

Soprano TTS

🗣

Now with upgraded v1.1 model!
Running on Zero

Agents

Featured

1.93k

Qwen3-TTS Demo

🎙

Generate speech from text via voice design, cloning, or presets
Running on Zero

Agents

Featured

136

Qwen3-ASR Demo

🎙

Transcribe audio to text with multilingual support
Paused

Agents

Featured

155

Voxtral Mini Realtime

🎤

Transcribe speech to text instantly in real time
Running on Zero

Agents

Featured

544

ACE-Step v1.5

🎵

Music Generation Foundation Model v1.5
Running

Featured

93

Parakeet STT Progressive Transcription

🎤

Transcribe speech to text instantly with WebGPU acceleration
Running on A10G

Featured

237

faster-qwen3-tts

🎙

Generate natural speech from text using custom or cloned voices
Running on Zero

Agents

Featured

157

Fish Audio S2 Pro

🐟

Zero GPU Text-to-Speech using Fish Audio S2 Pro
Running on Zero

Agents

85

TADA

🎵

Generate speech in a chosen voice from text and audio prompt
Running

Featured

126

Voxtral Realtime WebGPU

💬

Real-time speech transcription, entirely in your browser.
Running on Zero

Agents

Featured

26

TADA — Text-Acoustic Dual Alignment for Speech

🗯

Speech generation from text and acoustic reference
Running on Zero

Agents

Featured

67

Foundation 1

🚀

Generate custom music clips from text prompts
Running on Zero

Agents

Featured

37

LongCat AudioDiT 3.5B

🐱

Generate speech from text and clone voices instantly
Running

Agents

Featured

477

VoxCPM Demo

🎙

VoxCPM2 Nano-vLLM Demo
Running on Zero

Agents

Featured

896

OmniVoice

🌍

High-quality voice cloning TTS for 600+ languages
Running on Zero

Agents

Featured

34

Audio Flamingo Next

🔊

Answer questions about uploaded audio or YouTube videos
Running on A100

Featured

35

MiMo V2.5 ASR

🦀

Leading ASR models from Xiaomi MiMo
Running

Featured

173

Supertonic 3 (TTS)

⚡

Lightning-Fast, On-Device, Multilingual, Accurate TTS
Running on Zero

Agents

3

GibbsTTS Demo

🎙

Zero-shot voice cloning TTS (EN/ZH) — GibbsTTS demo
Running on Zero

Agents

78

DramaBox

🎭

Expressive TTS with voice cloning — DramaBox demo
Running on Zero

Agents

24

Stable Audio 3

🎵

Text-to-audio with SA3 Medium / Small Music / Small SFX.
