# StutteredSpeechASR Research Demo

A Gradio-based research demonstration showcasing StutteredSpeechASR, a Whisper model fine-tuned specifically for stuttered Mandarin speech recognition. Compare its performance against baseline Whisper models to see how it improves on stuttered speech patterns.
## Features
- StutteredSpeechASR Research: Showcases fine-tuned Whisper model specifically designed for stuttered speech
- Comparative Analysis: Side-by-side comparison with baseline Whisper models
- Audio Input Flexibility: Record via microphone or upload audio files
- Specialized for Stuttered Speech: Better handling of repetitions, prolongations, and blocks
- Clean Interface: Organized model cards with clear transcription results
- Lightweight Deployment: All inference via Hugging Face APIs - no GPU required
## Models Included
| Model | Type | Description |
|---|---|---|
| StutteredSpeechASR | Fine-tuned Research Model | Whisper fine-tuned specifically for stuttered speech (Mandarin) |
| Whisper Large V3 | Baseline Model | OpenAI's Whisper Large V3 model via the HF Inference API |
| Whisper Large V3 Turbo | Baseline Model | OpenAI's Whisper Large V3 Turbo (faster) via the HF Inference API |
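Under the hood, both the dedicated endpoint and the serverless baselines accept raw audio bytes over HTTP and return JSON. A minimal standard-library sketch of such a call (the `transcribe` helper is hypothetical; the `{"text": ...}` response shape follows the usual HF ASR convention, but verify it against your endpoint):

```python
import json
import urllib.request


def auth_headers(api_key: str, content_type: str = "audio/wav") -> dict:
    """Headers for a Hugging Face Inference Endpoint call.

    Assumes WAV input by default; pass a different MIME type for other formats.
    """
    return {"Authorization": f"Bearer {api_key}", "Content-Type": content_type}


def transcribe(endpoint_url: str, api_key: str, audio_bytes: bytes) -> str:
    """POST raw audio bytes to the endpoint and return the transcript.

    HF ASR endpoints conventionally respond with JSON like {"text": "..."}.
    """
    req = urllib.request.Request(
        endpoint_url,
        data=audio_bytes,
        headers=auth_headers(api_key),
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())["text"]
```

A production app would add retries and error handling; `requests` or `huggingface_hub.InferenceClient` would be common alternatives to raw `urllib`.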
## Requirements
- Python 3.9+
- Hugging Face API key
- Docker (optional, for containerized deployment)
## Environment Setup
Create a .env file in the project root with your Hugging Face credentials:
```bash
HF_ENDPOINT=https://your-endpoint-url.aws.endpoints.huggingface.cloud
HF_API_KEY=hf_your_api_key_here
```

| Variable | Description |
|---|---|
| `HF_ENDPOINT` | Your dedicated Hugging Face Inference Endpoint URL for StutteredSpeechASR |
| `HF_API_KEY` | Your Hugging Face API token (get one at huggingface.co/settings/tokens) |
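At startup the app needs both variables, so failing fast on a missing one makes misconfiguration obvious. A small sketch (not the demo's actual code), assuming the `.env` file has already been loaded into the process environment, e.g. via python-dotenv's `load_dotenv()`:

```python
import os


def get_required_env(name: str) -> str:
    """Return an environment variable's value, raising if it is unset or empty."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Example usage at app startup (names from the table above):
# endpoint = get_required_env("HF_ENDPOINT")
# api_key = get_required_env("HF_API_KEY")
```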
## Quick Start

### Option 1: Run with Docker (Recommended)

1. Create your `.env` file with your Hugging Face credentials (see above)
2. Build and run with Docker Compose:
   ```bash
   docker compose up --build
   ```
3. Open your browser and navigate to http://localhost:7860
### Option 2: Run Locally

1. Clone the repository:
   ```bash
   git clone <your-repo-url>
   cd asr_demo
   ```
2. Create a virtual environment (recommended):
   ```bash
   python -m venv venv
   # Windows
   venv\Scripts\activate
   # Linux/macOS
   source venv/bin/activate
   ```
3. Install dependencies:
   ```bash
   pip install -r requirements.txt
   ```
4. Create your `.env` file with your Hugging Face credentials (see Environment Setup above)
5. Run the application:
   ```bash
   python app.py
   ```
6. Open your browser and navigate to http://localhost:7860
## Research Notes
- Target Language: The StutteredSpeechASR model is specifically trained for Mandarin Chinese
- Use Cases: Research demonstration, stuttered speech analysis, comparative ASR evaluation
- Best Results: Use clear audio recordings for optimal model performance
- Baseline Comparison: The Whisper models may struggle with stuttered speech patterns that StutteredSpeechASR handles well
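For the comparative evaluation mentioned above, character error rate (CER) is the standard metric for Mandarin ASR, since word segmentation is ambiguous. A self-contained sketch using a plain edit-distance implementation (a library such as `jiwer` would normally be used in practice):

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: Levenshtein distance / reference length.

    Operates on individual characters, which suits Mandarin transcripts.
    """
    ref, hyp = list(reference), list(hypothesis)
    # Classic dynamic-programming edit distance, one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(
                prev[j] + 1,               # deletion
                curr[j - 1] + 1,           # insertion
                prev[j - 1] + (r != h),    # substitution (0 if chars match)
            ))
        prev = curr
    return prev[-1] / max(len(ref), 1)


# Example: score both models' transcripts against the same reference,
# then compare — the lower CER indicates the better transcription.
```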