---
base_model: openai/whisper-large-v3
datasets:
- gl
language: gl
library_name: transformers
license: apache-2.0
model-index:
- name: Finetuned openai/whisper-large-v3 on Galician
  results:
  - task:
      type: automatic-speech-recognition
      name: Speech-to-Text
    dataset:
      name: Common Voice (Galician)
      type: common_voice
    metrics:
    - type: wer
      value: 5.143
---
# Finetuned openai/whisper-large-v3 on 116954 Galician training audio samples from cv-corpus-21.0-2025-03-14/gl.
This model was created from the Mozilla.ai Blueprint:
[speech-to-text-finetune](https://github.com/mozilla-ai/speech-to-text-finetune).
## Evaluation results on 29239 audio samples of Galician:
### Baseline model (before finetuning) on Galician
- Word Error Rate (Normalized): 20.140
- Word Error Rate (Orthographic): 25.293
- Character Error Rate (Normalized): 7.427
- Character Error Rate (Orthographic): 6.224
- Loss: 1.905
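
The "Normalized" and "Orthographic" variants of each metric differ only in how the text is prepared before scoring: orthographic scores compare the raw transcripts, while normalized scores compare them after text normalization. The following is a minimal sketch of that distinction using a word-level edit distance. The `normalize` function (lowercasing and stripping punctuation) is an assumption made for illustration and is probably simpler than the normalizer used by the Blueprint; the example sentences are invented.

```python
import re
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def normalize(text: str) -> str:
    # Assumed normalization: lowercase, drop punctuation, collapse whitespace.
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent, relative to the reference length."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


# Invented example pair (not taken from the evaluation data).
reference = "Ola, que tal vai todo?"
hypothesis = "ola que tal vai todos"

orthographic_wer = wer(reference, hypothesis)
normalized_wer = wer(normalize(reference), normalize(hypothesis))
print(f"Orthographic WER: {orthographic_wer:.1f}")
print(f"Normalized WER:   {normalized_wer:.1f}")
```

In this toy pair, casing and punctuation differences count as errors only in the orthographic score. The same edit distance applied to characters instead of words gives a character error rate.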
### Finetuned model (after finetuning) on Galician
- Word Error Rate (Normalized): 5.143
- Word Error Rate (Orthographic): 8.320
- Character Error Rate (Normalized): 1.865
- Character Error Rate (Orthographic): 2.446
- Loss: 0.126
### Finetuned model (after finetuning) on the Galician FLEURS test set (total of 927 samples)
- Word Error Rate (Normalized): 9.804
- Word Error Rate (Orthographic): 13.147
- Character Error Rate (Normalized): 5.827
- Character Error Rate (Orthographic): 5.007
- Loss: 0.383
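
To compare the baseline and finetuned results on the 29239 Common Voice evaluation samples, the relative error reduction can be computed directly from the figures reported above. This is a small sketch of that arithmetic; the helper name `relative_reduction` and the dictionary keys are illustrative and not part of the Blueprint.

```python
# Metrics copied from the Common Voice evaluation results reported above.
baseline = {
    "wer_normalized": 20.140,
    "wer_orthographic": 25.293,
    "cer_normalized": 7.427,
    "cer_orthographic": 6.224,
}
finetuned = {
    "wer_normalized": 5.143,
    "wer_orthographic": 8.320,
    "cer_normalized": 1.865,
    "cer_orthographic": 2.446,
}


def relative_reduction(before: float, after: float) -> float:
    """Return the relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


reductions = {
    name: relative_reduction(baseline[name], finetuned[name])
    for name in baseline
}

for name, value in reductions.items():
    print(f"{name}: {value:.1f}% relative reduction")
```

By this calculation, the normalized word error rate drops by roughly 74% relative to the baseline. No baseline is reported for the FLEURS test set, so the same comparison cannot be made there.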