mozilla-ai/whisper-large-v3-gl


	---
	base_model: openai/whisper-large-v3
	datasets:
	- gl
	language: gl
	library_name: transformers
	license: apache-2.0
	model-index:
	- name: Finetuned openai/whisper-large-v3 on Galician
	  results:
	  - task:
	      type: automatic-speech-recognition
	      name: Speech-to-Text
	    dataset:
	      name: Common Voice (Galician)
	      type: common_voice
	    metrics:
	    - type: wer
	      value: 5.143
	---

	# Finetuned openai/whisper-large-v3 on 116954 Galician training audio samples from cv-corpus-21.0-2025-03-14/gl.

	This model was created from the Mozilla.ai Blueprint:
	[speech-to-text-finetune](https://github.com/mozilla-ai/speech-to-text-finetune).
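
A minimal usage sketch, assuming the checkpoint is published under the repository id shown in this page's header (`mozilla-ai/whisper-large-v3-gl`) and loads through the standard `transformers` ASR pipeline. The helper name `transcribe`, the 30-second chunking, and the file name in the usage comment are illustrative choices, not part of the original card.

```python
# Repository id taken from the page header; treat it as an assumption.
MODEL_ID = "mozilla-ai/whisper-large-v3-gl"


def transcribe(audio_path: str) -> str:
    """Transcribe one audio file with the finetuned checkpoint (hypothetical helper)."""
    # Imported lazily so this module can be loaded without transformers installed.
    from transformers import pipeline

    asr = pipeline(
        "automatic-speech-recognition",
        model=MODEL_ID,
        chunk_length_s=30,  # split long recordings into 30-second windows
    )
    # Ask Whisper to transcribe (not translate) and to decode as Galician.
    result = asr(
        audio_path,
        generate_kwargs={"language": "galician", "task": "transcribe"},
    )
    return result["text"]


# Usage, with a hypothetical local file:
# print(transcribe("sample.wav"))
```

The pipeline is built inside the function to keep the sketch short; for batch transcription it would make more sense to create it once and reuse it.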

	## Evaluation results on 29239 audio samples of Galician:

	### Baseline model (before finetuning) on Galician
	- Word Error Rate (Normalized): 20.140
	- Word Error Rate (Orthographic): 25.293
	- Character Error Rate (Normalized): 7.427
	- Character Error Rate (Orthographic): 6.224
	- Loss: 1.905

	### Finetuned model (after finetuning) on Galician
	- Word Error Rate (Normalized): 5.143
	- Word Error Rate (Orthographic): 8.320
	- Character Error Rate (Normalized): 1.865
	- Character Error Rate (Orthographic): 2.446
	- Loss: 0.126
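
To put the two Common Voice result lists side by side, the sketch below computes the relative reduction of each error rate from the figures reported above. The dictionary keys and the helper `relative_reduction` are names introduced here for illustration only.

```python
def relative_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    return 100.0 * (before - after) / before


# Error rates copied from the baseline and finetuned lists above.
baseline = {"wer_norm": 20.140, "wer_ortho": 25.293, "cer_norm": 7.427, "cer_ortho": 6.224}
finetuned = {"wer_norm": 5.143, "wer_ortho": 8.320, "cer_norm": 1.865, "cer_ortho": 2.446}

reductions = {k: relative_reduction(baseline[k], finetuned[k]) for k in baseline}

for metric, value in reductions.items():
    print(f"{metric}: {value:.1f}% lower after finetuning")
```

On these numbers the normalized WER falls by roughly three quarters relative to the baseline, and every metric falls by at least 60%.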
	### Finetuned model (after finetuning) on the Galician FLEURS test set (total of 927 samples)
	- Word Error Rate (Normalized): 9.804
	- Word Error Rate (Orthographic): 13.147
	- Character Error Rate (Normalized): 5.827
	- Character Error Rate (Orthographic): 5.007
	- Loss: 0.383
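
The card reports each error rate in a "Normalized" and an "Orthographic" variant without describing the normalization. As a rough illustration, the sketch below assumes that "orthographic" means scoring the raw text and "normalized" means scoring after lowercasing and stripping punctuation. The `normalize` function here is a simplified stand-in, not the normalizer used to produce the numbers above, and the example sentence is made up.

```python
import re
import string


def edit_distance(ref, hyp) -> int:
    """Levenshtein distance between two sequences (words or characters)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def normalize(text: str) -> str:
    """Assumed normalization: lowercase, drop punctuation, collapse whitespace."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation + "¿¡"))
    return re.sub(r"\s+", " ", text).strip()


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate in percent."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate in percent."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


# Made-up example: the hypothesis differs only in casing and punctuation.
reference = "Bos días, Galicia!"
hypothesis = "bos días galicia"

orthographic_wer = wer(reference, hypothesis)
normalized_wer = wer(normalize(reference), normalize(hypothesis))
```

In this example every word counts as an error orthographically, while the normalized score is zero, which is one reason an orthographic WER can sit well above the normalized one.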