speecht5_tts_en_emotion

This model is a fine-tuned version of microsoft/speecht5_tts. The training dataset is not specified (the card records it as None). On the evaluation set it reaches a final validation loss of 0.0527 at step 10000.

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.0001
train_batch_size: 24
eval_batch_size: 24
seed: 3407
gradient_accumulation_steps: 4
total_train_batch_size: 96
optimizer: AdamW, fused PyTorch implementation (OptimizerNames.ADAMW_TORCH_FUSED), with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: cosine
lr_scheduler_warmup_steps: 1000
training_steps: 10000
mixed_precision_training: Native AMP
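
The batch-size and learning-rate settings above can be checked by hand. The sketch below, in plain Python, recomputes the total train batch size and evaluates a linear-warmup, cosine-decay schedule of the kind usually meant by lr_scheduler_type: cosine. It assumes a single training device and a single half-cosine decay to zero, neither of which the card states. The helper name cosine_lr is hypothetical and used only for illustration.

```python
import math

# Values copied from the hyperparameter list above.
LEARNING_RATE = 1e-4
TRAIN_BATCH_SIZE = 24
GRAD_ACCUM_STEPS = 4
WARMUP_STEPS = 1000
TRAINING_STEPS = 10000

# Assumption: one device. With N devices the product is multiplied by N.
NUM_DEVICES = 1
effective_batch = TRAIN_BATCH_SIZE * GRAD_ACCUM_STEPS * NUM_DEVICES


def cosine_lr(step: int) -> float:
    """Learning rate at a given optimizer step (hypothetical helper).

    Linear warmup from 0 to LEARNING_RATE over WARMUP_STEPS, followed by
    a half-cosine decay that reaches 0 at TRAINING_STEPS.
    """
    if step < WARMUP_STEPS:
        return LEARNING_RATE * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TRAINING_STEPS - WARMUP_STEPS)
    return LEARNING_RATE * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


if __name__ == "__main__":
    print("effective batch size:", effective_batch)
    for step in (0, 500, 1000, 5500, 10000):
        print(step, f"{cosine_lr(step):.2e}")
```

Under these assumptions the peak rate of 0.0001 is reached at step 1000, so the first logged evaluation (step 500) falls inside the warmup phase.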

Training Loss	Epoch	Step	Validation Loss
0.4114	5.32	500	0.0788
0.3822	10.64	1000	0.0693
0.3408	15.96	1500	0.0687
0.3143	21.2773	2000	0.0647
0.3192	26.5973	2500	0.0644
0.3198	31.9173	3000	0.0601
0.3057	37.2347	3500	0.0600
0.2992	42.5547	4000	0.0605
0.305	47.8747	4500	0.0598
0.2823	53.192	5000	0.0566
0.2524	58.512	5500	0.0567
0.2573	63.832	6000	0.0563
0.2582	69.1493	6500	0.0556
0.2591	74.4693	7000	0.0558
0.2508	79.7893	7500	0.0540
0.2494	85.1067	8000	0.0531
0.2404	90.4267	8500	0.0536
0.2235	95.7467	9000	0.0532
0.235	101.064	9500	0.0531
0.2421	106.384	10000	0.0527
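
The log above can be summarized with a few lines of plain Python. The sketch below copies the table into a string, selects the row with the lowest validation loss, and estimates optimizer steps per epoch from the Step and Epoch columns. All variable names are illustrative, and the dataset-size figure is only a rough estimate that assumes the total batch size of 96 applies to every step.

```python
# Rows copied from the log above: training loss, epoch, step, validation loss.
LOG = """\
0.4114 5.32 500 0.0788
0.3822 10.64 1000 0.0693
0.3408 15.96 1500 0.0687
0.3143 21.2773 2000 0.0647
0.3192 26.5973 2500 0.0644
0.3198 31.9173 3000 0.0601
0.3057 37.2347 3500 0.0600
0.2992 42.5547 4000 0.0605
0.305 47.8747 4500 0.0598
0.2823 53.192 5000 0.0566
0.2524 58.512 5500 0.0567
0.2573 63.832 6000 0.0563
0.2582 69.1493 6500 0.0556
0.2591 74.4693 7000 0.0558
0.2508 79.7893 7500 0.0540
0.2494 85.1067 8000 0.0531
0.2404 90.4267 8500 0.0536
0.2235 95.7467 9000 0.0532
0.235 101.064 9500 0.0531
0.2421 106.384 10000 0.0527
"""

# Parse each row into (train_loss, epoch, step, val_loss).
rows = []
for line in LOG.splitlines():
    train_loss, epoch, step, val_loss = line.split()
    rows.append((float(train_loss), float(epoch), int(step), float(val_loss)))

# Row with the lowest validation loss.
best = min(rows, key=lambda r: r[3])
best_step, best_val_loss = best[2], best[3]

# Optimizer steps per epoch, estimated from the final row.
last = rows[-1]
steps_per_epoch = last[2] / last[1]

# Rough dataset size, assuming 96 examples per optimizer step.
approx_examples = round(steps_per_epoch) * 96

if __name__ == "__main__":
    print("best step:", best_step, "validation loss:", best_val_loss)
    print("steps per epoch:", round(steps_per_epoch, 2))
    print("approximate training examples:", approx_examples)
```

On this log the lowest validation loss occurs at the final step, and the step-to-epoch ratio works out to about 94 optimizer steps per epoch, which would suggest roughly 9,000 training examples under the stated assumption.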

Model size: 0.1B params (Safetensors), tensor type F32
