9892c69b8ce660332f64cf05e2154f17
This model is a fine-tuned version of google-t5/t5-3b on the Helsinki-NLP/opus_books [en-es] dataset. It achieves the following results on the evaluation set:
- Loss: 1.0776
- Data Size: 1.0
- Epoch Runtime: 1021.1650
- Bleu: 15.6416
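A minimal inference sketch, assuming the checkpoint can be loaded from the Hub under the repository id listed in the model tree (`contemmcm/9892c69b8ce660332f64cf05e2154f17`). The task prefix `translate English to Spanish: ` and the helper names `build_prompt` and `translate` are assumptions for illustration; this card does not state the input format used during fine-tuning, so adjust the prefix if the training script used a different one.

```python
MODEL_ID = "contemmcm/9892c69b8ce660332f64cf05e2154f17"

# Assumed task prefix: the card does not document the training input format.
PREFIX = "translate English to Spanish: "


def build_prompt(text: str, prefix: str = PREFIX) -> str:
    """Prepend the (assumed) task prefix to one English sentence."""
    return prefix + text.strip()


def translate(sentences, model_id: str = MODEL_ID, max_new_tokens: int = 128):
    """Translate a list of English sentences into Spanish.

    Downloads the 3B-parameter checkpoint on first use.
    """
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

    inputs = tokenizer(
        [build_prompt(s) for s in sentences],
        return_tensors="pt",
        padding=True,
        truncation=True,
    )
    outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)
```

Calling `translate(["The book is on the table."])` returns a list with one decoded string per input sentence. Because the base model has about 3 billion parameters, loading it on CPU is slow; passing a `torch_dtype` or `device_map` to `from_pretrained` may be worthwhile on GPU hardware.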
Model description
More information needed
Intended uses & limitations
More information needed
Training and evaluation data
More information needed
Training procedure
Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 32
- total_eval_batch_size: 32
- optimizer: adamw_torch with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
- lr_scheduler_type: constant
- num_epochs: 50
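The hyperparameters above can be sketched as keyword arguments for `Seq2SeqTrainingArguments`, together with the arithmetic behind the total batch sizes. This is an approximation of the configuration, not the original training script: `output_dir`, `predict_with_generate=True`, and the helper names `total_batch_size` and `make_training_args` are assumptions, and the multi-GPU setup (4 devices) comes from the launcher rather than from these arguments.

```python
# Values copied from the hyperparameter list; keys use Seq2SeqTrainingArguments names.
HPARAMS = {
    "learning_rate": 5e-5,
    "per_device_train_batch_size": 8,
    "per_device_eval_batch_size": 8,
    "seed": 42,
    "optim": "adamw_torch",
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "adam_epsilon": 1e-8,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 50,
}
NUM_DEVICES = 4  # num_devices from the card (distributed_type: multi-GPU)


def total_batch_size(per_device: int, num_devices: int, grad_accum_steps: int = 1) -> int:
    """Effective batch size per optimizer step across all devices."""
    return per_device * num_devices * grad_accum_steps


def make_training_args(output_dir: str = "t5-3b-opus-books-en-es"):
    """Build training arguments; output_dir is a hypothetical path."""
    # Imported lazily so the arithmetic above runs without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # assumption: BLEU is computed on generated text
        **HPARAMS,
    )


print(total_batch_size(HPARAMS["per_device_train_batch_size"], NUM_DEVICES))
```

With 8 examples per device on 4 devices and no gradient accumulation, each optimizer step sees 8 × 4 = 32 examples, which matches the reported `total_train_batch_size` and `total_eval_batch_size`.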
Training results
| Training Loss | Epoch | Step | Validation Loss | Data Size | Epoch Runtime | Bleu |
|---|---|---|---|---|---|---|
| No log | 0 | 0 | 2.2426 | 0 | 70.9220 | 4.7059 |
| No log | 1 | 2336 | 1.6327 | 0.0078 | 80.3382 | 12.1482 |
| 0.0262 | 2 | 4672 | 1.5494 | 0.0156 | 89.5344 | 13.7689 |
| 0.0377 | 3 | 7008 | 1.4785 | 0.0312 | 106.3581 | 13.5768 |
| 1.5506 | 4 | 9344 | 1.4069 | 0.0625 | 135.9375 | 14.5184 |
| 1.4786 | 5 | 11680 | 1.3373 | 0.125 | 200.6290 | 14.4053 |
| 1.358 | 6 | 14016 | 1.2428 | 0.25 | 316.6712 | 14.7558 |
| 1.2637 | 7 | 16352 | 1.1627 | 0.5 | 564.0895 | 15.3167 |
| 1.1517 | 8 | 18688 | 1.0868 | 1.0 | 1026.8957 | 15.4860 |
| 1.0093 | 9 | 21024 | 1.0508 | 1.0 | 1030.7444 | 15.4056 |
| 0.917 | 10 | 23360 | 1.0370 | 1.0 | 1020.0794 | 15.9829 |
| 0.848 | 11 | 25696 | 1.0415 | 1.0 | 1046.6410 | 15.7640 |
| 0.7567 | 12 | 28032 | 1.0429 | 1.0 | 1053.9459 | 15.8170 |
| 0.6874 | 13 | 30368 | 1.0575 | 1.0 | 1072.1412 | 15.5004 |
| 0.643 | 14 | 32704 | 1.0776 | 1.0 | 1021.1650 | 15.6416 |
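The evaluation results reported at the top of this card are those of the last logged epoch (14), which is not the best row in the table. The sketch below, with values copied from the table and hypothetical helper names, finds the epoch with the lowest validation loss and the epoch with the highest BLEU.

```python
# (epoch, validation_loss, bleu) copied from the training results table.
RESULTS = [
    (0, 2.2426, 4.7059),
    (1, 1.6327, 12.1482),
    (2, 1.5494, 13.7689),
    (3, 1.4785, 13.5768),
    (4, 1.4069, 14.5184),
    (5, 1.3373, 14.4053),
    (6, 1.2428, 14.7558),
    (7, 1.1627, 15.3167),
    (8, 1.0868, 15.4860),
    (9, 1.0508, 15.4056),
    (10, 1.0370, 15.9829),
    (11, 1.0415, 15.7640),
    (12, 1.0429, 15.8170),
    (13, 1.0575, 15.5004),
    (14, 1.0776, 15.6416),
]


def best_by_loss(rows):
    """Row with the lowest validation loss."""
    return min(rows, key=lambda r: r[1])


def best_by_bleu(rows):
    """Row with the highest BLEU score."""
    return max(rows, key=lambda r: r[2])


best_loss = best_by_loss(RESULTS)
best_bleu = best_by_bleu(RESULTS)
final = RESULTS[-1]

print(f"lowest validation loss: epoch {best_loss[0]} ({best_loss[1]:.4f})")
print(f"highest BLEU:           epoch {best_bleu[0]} ({best_bleu[2]:.4f})")
print(f"final epoch {final[0]}: loss +{final[1] - best_loss[1]:.4f}, "
      f"BLEU -{best_bleu[2] - final[2]:.4f} relative to the best rows")
```

Both criteria select epoch 10 (validation loss 1.0370, BLEU 15.9829). After that point the training loss keeps falling while the validation loss rises, which is consistent with overfitting, so a checkpoint from around epoch 10 may be preferable to the final one if it was saved.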
Framework versions
- Transformers 4.57.0
- Pytorch 2.8.0+cu128
- Datasets 4.2.0
- Tokenizers 0.22.1
Model tree for contemmcm/9892c69b8ce660332f64cf05e2154f17
- Base model: google-t5/t5-3b