# secret-model-stage-1-0.6B-32
This model was trained from scratch on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 0.9642
- Centroid Acc: 0.8491
- Centroid Macro F1: 0.8510
- kNN Acc: 0.8868
- kNN Macro F1: 0.8922
- Alignment: 0.6861
- Uniformity: -3.0154
- Combined Score: 0.8647
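The alignment and uniformity numbers above are presumably the embedding-quality metrics of Wang & Isola (2020): alignment is the mean squared distance between positive pairs (lower is better), and uniformity is the log of the mean Gaussian potential over all pairs (more negative means embeddings spread more evenly on the hypersphere). A minimal NumPy sketch under that assumption, with the standard settings alpha=2 and t=2 (the function names are illustrative, not from this repository):

```python
import numpy as np

def alignment(x, y, alpha=2):
    # Mean distance^alpha between positive pairs (x[i], y[i]); lower is better.
    return np.mean(np.linalg.norm(x - y, axis=1) ** alpha)

def uniformity(x, t=2):
    # log E[exp(-t * ||x_i - x_j||^2)] over all distinct pairs; more negative is better.
    sq_dists = np.sum((x[:, None, :] - x[None, :, :]) ** 2, axis=-1)
    iu = np.triu_indices(x.shape[0], k=1)
    return np.log(np.mean(np.exp(-t * sq_dists[iu])))
```

Both metrics assume unit-normalized embeddings; the reported Alignment of 0.6861 and Uniformity of -3.0154 are consistent with that convention.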
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 0.001
- train_batch_size: 16
- eval_batch_size: 64
- seed: 42
- optimizer: AdamW (`adamw_torch`) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.06
- num_epochs: 100.0
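The hyperparameters above map directly onto `transformers.TrainingArguments`. A sketch of that mapping as a plain kwargs dict (the `adamw_torch` optimizer uses betas=(0.9, 0.999) and eps=1e-8 by default, matching the values listed):

```python
# Hypothetical reconstruction of the run configuration; pass these to
# transformers.TrainingArguments(output_dir=..., **training_kwargs).
training_kwargs = dict(
    learning_rate=1e-3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    seed=42,
    optim="adamw_torch",         # AdamW with default betas/eps, as listed above
    lr_scheduler_type="cosine",
    warmup_ratio=0.06,           # 6% of total steps spent warming up
    num_train_epochs=100.0,
)
```

With 3200 total steps (see the results table), a warmup ratio of 0.06 corresponds to roughly the first 192 steps.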
### Training results
| Training Loss | Epoch | Step | Validation Loss | Centroid Acc | Centroid Macro F1 | kNN Acc | kNN Macro F1 | Alignment | Uniformity | Combined Score |
|---|---|---|---|---|---|---|---|---|---|---|
| No log | 0 | 0 | 2.5120 | 0.4906 | 0.4741 | 0.7170 | 0.7022 | 0.4456 | -0.9957 | 0.5501 |
| 1.8667 | 3.125 | 100 | 1.7566 | 0.6981 | 0.6999 | 0.8679 | 0.8777 | 0.4242 | -1.4639 | 0.7592 |
| 1.4456 | 6.25 | 200 | 1.1371 | 0.8491 | 0.8500 | 0.8491 | 0.8399 | 0.4276 | -1.8594 | 0.8466 |
| 1.1259 | 9.375 | 300 | 1.0450 | 0.8302 | 0.8326 | 0.8302 | 0.8259 | 0.3959 | -1.8022 | 0.8304 |
| 0.6033 | 12.5 | 400 | 0.9749 | 0.8113 | 0.8014 | 0.8113 | 0.8032 | 0.6055 | -2.6825 | 0.8020 |
| 0.6079 | 15.625 | 500 | 0.9847 | 0.8491 | 0.8498 | 0.8868 | 0.8816 | 0.5269 | -2.3363 | 0.8604 |
| 0.4846 | 18.75 | 600 | 0.9544 | 0.8302 | 0.8332 | 0.8491 | 0.8410 | 0.5448 | -2.5200 | 0.8358 |
| 0.3492 | 21.875 | 700 | 0.9976 | 0.8491 | 0.8516 | 0.8113 | 0.8113 | 0.6177 | -2.6891 | 0.8382 |
| 0.2924 | 25.0 | 800 | 1.0358 | 0.8302 | 0.8371 | 0.8302 | 0.8292 | 0.6377 | -2.7912 | 0.8345 |
| 0.2142 | 28.125 | 900 | 1.0408 | 0.8491 | 0.8468 | 0.8491 | 0.8468 | 0.6262 | -2.8568 | 0.8468 |
| 0.1433 | 31.25 | 1000 | 0.9725 | 0.8491 | 0.8519 | 0.8868 | 0.8848 | 0.6383 | -2.9037 | 0.8629 |
| 0.1468 | 34.375 | 1100 | 1.0977 | 0.8491 | 0.8393 | 0.8302 | 0.8201 | 0.6942 | -2.9770 | 0.8329 |
| 0.1229 | 37.5 | 1200 | 1.1407 | 0.7925 | 0.7804 | 0.8491 | 0.8376 | 0.6696 | -2.8946 | 0.7995 |
| 0.0275 | 40.625 | 1300 | 0.8793 | 0.8868 | 0.8853 | 0.8679 | 0.8690 | 0.6394 | -2.8780 | 0.8799 |
| 0.0293 | 43.75 | 1400 | 0.8398 | 0.8679 | 0.8690 | 0.8491 | 0.8527 | 0.6248 | -2.8809 | 0.8636 |
| 0.0189 | 46.875 | 1500 | 0.9692 | 0.8679 | 0.8727 | 0.8868 | 0.8893 | 0.6852 | -3.0108 | 0.8782 |
| 0.0089 | 50.0 | 1600 | 0.9862 | 0.8302 | 0.8414 | 0.8491 | 0.8563 | 0.6540 | -2.9116 | 0.8464 |
| 0.054 | 53.125 | 1700 | 0.9374 | 0.8679 | 0.8730 | 0.8491 | 0.8563 | 0.6712 | -2.9751 | 0.8674 |
| 0.0051 | 56.25 | 1800 | 1.0472 | 0.8302 | 0.8308 | 0.8491 | 0.8563 | 0.6784 | -2.9614 | 0.8393 |
| 0.0118 | 59.375 | 1900 | 1.0015 | 0.8491 | 0.8527 | 0.8491 | 0.8563 | 0.6757 | -2.9613 | 0.8539 |
| 0.0311 | 62.5 | 2000 | 0.8517 | 0.8491 | 0.8527 | 0.8491 | 0.8563 | 0.6774 | -3.0056 | 0.8539 |
| 0.0026 | 65.625 | 2100 | 0.9519 | 0.8679 | 0.8730 | 0.8491 | 0.8563 | 0.6728 | -2.9874 | 0.8674 |
| 0.0017 | 68.75 | 2200 | 0.9554 | 0.8491 | 0.8510 | 0.8491 | 0.8563 | 0.6738 | -2.9841 | 0.8528 |
| 0.0016 | 71.875 | 2300 | 0.9851 | 0.8491 | 0.8510 | 0.8491 | 0.8563 | 0.6742 | -2.9753 | 0.8528 |
| 0.0015 | 75.0 | 2400 | 0.9575 | 0.8491 | 0.8510 | 0.8491 | 0.8563 | 0.6742 | -2.9841 | 0.8528 |
| 0.0021 | 78.125 | 2500 | 0.9687 | 0.8491 | 0.8510 | 0.8679 | 0.8756 | 0.6788 | -2.9943 | 0.8592 |
| 0.0019 | 81.25 | 2600 | 0.9789 | 0.8679 | 0.8730 | 0.8868 | 0.8922 | 0.6788 | -2.9937 | 0.8794 |
| 0.0091 | 84.375 | 2700 | 0.9718 | 0.8491 | 0.8510 | 0.8868 | 0.8922 | 0.6807 | -3.0014 | 0.8647 |
| 0.0013 | 87.5 | 2800 | 0.9700 | 0.8491 | 0.8510 | 0.8868 | 0.8922 | 0.6837 | -3.0070 | 0.8647 |
| 0.0013 | 90.625 | 2900 | 0.9731 | 0.8491 | 0.8510 | 0.8868 | 0.8922 | 0.6883 | -3.0182 | 0.8647 |
| 0.0015 | 93.75 | 3000 | 0.9667 | 0.8491 | 0.8510 | 0.8868 | 0.8922 | 0.6875 | -3.0176 | 0.8647 |
| 0.0389 | 96.875 | 3100 | 0.9678 | 0.8491 | 0.8510 | 0.8868 | 0.8922 | 0.6868 | -3.0167 | 0.8647 |
| 0.0009 | 100.0 | 3200 | 0.9642 | 0.8491 | 0.8510 | 0.8868 | 0.8922 | 0.6861 | -3.0154 | 0.8647 |
### Framework versions
- Transformers 4.56.0
- PyTorch 2.8.0+cu128
- Datasets 4.0.0
- Tokenizers 0.22.0