train_codealpacapy_456_1765330670

This model is a fine-tuned version of meta-llama/Meta-Llama-3-8B-Instruct on the codealpacapy dataset. It achieves the following results on the evaluation set:

  • Loss: 0.4603
  • Num Input Tokens Seen: 24973864
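
Since PEFT appears in the framework versions below, this checkpoint is presumably a parameter-efficient adapter rather than full model weights. The card does not include usage code; the following is a minimal, hypothetical loading sketch, assuming the adapter is published as rbelanec/train_codealpacapy_456_1765330670, that you have access to the gated meta-llama/Meta-Llama-3-8B-Instruct base weights, and that accelerate is installed for device_map="auto". The prompt and generation settings are illustrative only.

```python
# Minimal sketch (not from the model card): load the base model, then attach
# the PEFT adapter on top of it. Repo ids below are assumptions taken from
# the card's title and base-model reference.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Meta-Llama-3-8B-Instruct"
adapter_id = "rbelanec/train_codealpacapy_456_1765330670"

tokenizer = AutoTokenizer.from_pretrained(base_id)
base_model = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Wrap the frozen base model with the fine-tuned adapter weights.
model = PeftModel.from_pretrained(base_model, adapter_id)

prompt = "Write a Python function that reverses a string."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```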

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 456
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08 (no additional optimizer arguments)
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.1
  • num_epochs: 20
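
The training script itself is not included in the card. As a hedged sketch of how the hyperparameters above map onto the standard Transformers API, they correspond to a TrainingArguments configuration along these lines (output_dir is a placeholder; any gradient-accumulation, precision, or logging settings used in the actual run are unknown and omitted):

```python
from transformers import TrainingArguments

# Sketch only: restates the listed hyperparameters with standard
# TrainingArguments fields. Unlisted settings are left at their defaults.
training_args = TrainingArguments(
    output_dir="train_codealpacapy_456_1765330670",  # placeholder
    learning_rate=5e-05,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    seed=456,
    optim="adamw_torch",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-08,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    num_train_epochs=20,
)
```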

Training results

| Training Loss | Epoch | Step  | Validation Loss | Input Tokens Seen |
|:-------------:|:-----:|:-----:|:---------------:|:-----------------:|
| 0.7179        | 1.0   | 1908  | 0.6614          | 1246832           |
| 0.6218        | 2.0   | 3816  | 0.5375          | 2497936           |
| 0.5057        | 3.0   | 5724  | 0.5058          | 3743760           |
| 0.4925        | 4.0   | 7632  | 0.4908          | 4991472           |
| 1.4152        | 5.0   | 9540  | 0.4824          | 6239608           |
| 0.4578        | 6.0   | 11448 | 0.4763          | 7485248           |
| 0.6984        | 7.0   | 13356 | 0.4722          | 8733024           |
| 0.668         | 8.0   | 15264 | 0.4691          | 9983720           |
| 0.3939        | 9.0   | 17172 | 0.4667          | 11229792          |
| 0.4199        | 10.0  | 19080 | 0.4649          | 12476552          |
| 0.4601        | 11.0  | 20988 | 0.4637          | 13725560          |
| 0.4708        | 12.0  | 22896 | 0.4627          | 14977976          |
| 0.4481        | 13.0  | 24804 | 0.4620          | 16225896          |
| 0.5275        | 14.0  | 26712 | 0.4614          | 17477224          |
| 0.4814        | 15.0  | 28620 | 0.4610          | 18726216          |
| 0.3421        | 16.0  | 30528 | 0.4605          | 19973408          |
| 0.4392        | 17.0  | 32436 | 0.4605          | 21226656          |
| 0.4993        | 18.0  | 34344 | 0.4605          | 22472696          |
| 0.4855        | 19.0  | 36252 | 0.4604          | 23722376          |
| 0.9468        | 20.0  | 38160 | 0.4603          | 24973864          |
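
Note that the Input Tokens Seen column is cumulative, so each epoch processes roughly 1.25M input tokens. A quick sanity-check sketch, using the cumulative values copied from the table above:

```python
# Cumulative input-token counts from the table above, one entry per epoch.
cumulative = [
    1246832, 2497936, 3743760, 4991472, 6239608,
    7485248, 8733024, 9983720, 11229792, 12476552,
    13725560, 14977976, 16225896, 17477224, 18726216,
    19973408, 21226656, 22472696, 23722376, 24973864,
]

# Per-epoch deltas: each epoch adds roughly 1.25M tokens.
deltas = [b - a for a, b in zip([0] + cumulative, cumulative)]
print(deltas)       # each entry is close to 1.25e6
print(sum(deltas))  # 24973864, matching the final cumulative count
```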

Framework versions

  • PEFT 0.15.2
  • Transformers 4.51.3
  • Pytorch 2.8.0+cu128
  • Datasets 3.6.0
  • Tokenizers 0.21.1