---
library_name: peft
license: apache-2.0
base_model: unsloth/SmolLM2-135M-Instruct
tags:
- unsloth
- trl
- sft
- generated_from_trainer
model-index:
- name: SmolLM2-135M-Instruct-TaiwanChat
results: []
---
[<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="200" height="32"/>](https://wandb.ai/pesi/SmolLM2-135M-Instruct-TaiwanChat_CLOUD/runs/9fnxruem)
# SmolLM2-135M-Instruct-TaiwanChat
This model is a PEFT adapter fine-tuned from [unsloth/SmolLM2-135M-Instruct](https://huggingface.co/unsloth/SmolLM2-135M-Instruct). The training dataset is not recorded in the trainer metadata.
## Model description
More information needed
## Intended uses & limitations
More information needed
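As a minimal usage sketch, the adapter can be loaded on top of the base model with `peft`. The adapter repository id below is a placeholder (this card does not state the final repo id), and the prompt is an arbitrary example.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "unsloth/SmolLM2-135M-Instruct"
# Placeholder: replace with the actual repository id of this adapter.
adapter_id = "<your-username>/SmolLM2-135M-Instruct-TaiwanChat"

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)
model = PeftModel.from_pretrained(model, adapter_id)  # attach the PEFT adapter
model.eval()

# Format the conversation with the model's chat template.
messages = [{"role": "user", "content": "請用繁體中文介紹一下你自己。"}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```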
## Training and evaluation data
More information needed
## Training procedure
### Training hyperparameters
The following hyperparameters were used during training (see the configuration sketch after this list):
- learning_rate: 5e-05
- train_batch_size: 1
- eval_batch_size: 1
- seed: 3407
- gradient_accumulation_steps: 4
- total_train_batch_size: 4
- optimizer: adamw_8bit (betas=(0.9, 0.999), epsilon=1e-08; no additional optimizer arguments)
- lr_scheduler_type: linear
- lr_scheduler_warmup_ratio: 0.1
- lr_scheduler_warmup_steps: 10
- training_steps: 60
- mixed_precision_training: Native AMP
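As noted above, these values map onto a `trl` SFT run roughly like the sketch below. This is a hedged reconstruction, not the original script: the run was performed with Unsloth (per the tags), while the sketch uses plain `transformers`/`peft`/`trl` for illustration, and the dataset, PEFT settings, and output directory are placeholders not recorded in this card.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

base_id = "unsloth/SmolLM2-135M-Instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(base_id)

# Placeholder dataset: the training data is not recorded in this card.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

# Placeholder PEFT settings: rank/alpha/target modules are not recorded here.
peft_config = LoraConfig(r=16, lora_alpha=16, task_type="CAUSAL_LM")

args = SFTConfig(
    output_dir="SmolLM2-135M-Instruct-TaiwanChat",
    learning_rate=5e-5,
    per_device_train_batch_size=1,
    per_device_eval_batch_size=1,
    gradient_accumulation_steps=4,   # total train batch size 4
    max_steps=60,                    # training_steps: 60
    lr_scheduler_type="linear",
    warmup_ratio=0.1,
    warmup_steps=10,
    optim="adamw_8bit",              # requires bitsandbytes
    seed=3407,
    fp16=True,                       # "Native AMP" mixed precision
    logging_steps=1,
)

trainer = SFTTrainer(
    model=model,
    args=args,
    train_dataset=dataset,
    processing_class=tokenizer,      # named `tokenizer` in older trl releases
    peft_config=peft_config,
)
trainer.train()
```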
### Framework versions
- PEFT 0.14.0
- Transformers 4.47.1
- Pytorch 2.5.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0