Ministral-3-14B-writer

LoRA fine-tune of Ministral 3 14B for fiction writing.

Training Data

| Dataset   | Samples | Words | Vocabulary |
|-----------|---------|-------|------------|
| Primary   | 68,119  | 808M  | 738K       |
| Secondary | 12,635  | 148M  | 194K       |
| Total     | 80,754  | 956M  | –          |
  • ~15k tokens average per sample
  • Dialogue-heavy prose (~87-89%)
  • Mean sentence length: 14-16 words
  • Mixed first/third person POV
  • No sample packing
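
A quick back-of-envelope check of the ~15k-token average from the table above. The ~0.75 words-per-token ratio is an assumed heuristic for subword tokenizers, not a measurement on this dataset:

```python
# Sanity-check the stated ~15k-token average sample length.
total_words = 956_000_000   # 956M words, from the dataset table
total_samples = 80_754
words_per_token = 0.75      # assumed subword-tokenizer ratio (not measured)

words_per_sample = total_words / total_samples
tokens_per_sample = words_per_sample / words_per_token

print(f"{words_per_sample:,.0f} words/sample")    # ~11,838
print(f"{tokens_per_sample:,.0f} tokens/sample")  # ~15,785, i.e. ~15k
```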

Training

  • 4×H100 80GB
  • LoRA rank 512, alpha 512 (rsLoRA)
  • 16k context
  • BF16 base, FP32 Adam
  • ~34 hours
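
With rank 512 and alpha 512, the rsLoRA variant matters: rank-stabilized LoRA scales the adapter output by alpha/sqrt(r) instead of the classic alpha/r, so the effective scale here is about 22.6 rather than 1.0. A minimal sketch of the two scaling rules:

```python
import math

rank, alpha = 512, 512

standard_scale = alpha / rank        # classic LoRA scaling: 1.0
rs_scale = alpha / math.sqrt(rank)   # rsLoRA scaling: sqrt(512) ~ 22.63

print(standard_scale, round(rs_scale, 2))
```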
| Metric        | Value                    |
|---------------|--------------------------|
| Steps         | 2100 / 2500              |
| Learning rate | 1e-5 → 1e-6 (cosine)     |
| Train loss    | 2.42 → 2.16              |
| Eval loss     | 2.26 → 2.02              |
| Grad norm     | 3.2 avg (clipped at 10)  |

Training was stopped at step 2100: validation loss had plateaued, likely because the cosine schedule decayed the learning rate too aggressively.
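
A sketch of where the schedule stood at the abort point, assuming a standard cosine annealing from 1e-5 to 1e-6 over the full 2500 steps with no warmup (warmup details are not stated above). By step 2100 the learning rate was already near the floor:

```python
import math

lr_max, lr_min, total_steps = 1e-5, 1e-6, 2500

def cosine_lr(step: int) -> float:
    # Standard cosine annealing from lr_max down to lr_min (no warmup assumed).
    progress = step / total_steps
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * progress))

print(f"{cosine_lr(2100):.2e}")  # ~1.56e-06, already close to the 1e-6 floor
```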

See main_dataset_analysis.md and secondary_dataset_analysis.md for detailed statistics.

Trained with ministral3-fsdp-lora-loop. Dataset analysis via dataset-analyzer.
