Qwen2.5-3B-Instruct Fine-tuned on Urdu GSM8K


Model Description

This model is a fine-tuned version of Qwen/Qwen2.5-3B-Instruct, trained specifically on Urdu mathematical reasoning tasks. It was trained on the PuristanLabs1/GSM8K_Urdu dataset, enabling it to solve grade-school math problems with step-by-step reasoning in Urdu (اردو).

  • Base Model: Qwen2.5-3B-Instruct
  • Fine-tuning Method: QLoRA (4-bit quantization)
  • Dataset: PuristanLabs1/GSM8K_Urdu (~6,365 examples)
  • Training Duration: 3 epochs, 1,074 steps
  • Language: Urdu (اردو)
  • Task: Mathematical reasoning and problem solving

Training Details

Training Configuration

Parameter               Value
---------------------   --------------------
Base Model              Qwen2.5-3B-Instruct
Training Method         QLoRA (4-bit)
LoRA Rank (r)           16
LoRA Alpha              16
LoRA Dropout            0.0
Learning Rate           2e-4
Scheduler               Cosine with warmup
Warmup Ratio            0.1
Batch Size              2 per device
Gradient Accumulation   8 steps
Effective Batch Size    16
Max Sequence Length     1024 tokens
Optimizer               AdamW 8-bit
Training Epochs         3
Total Steps             1,074
Training Time           ~15 hours
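
The card does not include the training script, but the configuration above maps naturally onto an Unsloth + TRL run. The sketch below is illustrative only: the dataset field names (question/answer), the LoRA target modules, and the exact SFTTrainer keyword names (which vary across trl versions) are assumptions, not confirmed details.

from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the base model in 4-bit, matching the QLoRA setup above
model, tokenizer = FastLanguageModel.from_pretrained(
    "Qwen/Qwen2.5-3B-Instruct",
    max_seq_length=1024,
    load_in_4bit=True,
)

# Attach LoRA adapters with the rank/alpha/dropout from the table
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    lora_dropout=0.0,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],  # assumed set
)

# Format each example as a ChatML conversation (field names assumed)
def to_text(ex):
    msgs = [{"role": "user", "content": ex["question"]},
            {"role": "assistant", "content": ex["answer"]}]
    return {"text": tokenizer.apply_chat_template(msgs, tokenize=False)}

train_ds = load_dataset("PuristanLabs1/GSM8K_Urdu", split="train").map(to_text)

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=train_ds,
    dataset_text_field="text",
    max_seq_length=1024,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=8,   # effective batch size 16
        num_train_epochs=3,
        learning_rate=2e-4,
        lr_scheduler_type="cosine",
        warmup_ratio=0.1,
        optim="adamw_8bit",
        output_dir="outputs",
    ),
)
trainer.train()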

Training Metrics

Metric            Initial   Final    Improvement
---------------   -------   ------   -----------
Training Loss     0.8758    0.5461   ↓ 37.6%
Validation Loss   0.8272    0.5502   ↓ 33.5%

The best validation loss (0.5502) was reached at step 400.

Dataset Statistics

  • Total Examples: 6,365
  • Training Set: 5,728 examples (90%)
  • Validation Set: 637 examples (10%)
  • Average Question Length: 242 characters
  • Average Reasoning Length: 265 characters
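
These statistics can be approximately reproduced from the dataset itself. In the sketch below, the split name, the split seed, and the field names "question"/"answer" are assumptions:

from datasets import load_dataset

ds = load_dataset("PuristanLabs1/GSM8K_Urdu", split="train")

# 90/10 split (the actual seed is not documented; 42 is an assumption)
split = ds.train_test_split(test_size=0.1, seed=42)
print(len(split["train"]), len(split["test"]))   # expected ~5,728 / ~637

# Average lengths in characters (field names assumed)
avg_q = sum(len(x["question"]) for x in ds) / len(ds)
avg_a = sum(len(x["answer"]) for x in ds) / len(ds)
print(round(avg_q), round(avg_a))                # expected ~242 / ~265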

Trainable Parameters

  • Total Parameters: 3,115,872,256
  • Trainable Parameters: 29,933,568 (0.96%)
  • Training Method: Parameter-efficient fine-tuning with LoRA
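
The percentage above follows from a standard PyTorch parameter count over the LoRA-wrapped model (during training, before the adapters are frozen for inference):

# Count trainable vs. total parameters on the LoRA-wrapped model
def count_params(model):
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    print(f"trainable: {trainable:,} / {total:,} "
          f"({100 * trainable / total:.2f}%)")

count_params(model)  # expected: 29,933,568 / 3,115,872,256 (0.96%)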

Usage

Installation

pip install unsloth transformers accelerate

Basic Usage

from unsloth import FastLanguageModel

# Load model
model, tokenizer = FastLanguageModel.from_pretrained(
    "PuristanLabs1/qwen2.5-3B-GSM8K-urdu",
    max_seq_length=1024,
    load_in_4bit=True,
)

# Enable inference mode
FastLanguageModel.for_inference(model)

# Prepare your question ("Ahmed has 15 apples. He wants to divide them
# equally among his 3 friends. How many apples will each friend get?")
question = "احمد کے پاس 15 سیب ہیں۔ وہ اپنے 3 دوستوں میں برابر تقسیم کرنا چاہتا ہے۔ ہر دوست کو کتنے سیب ملیں گے؟"

# Format the ChatML prompt (system message: "You are a math expert who
# solves problems in Urdu. Solve each problem step by step.")
prompt = f"""<|im_start|>system
آپ ایک ریاضی کے ماہر ہیں جو اردو میں مسائل حل کرتے ہیں۔ ہر مسئلے کو قدم بہ قدم حل کریں۔<|im_end|>
<|im_start|>user
{question}<|im_end|>
<|im_start|>assistant
"""

# Generate response (temperature/top_p only take effect when sampling)
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    eos_token_id=tokenizer.encode("<|im_end|>", add_special_tokens=False)[0],
)

# Extract answer
response = tokenizer.decode(outputs[0], skip_special_tokens=False)
answer = response.split("<|im_start|>assistant")[-1].split("<|im_end|>")[0].strip()
print(answer)

Output:

ہر دوست کو 15/3=<<15/3=5>>5 سیب ملتے ہیں۔
(Translation: Each friend gets 15/3 = 5 apples.)
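
Instead of assembling the ChatML prompt by hand, the same string can be built with the tokenizer's chat template (Qwen2.5 tokenizers ship one). Continuing from the snippet above:

# Equivalent prompt via the tokenizer's built-in chat template
messages = [
    {"role": "system", "content": "آپ ایک ریاضی کے ماہر ہیں جو اردو میں مسائل حل کرتے ہیں۔ ہر مسئلے کو قدم بہ قدم حل کریں۔"},
    {"role": "user", "content": question},
]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)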

Advanced Usage with Custom Parameters

# For more creative responses
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.8,
    top_p=0.95,
    do_sample=True,
    repetition_penalty=1.1,
)

# For more focused, lower-variance responses (still sampled; see below)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.3,
    top_p=0.9,
    do_sample=True,
)
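
Note that with do_sample=True the second variant is still stochastic. For fully repeatable output, disable sampling entirely:

# Fully deterministic: greedy decoding (temperature/top_p are ignored)
outputs = model.generate(
    **inputs,
    max_new_tokens=512,
    do_sample=False,
)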

Example Outputs

Example 1: Simple Division

Question: احمد کے پاس 15 سیب ہیں۔ وہ اپنے 3 دوستوں میں برابر تقسیم کرنا چاہتا ہے۔ ہر دوست کو کتنے سیب ملیں گے؟
(Translation: Ahmed has 15 apples. He wants to divide them equally among his 3 friends. How many apples will each friend get?)

Model Output:

ہر دوست کو 15/3=<<15/3=5>>5 سیب ملتے ہیں۔
(Translation: Each friend gets 15/3 = 5 apples.)

Example 2: Multi-step Problem

Question: ایک دکان میں 5 قلم 125 روپے میں ملتے ہیں۔ اگر میں 12 قلم خریدوں تو کتنے روپے خرچ ہوں گے؟
(Translation: In a shop, 5 pens cost 125 rupees. If I buy 12 pens, how many rupees will I spend?)

Model Output:

ایک قلم کی قیمت 125/5 = <<125/5=25>>25 ڈالر ہے۔
لہذا، 12 قلم کی قیمت 25*12 = <<25*12=300>>300 ڈالر ہے۔
(Translation: One pen costs 125/5 = 25 dollars. Therefore, 12 pens cost 25*12 = 300 dollars. Note the "ڈالر"/dollar slip despite the rupee question; see Known Limitations.)

Example 3: Complex Word Problem

Question: احمد کے پاس 50 روپے تھے۔ اس نے 15 روپے کی کتاب اور 12 روپے کا قلم خریدا۔ پھر اس کے والد نے اسے 30 روپے دیے۔ اب احمد کے پاس کتنے روپے ہیں؟
(Translation: Ahmed had 50 rupees. He bought a book for 15 rupees and a pen for 12 rupees. Then his father gave him 30 rupees. How many rupees does Ahmed have now?)

Model Output:

اُس نے کتابوں اور قلم پر $15 + $12 = $<<15+12=27>>27 خرچ کیے۔
اُس کے پاس $50 - $27 = $<<50-27=23>>23 باقی بچے۔
اُس کے والد نے اُسے $30 مزید ملے، تو اب اُس کے پاس $23 + $30 = $<<23+30=53>>53 ہیں۔
(Translation: He spent $15 + $12 = $27 on the book and pen. He had $50 - $27 = $23 left. His father gave him $30 more, so he now has $23 + $30 = $53. Again, note the dollar signs despite the rupee question.)

Performance

Accuracy on Tested Examples

  • Mathematical Correctness: 100% on the examples tested
  • Step-by-step Reasoning: Excellent
  • Urdu Fluency: Very Good
  • Multi-step Problems: Handled well

Strengths

  • Accurate Calculations: performs arithmetic operations correctly
  • Step-by-step Reasoning: shows work using the <<calculation>> format (a verification sketch follows this list)
  • Multi-step Problems: handles complex word problems with multiple operations
  • Urdu Fluency: generates natural Urdu text
  • Consistent Format: follows the GSM8K-style reasoning format
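
Because answers carry GSM8K-style <<expression=result>> annotations, each arithmetic step can be checked mechanically. A minimal verifier sketch, illustrative only (the restricted eval handles basic arithmetic expressions):

import re

CALC = re.compile(r"<<([^<>=]+)=([^<>]+)>>")

def verify_steps(answer: str) -> bool:
    """Check every <<expr=result>> annotation in a model answer."""
    ok = True
    for expr, result in CALC.findall(answer):
        try:
            # Restricted eval: arithmetic only, no builtins or names
            value = eval(expr, {"__builtins__": {}}, {})
            if abs(value - float(result)) > 1e-6:
                print(f"Mismatch: {expr} = {value}, claimed {result}")
                ok = False
        except Exception:
            print(f"Could not evaluate: {expr}")
            ok = False
    return ok

print(verify_steps("ہر دوست کو 15/3=<<15/3=5>>5 سیب ملتے ہیں۔"))  # True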

Known Limitations

Currency Symbol Inconsistency

The model sometimes uses "$" or "ڈالر" (dollar) instead of "روپے" (rupees) in responses, even when the question uses "روپے". This is an artifact of the original GSM8K dataset, which uses dollars.

Impact: This does not affect mathematical accuracy, only the currency symbol used in the output.

Planned Fix: This will be addressed in the next version. In the meantime, a simple post-processing workaround is sketched below.
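
This is a workaround sketch, not part of the model: a lightweight pass that rewrites dollar amounts and "ڈالر" to "روپے" in generated text.

import re

def normalize_currency(text: str) -> str:
    """Rewrite $-amounts and "ڈالر" (dollar) to "روپے" (rupees)."""
    text = re.sub(r"\$\s*(\d+(?:[.,]\d+)?)", r"\1 روپے", text)
    return text.replace("ڈالر", "روپے")

print(normalize_currency("لہذا، 12 قلم کی قیمت 25*12 = 300 ڈالر ہے۔"))
# لہذا، 12 قلم کی قیمت 25*12 = 300 روپے ہے۔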

Real-world Constraints

The model may not always recognize practical constraints (e.g., calculating 7.5 students per group when dividing 45 students into 6 groups). It provides mathematically correct answers but may not account for real-world impossibilities.

Other Limitations

  • Trained on grade-school level math (GSM8K difficulty)
  • May struggle with very advanced mathematical concepts
  • Limited to problems that can be solved with basic arithmetic
  • Best performance on problems similar to training data

Intended Use

Primary Use Cases

✅ Educational tools for Urdu-speaking students
✅ Math tutoring applications
✅ Automated homework assistance
✅ Mathematical reasoning research
✅ Urdu NLP benchmarking

Out of Scope

❌ Advanced mathematics (calculus, linear algebra, etc.)
❌ Financial calculations requiring precision
❌ Real-time production systems without validation
❌ Medical or safety-critical applications

Ethical Considerations

  • Educational Aid: This model is designed to assist learning, not replace teachers
  • Verification Required: Always verify model outputs, especially in educational settings
  • Language Preservation: Contributes to Urdu language technology development
  • Accessibility: Makes mathematical reasoning tools available in Urdu

Future Improvements

The following improvements are planned for v2:

  1. Currency Symbol Fix - Replace "$" with "روپے" in outputs
  2. Extended Training - More epochs for better convergence
  3. Larger Dataset - Include more diverse Urdu math problems
  4. Real-world Constraints - Add training data for practical limitations
  5. Advanced Math - Expand to higher-level mathematical concepts

Model Card Authors

PuristanLabs

Citation

If you use this model in your research or applications, please cite:

@misc{qwen25-3b-gsm8k-urdu-2025,
  author = {PuristanLabs},
  title = {Qwen2.5-3B-Instruct Fine-tuned on Urdu GSM8K},
  year = {2025},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/PuristanLabs1/qwen2.5-3B-GSM8K-urdu}},
}

Acknowledgments

  • Base Model: Qwen Team for Qwen2.5-3B-Instruct
  • Dataset: GSM8K by OpenAI (Urdu version: PuristanLabs1/GSM8K_Urdu)

License

This model is released under the Apache 2.0 License, consistent with the base Qwen2.5-3B-Instruct model.

Contact

For questions, issues, or collaborations:


Made with ❤️ for the Urdu-speaking community
