🇮🇳 AI Hindi to Kurukh (Oraon) Model

Model Description

This model is a fine-tuned version of Google's mT5 (Multilingual Text-to-Text Transfer Transformer) for translating Hindi into Kurukh (also written Kurux), a Dravidian language spoken by Oraon communities in Jharkhand, Chhattisgarh, and other Indian states.

This model serves as a resource for bridging communication gaps and documenting the Kurukh language using modern AI techniques.

  • Developed by: ankitklakra
  • Model Type: Encoder-Decoder Transformer (mT5-small)
  • Primary Direction: Hindi (hi) → Kurukh (kru)
  • Fine-tuned from: google/mt5-small

Intended Uses & Limitations

Intended Use

  • Communication: Allowing Hindi speakers to translate simple queries into Kurukh.
  • Education: Assisting in the creation of educational materials for Kurukh learners.
  • Research: Establishing a benchmark for future work in low-resource Indian languages.

Limitations

  • Data Scarcity: The model was trained on a small parallel corpus (~1,000 sentence pairs). Expect lower accuracy on complex sentences, vocabulary not seen during training, and technical terminology.
  • Context: Works best on short, common conversational phrases.

Training Data

The model was trained on a custom-curated parallel corpus containing daily conversation pairs, agricultural terms, and general vocabulary.

  • Optimization: Trained with the Adafactor optimizer.
  • Training Epochs: 60.
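
The card states only the base model, the optimizer, and the epoch count. As a rough illustration of how those three values could map onto the Hugging Face Trainer API, here is a minimal sketch. It is not the author's training script: the output directory name is hypothetical, and the learning rate, batch size, and data preprocessing are not stated in the card, so they are left at library defaults.

```python
from typing import Any, Dict

# Values taken from the model card.
BASE_MODEL = "google/mt5-small"
NUM_EPOCHS = 60
OPTIMIZER = "adafactor"


def training_kwargs(output_dir: str = "mt5-small-hi-kru") -> Dict[str, Any]:
    """Collect the settings the card states; output_dir is a hypothetical name."""
    return {
        "output_dir": output_dir,
        "num_train_epochs": NUM_EPOCHS,
        "optim": OPTIMIZER,             # selects Adafactor inside the Trainer
        "predict_with_generate": True,  # assumption: evaluate by generating text
    }


def build_training_args(output_dir: str = "mt5-small-hi-kru"):
    """Turn the settings into Seq2SeqTrainingArguments (requires transformers)."""
    # Imported lazily so the plain settings above can be inspected without it.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(**training_kwargs(output_dir))
```

With only about 1,000 sentence pairs, 60 epochs means the model sees each pair many times, so holding out a small validation split to watch for overfitting would be a sensible addition to any such setup.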

How to Use

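from transformers import pipeline

# Load the fine-tuned model from the Hugging Face Hub.
translator = pipeline("text2text-generation", model="ankitklakra/hindi-to-kurukh")

# The pipeline returns a list of dicts; the translation is under "generated_text".
result = translator("तुम्हारा नाम क्या है?")  # "What is your name?"
print(result[0]["generated_text"])
# Output: निघै नामे इन्द्रा हिकै?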
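
For translating several sentences at once, a small wrapper around the pipeline can return just the translated strings. The sketch below is one possible shape, not part of the published model: `load_translator` and `translate_batch` are hypothetical helper names, and it assumes the pipeline returns one `{"generated_text": ...}` dict per input sentence when called with a list of strings.

```python
from typing import Callable, List, Optional

MODEL_ID = "ankitklakra/hindi-to-kurukh"  # model id from this card


def load_translator() -> Callable:
    """Build the text2text-generation pipeline (downloads the model on first use)."""
    # Imported lazily so translate_batch can be tried with a stand-in callable.
    from transformers import pipeline

    return pipeline("text2text-generation", model=MODEL_ID)


def translate_batch(
    sentences: List[str], translator: Optional[Callable] = None
) -> List[str]:
    """Translate Hindi sentences and return only the generated strings."""
    if not sentences:
        return []
    if translator is None:
        translator = load_translator()
    # Assumed output shape: one {"generated_text": ...} dict per input sentence.
    outputs = translator(sentences)
    return [item["generated_text"].strip() for item in outputs]
```

Calling `translate_batch(["तुम्हारा नाम क्या है?"])` loads the model and returns a one-element list. Because the model works best on short conversational phrases, splitting longer text into individual sentences before passing it in is likely to give better results.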