🇮🇳 AI Hindi to Kurukh (Oraon) Model
Model Description
This is a fine-tuned Google mT5 (Multilingual Text-to-Text Transfer Transformer) model optimized for translating from Hindi to Kurukh (Kurux), a vital language spoken by communities in Jharkhand, Chhattisgarh, and other Indian states.
This model serves as a resource for bridging communication gaps and documenting the Kurukh language using modern AI techniques.
- Developed by: [ankitklakra]
- Model Type: Encoder-Decoder Transformer (mT5-small)
- Primary Direction: Hindi (hi) → Kurukh (kru)
- Fine-tuned from: google/mt5-small
Intended Uses & Limitations
Intended Use
- Communication: Allowing Hindi speakers to translate simple queries into Kurukh.
- Education: Assisting in the creation of educational materials for Kurukh learners.
- Research: Establishing a benchmark for future work in low-resource Indian languages.
Limitations
- Data Scarcity: The model was trained on a relatively small parallel corpus (~1,000 sentences). Accuracy may drop on complex or technical sentences and on vocabulary not seen during training.
- Context: Works best on short, common conversational phrases.
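Because the model works best on short phrases, one possible workaround for longer input is to split it into sentences and translate each one separately. The sketch below is only an illustration, not part of the released model: the helper names `split_hindi_sentences` and `translate_in_chunks` are hypothetical, and the splitting rule (break after a danda, question mark, or exclamation mark) is an assumption that will not suit every text.

```python
import re
from typing import Callable, List

# Assumed sentence terminators for Hindi: danda (।), question mark, exclamation mark.
_SENTENCE_END = re.compile(r"(?<=[।?!])\s+")


def split_hindi_sentences(text: str) -> List[str]:
    """Split text wherever a terminator is followed by whitespace."""
    parts = _SENTENCE_END.split(text.strip())
    return [p.strip() for p in parts if p.strip()]


def translate_in_chunks(text: str, translate_one: Callable[[str], str]) -> List[str]:
    """Translate each sentence on its own.

    `translate_one` is any callable mapping one Hindi sentence to its
    translation (hypothetical hook; supply your own wrapper around the model).
    """
    return [translate_one(sentence) for sentence in split_hindi_sentences(text)]
```

With a `text2text-generation` pipeline object named `translator`, `translate_one` could be `lambda s: translator(s)[0]["generated_text"]`.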
Training Data
The model was trained on a custom-curated parallel corpus containing daily conversation pairs, agricultural terms, and general vocabulary.
- Optimization: Trained with the Adafactor optimizer.
- Training Epochs: 60.
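The two settings above could be expressed with the Hugging Face `Seq2SeqTrainingArguments` class roughly as follows. This is a sketch under stated assumptions, not the author's actual training script: only the optimizer and the epoch count come from this card, while the learning rate, batch size, and output directory are placeholder values.

```python
from typing import Any, Dict

# Hyperparameters: the first two are stated in this card, the rest are assumptions.
TRAINING_HPARAMS: Dict[str, Any] = {
    "optim": "adafactor",               # from the card
    "num_train_epochs": 60,             # from the card
    "learning_rate": 1e-3,              # assumption, not stated in the card
    "per_device_train_batch_size": 8,   # assumption, not stated in the card
    "predict_with_generate": True,      # assumption: decode text during evaluation
}


def build_training_args(output_dir: str = "mt5-small-hi-kru"):
    """Build Seq2SeqTrainingArguments; `output_dir` is a hypothetical path."""
    # Imported lazily so the hyperparameters can be inspected without transformers installed.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(output_dir=output_dir, **TRAINING_HPARAMS)
```

The returned object would then be passed to a `Seq2SeqTrainer` together with the `google/mt5-small` model and the tokenized parallel corpus.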
How to Use
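from transformers import pipeline
# Load the fine-tuned checkpoint as a text2text-generation pipeline
translator = pipeline("text2text-generation", model="ankitklakra/hindi-to-kurukh")
# The pipeline returns a list of dicts; the translation is under "generated_text"
result = translator("तुम्हारा नाम क्या है?")  # Hindi: "What is your name?"
print(result[0]["generated_text"])
# Output: निघै नामे इन्द्रा हिकै?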