---
library_name: transformers
tags:
- LLM
- Multilingual
- Dual Transformer
- Non-English
- Tokenizer
- SUTRA
---

# Model Card for the SUTRA Tokenizer
<!-- Provide a quick summary of what the model is/does. -->

SUTRA is a family of dual-transformer, multilingual large language models; this repository provides the tokenizer those models use, covering 50+ languages.

## Model Details

### Model Description

This is the model card of a 🤗 transformers model that has been pushed to the Hub. This model card has been automatically generated.

- **Developed by:** Two Platforms
- **Model type:** Tokenizer for SUTRA models, a family of dual-transformer-based multilingual LLMs (a minimal usage sketch follows below)
- **Language(s) (NLP):** 50+ languages, including English, Hindi, Gujarati, Bengali, Tamil, Korean, Arabic, Japanese, French, and German
- **License:** Proprietary
- **Paper:** [SUTRA: Scalable Multilingual Language Model Architecture](https://huggingface.co/papers/2405.06694)
- **Demo:** [SUTRA tokenizer comparison](https://huggingface.co/spaces/TWO/sutra-tokenizer-comparison)
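## How to Get Started with the Model

A minimal usage sketch with 🤗 transformers, assuming the tokenizer loads through the standard `AutoTokenizer` API. The repository id passed to `from_pretrained` is a placeholder for this model's actual Hub id, and the sample sentences are illustrative only.

```python
from transformers import AutoTokenizer

# Placeholder repo id: substitute this repository's actual Hub id.
tokenizer = AutoTokenizer.from_pretrained("TWO/<model-id>")

# The SUTRA tokenizer covers 50+ languages; tokenize a few sample sentences.
texts = ["Hello, world!", "नमस्ते दुनिया", "안녕하세요 세계", "مرحبا بالعالم"]
for text in texts:
    token_ids = tokenizer.encode(text)
    print(f"{text!r} -> {len(token_ids)} tokens: {tokenizer.convert_ids_to_tokens(token_ids)}")
```

The per-sentence token counts printed above are what the [tokenizer comparison demo](https://huggingface.co/spaces/TWO/sutra-tokenizer-comparison) visualizes across tokenizers.
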
## Citation

**BibTeX:**

```
@misc{bendale2024sutra,
    author = {Abhijit Bendale and Michael Sapienza and Steven Ripplinger and Simon Gibbs and Jaewon Lee and Pranav Mistry},
    title = {SUTRA: Scalable Multilingual Language Model Architecture},
    howpublished = {arXiv preprint arXiv:2405.06694},
    year = {2024}
}
```