Instructions for using alsubari/aragpt2-mega-pos-msa with libraries and local apps.

Libraries

How to use alsubari/aragpt2-mega-pos-msa with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="alsubari/aragpt2-mega-pos-msa")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("alsubari/aragpt2-mega-pos-msa")
model = AutoModelForCausalLM.from_pretrained("alsubari/aragpt2-mega-pos-msa")

Local Apps

vLLM

How to use alsubari/aragpt2-mega-pos-msa with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "alsubari/aragpt2-mega-pos-msa"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "alsubari/aragpt2-mega-pos-msa",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'
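
The same request can be sent from Python. This is a minimal sketch using only the standard library, assuming the vLLM server started above is listening on localhost:8000; the helper names `build_request` and `complete` are illustrative and not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "alsubari/aragpt2-mega-pos-msa"


def build_request(prompt, base_url="http://localhost:8000", max_tokens=512, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions route."""
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion."""
    with urllib.request.urlopen(build_request(prompt, **kwargs)) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

Calling `complete("Once upon a time,")` requires the server to be running. For a server on another port, such as the SGLang example on port 30000, pass `base_url="http://localhost:30000"`.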

Use Docker

docker model run hf.co/alsubari/aragpt2-mega-pos-msa

SGLang

How to use alsubari/aragpt2-mega-pos-msa with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "alsubari/aragpt2-mega-pos-msa" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "alsubari/aragpt2-mega-pos-msa",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "alsubari/aragpt2-mega-pos-msa" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "alsubari/aragpt2-mega-pos-msa",
		"prompt": "Once upon a time,",
		"max_tokens": 512,
		"temperature": 0.5
	}'

Docker Model Runner
How to use alsubari/aragpt2-mega-pos-msa with Docker Model Runner:
```
docker model run hf.co/alsubari/aragpt2-mega-pos-msa
```

Model Card for alsubari/aragpt2-mega-pos-msa

Model Details

Model Description

Language(s) (NLP): Arabic
Finetuned from model: aragpt2-mega

Uses

Part-of-speech (POS) tagging for Arabic; it may also be usable for other languages.
The model can help students and researchers of Arabic, since it provides sentence analysis (اعراب الجملة, grammatical parsing of the sentence) in context.
Arabic word tokenization.
It may also be usable for translating Arabic dialects to MSA.

Main Labels

{
    'حرف جر': 'preposition',
    'اسم': 'noun',
    'اسم علم': 'proper noun',
    'لام التعريف': 'determiner',
    'صفة': 'adjective',
    'ضمير': 'personal pronoun',
    'فعل': 'verb',
    'حرف عطف': 'conjunction',
    'اسم موصول': 'relative pronoun',
    'حرف نفي': 'negative particle',
    'حروف مقطعة': 'quranic initials',
    'اسم اشارة': 'demonstrative pronoun',
    'حرف استئنافية': 'resumption',
    'حرف نصب': 'accusative particle',
    'حرف تسوية': 'equalization particle',
    'حرف حال': 'circumstantial particle',
    'أداة حصر': 'restriction particle',
    'ظرف زمان': 'time adverb',
    'حرف نهي': 'prohibition particle',
    'حرف كاف': 'preventive particle',
    'حرف ابتداء': 'inceptive particle',
    'حرف زائد': 'supplemental particle',
    'حرف استدراك': 'amendment particle',
    'حرف مصدري': 'subordinating conjunction',
    'حرف استفهام': 'interrogative particle',
    'ظرف مكان': 'location adverb',
    'حرف شرط': 'conditional particle',
    'لام التوكيد': 'emphatic',
    'حرف نداء': 'vocative particle',
    'حرف واقع في جواب الشرط': 'result particle',
    'حرف تفصيل': 'explanation particle',
    'أداة استثناء': 'exceptive particle',
    'حرف سببية': 'particle of cause',
    'التوكيد - النون الثقيلة': 'heavy noon emphasis',
    'حرف استقبال': 'future particle',
    'حرف تحقيق': 'particle of certainty',
    'لام التعليل': 'purpose',
    'حرف جواب': 'answer particle',
    'حرف اضراب': 'retraction particle',
    'حرف تحضيض': 'exhortation particle',
    'حرف تفسير': 'particle of interpretation',
    'لام الامر': 'imperative',
    'واو المعية': 'comitative particle',
    'حرف فجاءة': 'surprise particle',
    'حرف ردع': 'aversion particle',
    'اسم فعل أمر': 'imperative verbal noun',
}
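
The mapping can be used to translate the Arabic tags the model emits into English. Below is a minimal sketch with only four of the entries copied from the mapping; the names `MAIN_LABELS` and `to_english` are illustrative, and unknown tags are passed through unchanged as an assumption about what is convenient.

```python
# Subset of the label mapping above; extend with the remaining entries as needed.
MAIN_LABELS = {
    "حرف جر": "preposition",
    "اسم": "noun",
    "ضمير": "personal pronoun",
    "فعل": "verb",
}


def to_english(label):
    """Return the English name for an Arabic tag, or the tag itself if unknown."""
    cleaned = label.strip()
    return MAIN_LABELS.get(cleaned, cleaned)


print(to_english(" فعل "))
```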

How to Get Started with the Model

import re

from pyarabic.araby import strip_diacritics, strip_tatweel
from transformers import GPT2Tokenizer, pipeline
# GPT-2 implementation shipped with the arabert package
from arabert.aragpt2.grover.modeling_gpt2 import GPT2LMHeadModel

model_name = 'alsubari/aragpt2-mega-pos-msa'

tokenizer = GPT2Tokenizer.from_pretrained(model_name)
model = GPT2LMHeadModel.from_pretrained(model_name).to("cuda")

generator = pipeline("text-generation", model=model, tokenizer=tokenizer, device=0)


def generate(text):
    """Tag `text` and return everything the model produced after 'Answer:'."""
    prompt = f'<|startoftext|>Instruction: {text}<|pad|>Answer:'
    pred_text = generator(
        prompt,
        pad_token_id=tokenizer.eos_token_id,
        num_beams=20,
        max_length=256,
        # min_length=200,
        do_sample=False,  # beam search; top_p and top_k only take effect when sampling
        top_p=0.5,
        top_k=1,
        repetition_penalty=3.0,
        # temperature=0.8,
        no_repeat_ngram_size=3,
    )[0]['generated_text']
    # Keep only the text after "Answer:"; fall back to "None" if the marker is missing
    matches = re.findall("Answer:(.*)", pred_text, re.S)
    return matches[-1] if matches else "None"


# Strip diacritics and tatweel before tagging
text = 'تعلَّمْ من أخطائِكَ'
print(generate(strip_tatweel(strip_diacritics(text))))
# Example output:
# ' تعلم ( تعلم : فعل ) من ( من : حرف جر ) أخطائك ( اخطاء : اسم ، ك : ضمير )'
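
The returned string follows a regular pattern: each word is followed by a parenthesised list of `segment : tag` pairs separated by the Arabic comma. The sketch below turns that string into a structured list. It assumes the output always has the shape of the example above; the names `WORD_PATTERN` and `parse_tagged` are illustrative, and malformed output may need extra handling.

```python
import re

# One word followed by its analysis: "word ( segment : tag ، segment : tag )"
WORD_PATTERN = re.compile(r"([^\s()]+)\s*\(([^()]*)\)")


def parse_tagged(output):
    """Return [(word, [(segment, tag), ...]), ...] from the tagger's output string."""
    parsed = []
    for word, analysis in WORD_PATTERN.findall(output):
        segments = []
        for part in analysis.split("،"):  # the Arabic comma separates segments
            segment, sep, tag = part.partition(":")
            if sep:  # skip fragments that are not a "segment : tag" pair
                segments.append((segment.strip(), tag.strip()))
        parsed.append((word, segments))
    return parsed


sample = " تعلم ( تعلم : فعل ) من ( من : حرف جر ) أخطائك ( اخطاء : اسم ، ك : ضمير )"
for word, segments in parse_tagged(sample):
    print(word, segments)
```

Each word maps to one or more segments, so a word with an attached pronoun, such as the last one in the sample, yields two `(segment, tag)` pairs.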

Results

Epoch: 1 | Training Loss: 0.108500 | Validation Loss: 0.082612
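
For a rough sense of scale, the losses can be converted to perplexity. This sketch assumes the reported values are mean per-token cross-entropy in nats, which the card does not state; the function name `perplexity` is illustrative.

```python
import math


def perplexity(mean_cross_entropy):
    """Perplexity is the exponential of the mean cross-entropy (in nats)."""
    return math.exp(mean_cross_entropy)


train_ppl = perplexity(0.108500)
val_ppl = perplexity(0.082612)
print(f"training perplexity: {train_ppl:.4f}, validation perplexity: {val_ppl:.4f}")
```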
