malkhuzanie/arabic-punctuation-checkpoints


malkhuzanie committed on Jan 21

Commit 99e397f · verified · 1 Parent(s): 80d3e16

Create README.md

Files changed (1)

README.md +66 -0

README.md ADDED

	@@ -0,0 +1,66 @@

+---
+language:
+- ar
+tags:
+- punctuation-restoration
+- arabic
+- pytorch
+- bilstm
+- text-processing
+pipeline_tag: text-classification
+widget:
+- text: "هل تساءلت يوما عن معنى الحياة ما هي الأسئلة التي تشغل بالك"
+  example_title: "Question Example"
+- text: "الطقس جميل اليوم لا اعتقد انها ستمطر"
+  example_title: "Statement Example"
+---
+# Arabic Punctuation Restoration Model (BiLSTM)
+This **Bidirectional LSTM (BiLSTM)** model restores punctuation in raw Arabic text: it takes unpunctuated text as input and returns the same text with the appropriate punctuation marks inserted.
+## Model Details
+- **Architecture:** BiLSTM (2 Layers, Hidden Dim 256)
+- **Embeddings:** AraVec (Twitter-CBOW 300d)
+- **Vocabulary Size:** ~50k words
+- **Input:** Raw Arabic text (with or without diacritics)
+- **Output:** Text with restored punctuation marks
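For a rough sense of model size, the figures above can be turned into a parameter count. This is a back-of-the-envelope sketch under several assumptions that this README does not confirm: that 256 is the hidden size per direction, that the layers follow the PyTorch `nn.LSTM` layout (four gates, two bias vectors per direction), that the vocabulary is exactly 50,000 words, and that a single linear layer maps the BiLSTM output to the 7 classes in the punctuation table. All names in the block are illustrative.

```python
# Rough parameter count; every constant below is an assumption based on
# the "Model Details" list, not a value read from the checkpoint.
EMBED_DIM = 300       # AraVec embedding size
HIDDEN_DIM = 256      # assumed to be per direction
VOCAB_SIZE = 50_000   # README says "~50k"
NUM_CLASSES = 7       # label IDs 0..6


def lstm_direction_params(input_dim, hidden_dim):
    """Parameters of one LSTM direction in the nn.LSTM layout:
    4 gates, each with input weights, recurrent weights and two biases."""
    return 4 * (input_dim * hidden_dim + hidden_dim * hidden_dim + 2 * hidden_dim)


embedding = VOCAB_SIZE * EMBED_DIM
# Layer 1 reads embeddings; layer 2 reads the concatenated forward/backward states.
layer1 = 2 * lstm_direction_params(EMBED_DIM, HIDDEN_DIM)
layer2 = 2 * lstm_direction_params(2 * HIDDEN_DIM, HIDDEN_DIM)
# Assumed linear classifier over the concatenated states (weights + bias).
classifier = 2 * HIDDEN_DIM * NUM_CLASSES + NUM_CLASSES

total = embedding + layer1 + layer2 + classifier
print(f"embedding={embedding:,} lstm={layer1 + layer2:,} "
      f"classifier={classifier:,} total={total:,}")
```

Under these assumptions the embedding table accounts for about 15M of roughly 17.7M parameters, so the vocabulary size matters far more for checkpoint size than the BiLSTM itself.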
+## Supported Punctuation Marks
+The model predicts the following punctuation marks:
+| ID | Mark | Name |
+|---|---|---|
+| 0 | (None) | No Punctuation |
+| 1 | **؟** | Arabic Question Mark |
+| 2 | **،** | Arabic Comma |
+| 3 | **:** | Colon |
+| 4 | **؛** | Arabic Semicolon |
+| 5 | **!** | Exclamation Mark |
+| 6 | **.** | Period / Full Stop |
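The table above amounts to a lookup from class ID to mark. As a minimal sketch, assuming the model emits one class ID per input word and that each mark attaches directly after its word (neither is stated in this README), predictions could be decoded as follows. `ID_TO_MARK` and `apply_punctuation` are hypothetical names, not part of `inference.py`, and the labels in the example are made up for illustration rather than produced by the model.

```python
# Hypothetical decoding helper; not taken from inference.py.
ID_TO_MARK = {
    0: "",    # no punctuation
    1: "؟",   # Arabic question mark (assumed, matching the README's example output)
    2: "،",   # Arabic comma
    3: ":",   # colon
    4: "؛",   # Arabic semicolon
    5: "!",   # exclamation mark
    6: ".",   # period / full stop
}


def apply_punctuation(words, label_ids):
    """Append the mark for each label to its word and rejoin with spaces."""
    if len(words) != len(label_ids):
        raise ValueError("expected exactly one label per word")
    return " ".join(word + ID_TO_MARK[label] for word, label in zip(words, label_ids))


# Made-up labels for the "Statement Example" widget text.
words = "الطقس جميل اليوم لا اعتقد انها ستمطر".split()
labels = [0, 0, 2, 0, 0, 0, 6]
print(apply_punctuation(words, labels))
```

Keeping the mapping in a single dictionary means the decoder stays in sync with the table if a class is added or reordered.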
+## How to Use
+Because this is a custom PyTorch model, you must load both the model architecture and its vocabulary before running inference.
+### Method 1: Using the Inference Script (Recommended)
+Download the `inference.py` file from this repository to use the model easily.
+```python
+from huggingface_hub import hf_hub_download
+import importlib.util
+# 1. Download the script
+script_path = hf_hub_download(repo_id="malkhuzanie/arabic-punctuation-checkpoints", filename="inference.py")
+# 2. Load the script
+spec = importlib.util.spec_from_file_location("inference", script_path)
+inference = importlib.util.module_from_spec(spec)
+spec.loader.exec_module(inference)
+# 3. Initialize and Predict
+model = inference.PunctuationRestorer()
+text = "هل تساءلت يوما عن معنى الحياة ما هي الأسئلة التي تشغل بالك"
+print(model.predict(text))
+# Output: هل تساءلت يوماً عن معنى الحياة؟ ما هي الأسئلة التي تشغل بالك؟