# hwr_text_ocr_rus

Handwritten word-level OCR (HWR) model for Russian.
This model recognizes cropped text snippets / single words from handwritten notebook images. It is not a full-page OCR system: pair it with a word detector (e.g. kotmayyaka/hwr_text_detection_rus) and feed it tight word crops (or short token crops) with minimal surrounding background.
## What's inside

- Checkpoint:
  - `ocr_model.ckpt`
- Inference helper code:
  - `hwr_ocr.py` — `HWRTextOCR` class (load + preprocess + decode)
  - `inference.py` — CLI example
## Intended use
- ✅ Word-level handwritten recognition (Russian)
- ✅ Small cropped regions of text (one token / short piece)
- ❌ Not a full-page OCR pipeline (you need word/line detection & cropping)
- ❌ Not guaranteed to generalize to very different handwriting styles, paper types, or scanning conditions
## Quickstart (inference)

1) Install dependencies

```bash
pip install torch torchvision pillow
```

2) Run CLI inference

```bash
python inference_ocr.py --image /path/to/word_crop.png --checkpoint ocr_model.ckpt
```
3) Use from Python

```python
from PIL import Image
from hwr_ocr import HWRTextOCR

ocr = HWRTextOCR(checkpoint_path="ocr_model.ckpt", device="cpu")
img = Image.open("word_crop.png").convert("RGB")
text = ocr.predict(img)
print(text)
```
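To recognize many word crops, the per-image call above can be wrapped in a small loop. This is a generic sketch, not part of the shipped API: `recognize_crops` and its `predict` argument are names introduced here for illustration; in practice you would pass `ocr.predict` from the quickstart.

```python
from typing import Callable, Dict, Iterable
from PIL import Image

def recognize_crops(predict: Callable[[Image.Image], str],
                    paths: Iterable[str]) -> Dict[str, str]:
    # Open each crop file, normalize to RGB (matching the quickstart),
    # and collect a mapping of path -> recognized text.
    results = {}
    for path in paths:
        img = Image.open(path).convert("RGB")
        results[path] = predict(img)
    return results
```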
## Input recommendations
- Prefer tight crops around a single word.
- Avoid large margins; background clutter reduces accuracy.
- If you have a full line/page image, run a detector/segmenter first and then recognize each crop.
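The cropping step itself can be as simple as PIL's `Image.crop` with a small padding margin around each detected box. A sketch, assuming you already have word bounding boxes as `(left, top, right, bottom)` pixel tuples (e.g. from a detector such as the one mentioned above); `tight_word_crop` and the `pad` parameter are names introduced here:

```python
from PIL import Image

def tight_word_crop(page: Image.Image, box, pad: int = 4) -> Image.Image:
    # Expand the detector box by a few pixels, clamped to the page bounds,
    # so strokes touching the box edge are not cut off.
    left, top, right, bottom = box
    left = max(0, left - pad)
    top = max(0, top - pad)
    right = min(page.width, right + pad)
    bottom = min(page.height, bottom + pad)
    return page.crop((left, top, right, bottom))
```

Keeping `pad` small preserves the tight-crop recommendation above while avoiding clipped strokes.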
## Output
- The model outputs a single string (recognized word/text snippet).
## Evaluation
Metrics reported in the model card header were obtained on an internal mixed validation split based on:
- ai-forever/school_notebooks_RU
- ai-forever/school_notebooks_EN
## License

- MIT
## Datasets used to train kotmayyaka/hwr_text_ocr_rus

- ai-forever/school_notebooks_RU
- ai-forever/school_notebooks_EN

## Evaluation results

Internal evaluation on a mixed validation split of ai-forever/school_notebooks_RU and ai-forever/school_notebooks_EN:

| Metric | Value |
|---|---|
| Character Error Rate (CER) | 0.049 |
| Word Error Rate (WER) | 0.197 |
| Loss | 1.064 |
| Average Accuracy | 0.815 |
| Fuzzy score | 95.038 |
| Normalized Levenshtein distance | 0.255 |
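For reference, CER and WER are standard edit-distance metrics: Levenshtein distance divided by reference length, over characters and whitespace-split words respectively. The exact evaluation script for this model is not published; a minimal sketch of how such numbers are typically computed:

```python
def levenshtein(a, b) -> int:
    # Classic dynamic-programming edit distance
    # (insertions, deletions, substitutions), row by row.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (x != y)))  # substitution
        prev = cur
    return prev[-1]

def cer(ref: str, hyp: str) -> float:
    # Character Error Rate: edit distance over characters / reference length.
    return levenshtein(ref, hyp) / max(1, len(ref))

def wer(ref: str, hyp: str) -> float:
    # Word Error Rate: edit distance over whitespace-split tokens.
    ref_w, hyp_w = ref.split(), hyp.split()
    return levenshtein(ref_w, hyp_w) / max(1, len(ref_w))
```

Lower is better for both; a CER of 0.049 means roughly one character error per twenty reference characters.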