license: mit
language:
  - en
tags:
  - computer-vision
  - object-detection
  - yolov8
  - document-analysis
  - historical-documents
  - heritage-ai
  - pytorch
  - ultralytics
pipeline_tag: object-detection

🏛️ YOLOv8 — Historical Document Ornament Detector

Automatic detection of typographic ornaments in 16th–18th century printed documents. Developed as part of the TypoRef project at PolyTech Tours.

📌 Model Summary

| Property | Details |
|---|---|
| 🏗️ Architecture | YOLOv8 (Ultralytics) |
| 🎯 Task | Object Detection |
| 📊 mAP@50 | 95% |
| 🗂️ Dataset | 50+ expert-annotated historical document pages |
| 📅 Document period | 16th–18th century printed books |
| ⚙️ Framework | PyTorch + Ultralytics |
| 📉 Processing speedup | 20% faster than manual workflow |
| 📜 License | MIT |

🧠 What This Model Does

This model detects and localizes typographic ornaments and decorative graphic elements in scanned pages of early modern European printed books.

It was built to replace a slow, fully manual cataloguing process for the TypoRef digital humanities project, enabling automated analysis of thousands of document pages that would otherwise require extensive expert annotation.

Detected classes: typographic ornaments, decorative initials, vignettes, and other graphic elements typical of 16th–18th century printing.

📈 Performance

| Metric | Value |
|---|---|
| mAP@50 | 95% |
| Training duration | 6 months of iterative refinement |
| Annotations integrated | 50+ pages in 2 months |
| Processing time reduction | 20% vs previous pipeline |
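
For readers unfamiliar with the metric: mAP@50 treats a predicted box as correct when its intersection-over-union (IoU) with a ground-truth box of the same class is at least 0.5. The sketch below illustrates only that matching criterion. It is not the evaluation code used for this model, and the function names and box coordinates are invented for illustration.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlapping rectangle (zero if the boxes are disjoint)
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    # Union = sum of both areas minus the overlap counted twice
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


def is_match_at_50(pred_box, true_box):
    """True when the prediction would count as a hit at the 0.5 IoU threshold."""
    return iou(pred_box, true_box) >= 0.5


# Hypothetical boxes: a predicted ornament and its expert annotation, in pixels
predicted = (100, 100, 300, 200)
annotated = (120, 110, 310, 210)
print(round(iou(predicted, annotated), 3), is_match_at_50(predicted, annotated))
```

Average precision is then computed per class from the ranked list of hits and misses, and mAP is the mean over classes.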

🚀 How to Use

```python
from ultralytics import YOLO

# Load the model
model = YOLO("best.pt")

# Run inference on a document scan
results = model("your_document_scan.jpg", conf=0.35)

# Show results
results[0].show()

# Save annotated image
results[0].save("output.jpg")
```
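
To work with the detections as data rather than as an annotated image, each result can be flattened into plain records. This is a minimal sketch, assuming the standard Ultralytics `Results` interface (`boxes`, `names`). `to_record` and `detect_ornaments` are hypothetical helper names, and the class labels returned depend on how the checkpoint was trained.

```python
def to_record(class_name, confidence, xyxy):
    """Turn one detection into a plain dict (hypothetical helper)."""
    x1, y1, x2, y2 = (round(float(v), 1) for v in xyxy)
    return {
        "class": class_name,
        "confidence": round(float(confidence), 3),
        "box": [x1, y1, x2, y2],  # pixel coordinates: left, top, right, bottom
    }


def detect_ornaments(image_path, weights="best.pt", conf=0.35):
    """Run the detector on one scan and return a list of detection records."""
    # Imported here so the helper above can be used without ultralytics installed
    from ultralytics import YOLO

    model = YOLO(weights)
    result = model(image_path, conf=conf)[0]  # one Results object per input image
    return [
        to_record(result.names[int(box.cls[0])], box.conf[0], box.xyxy[0].tolist())
        for box in result.boxes
    ]
```

Calling `detect_ornaments("your_document_scan.jpg")` would then return one dict per detected element, which can be written to CSV or JSON for cataloguing.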

🗂️ Training Data

Source: Historical printed books from the TypoRef corpus (16th–18th century)
Annotations: Expert-annotated by digital humanities researchers at PolyTech Tours
Volume: 50+ annotated document pages
Augmentation: Standard YOLOv8 augmentation pipeline

⚠️ Limitations

Optimized for black-and-white or greyscale document scans
Performance may degrade on very low-resolution scans (< 150 DPI)
Trained on Western European printing conventions — may generalize poorly to other traditions
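
One way to screen scans against the resolution limit before inference is to estimate the effective DPI from the image's pixel width and the physical width of the page. This is a rough sketch, assuming the physical page width is known (for example from catalogue metadata). The function names and the example dimensions are illustrative and not part of the model's tooling.

```python
MM_PER_INCH = 25.4


def estimated_dpi(pixel_width, page_width_mm):
    """Approximate scan resolution from pixel width and physical page width."""
    if pixel_width <= 0 or page_width_mm <= 0:
        raise ValueError("pixel_width and page_width_mm must be positive")
    return pixel_width * MM_PER_INCH / page_width_mm


def is_resolution_adequate(pixel_width, page_width_mm, min_dpi=150):
    """True when the estimated DPI meets the threshold (150 DPI by default)."""
    return estimated_dpi(pixel_width, page_width_mm) >= min_dpi


# Hypothetical example: a page 170 mm wide scanned at two different pixel widths
for width_px in (1200, 800):
    dpi = estimated_dpi(width_px, 170)
    print(width_px, round(dpi), is_resolution_adequate(width_px, 170))
```

Scans that fall below the threshold may be worth rescanning or flagging for manual review rather than upscaling, since interpolation does not restore lost detail.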

🔗 Related Resources

🤗 Live Demo Space
💻 GitHub Repository

👤 Author

Martin Badrous — Computer Vision & Deep Learning Engineer