BMD Watermark Detector: bmd_watermark_n.pt

A lightweight YOLO11-nano model for detecting watermarks in images. It was trained on a custom dataset of real-world watermarked images and is designed to power the smart-crop watermark removal pipeline in DatasetStudio.

The n suffix denotes the nano variant, which is optimised for fast batch inference on large image datasets while keeping detection accuracy at a useful level.


Model Details

Property       Value
Architecture   YOLO11n (nano)
Task           Object Detection
Input          RGB images (any resolution; resized to 640×640 internally)
Output         Bounding boxes (xyxy) + confidence scores
Classes        0: watermark
License        AGPL-3.0

Intended Use

This model is intended to detect the location of watermarks in images so that a downstream cropping step can remove them cleanly. It is well-suited for:

  • Batch processing large image datasets to remove corner/edge watermarks
  • Automated dataset cleaning pipelines
  • Identifying watermark position (top-left, bottom-right corner, etc.)
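
To make the last point concrete, here is a minimal sketch of turning a detected box into a position label. The helper name `watermark_position` and the 3x3 grid rule are assumptions made for illustration; they are not part of the model or of DatasetStudio.

```python
def watermark_position(box_xyxy, img_w, img_h):
    """Name the image region holding the box centre, e.g. 'bottom-right'.

    Hypothetical helper: the image is split into a 3x3 grid and the box
    centre decides the row (top/middle/bottom) and column (left/center/right).
    """
    x1, y1, x2, y2 = box_xyxy
    cx = (x1 + x2) / 2 / img_w  # normalised centre, 0..1
    cy = (y1 + y2) / 2 / img_h

    col = "left" if cx < 1 / 3 else "right" if cx > 2 / 3 else "center"
    row = "top" if cy < 1 / 3 else "bottom" if cy > 2 / 3 else "middle"

    if row == "middle" and col == "center":
        return "center"
    return f"{row}-{col}"


# A small box near the lower-right corner of a 1920x1080 image
print(watermark_position([1500, 1000, 1900, 1060], img_w=1920, img_h=1080))
```

The box coordinates can be taken from `box.xyxy[0].tolist()` and the image size from `r.orig_shape`, as in the usage examples further down.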

This model is intended for legitimate dataset cleaning use cases (e.g. removing watermarks from your own content). Do not use it to strip copyright protections from images you do not have the rights to modify.


Usage

Requirements

pip install ultralytics pillow

Basic Inference

from ultralytics import YOLO

model = YOLO("bmd_watermark_n.pt")

results = model("your_image.jpg", conf=0.25)
for r in results:
    for box in r.boxes:
        print(f"Watermark detected at {box.xyxy[0].tolist()} (conf: {float(box.conf[0]):.2f})")

Batch Inference

from ultralytics import YOLO

model = YOLO("bmd_watermark_n.pt")

image_paths = ["img1.jpg", "img2.jpg", "img3.png"]
results = model(image_paths, conf=0.5, verbose=False)  # 0.5 is the batch threshold recommended under Training

for path, r in zip(image_paths, results):
    if len(r.boxes) > 0:
        print(f"{path}: watermark found")
    else:
        print(f"{path}: clean")
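
For very large datasets, it may help to feed the model a limited number of paths per call rather than one long list, so that only one chunk of results is held at a time. The sketch below is one way to do that; `detect_in_chunks` and its `chunk_size` parameter are hypothetical names, and the function only assumes that `model` can be called the same way as in the example above.

```python
def detect_in_chunks(model, image_paths, chunk_size=32, conf=0.5):
    """Yield (path, has_watermark) pairs, running the model on one chunk at a time.

    Hypothetical helper: `model` is any callable that accepts a list of paths
    plus `conf` and `verbose`, and returns one result per path with a `.boxes`
    attribute, as the YOLO object above does.
    """
    for start in range(0, len(image_paths), chunk_size):
        chunk = image_paths[start:start + chunk_size]
        results = model(chunk, conf=conf, verbose=False)
        for path, r in zip(chunk, results):
            yield path, len(r.boxes) > 0


# Usage (requires the model file):
# from ultralytics import YOLO
# model = YOLO("bmd_watermark_n.pt")
# for path, found in detect_in_chunks(model, image_paths):
#     print(f"{path}: {'watermark found' if found else 'clean'}")
```

The default `conf=0.5` follows the batch threshold recommended in the Training section; a suitable `chunk_size` depends on image size and available memory.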

Smart Crop (remove watermark by cropping)

from ultralytics import YOLO
from PIL import Image

def crop_out_watermark(img_path, model, conf=0.25, padding=0.1):
    results = model(img_path, conf=conf, verbose=False)
    r = results[0]
    img_w, img_h = r.orig_shape[1], r.orig_shape[0]

    if len(r.boxes) == 0:
        return Image.open(img_path)  # No watermark, return as-is

    # Pick the largest detection by area (any other detections are ignored)
    best_box = max(r.boxes, key=lambda b: float(b.xywh[0][2] * b.xywh[0][3]))
    x1, y1, x2, y2 = best_box.xyxy[0].tolist()

    # Add padding
    pw = (x2 - x1) * padding
    ph = (y2 - y1) * padding
    x1, y1, x2, y2 = max(0,x1-pw), max(0,y1-ph), min(img_w,x2+pw), min(img_h,y2+ph)

    # Crop to the largest region not containing the watermark
    candidates = [
        (0, 0, img_w, int(y1)),        # above
        (0, int(y2), img_w, img_h),    # below
        (0, 0, int(x1), img_h),        # left
        (int(x2), 0, img_w, img_h),    # right
    ]
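    best = max(candidates, key=lambda c: (c[2]-c[0]) * (c[3]-c[1]))

    img = Image.open(img_path)
    # If the padded box covers the whole image, every candidate is empty: return the image unchanged
    if best[2] <= best[0] or best[3] <= best[1]:
        return img
    return img.crop(best)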

model = YOLO("bmd_watermark_n.pt")
clean = crop_out_watermark("watermarked.jpg", model)
clean.save("clean.jpg")

Training

  • Base architecture: YOLO11n (Ultralytics)
  • Training data: Custom dataset of watermarked images with manual bounding box annotations
  • Annotation format: YOLO format (normalised class x_center y_center width height)
  • Hardware: GPU-accelerated training
  • Recommended confidence threshold: 0.25 for single-image preview, 0.5 for batch processing
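
As an illustration of the annotation format listed above, the following sketch converts a pixel-space xyxy box into a single YOLO label line. The function name `xyxy_to_yolo_line` and the six-decimal formatting are assumptions for this example, not a description of how the training labels were actually produced.

```python
def xyxy_to_yolo_line(class_id, x1, y1, x2, y2, img_w, img_h):
    """Format one box as 'class x_center y_center width height', normalised to 0..1."""
    x_center = (x1 + x2) / 2 / img_w
    y_center = (y1 + y2) / 2 / img_h
    width = (x2 - x1) / img_w
    height = (y2 - y1) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# Class 0 (watermark) in the bottom-right corner of a 1000x500 image
print(xyxy_to_yolo_line(0, 800, 400, 1000, 500, img_w=1000, img_h=500))
# -> 0 0.900000 0.900000 0.200000 0.200000
```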

Limitations

  • Optimised for corner and edge watermarks (bottom-right, bottom-left, top-right, top-left). Centred full-image watermarks (overlays) are out of scope.
  • Performance may degrade on very small watermarks (< ~3% of image area) or heavily blended semi-transparent watermarks.
  • The nano variant trades some accuracy for speed. For higher accuracy at the cost of inference time, consider training an s or m size variant.
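
One way to act on the small-watermark limitation is to flag detections whose box covers less than roughly 3% of the image, so they can be reviewed by hand. This is a sketch under stated assumptions: the helper names are hypothetical, and the `min_fraction=0.03` default is simply the approximate figure quoted above, not a tuned value.

```python
def box_area_fraction(box_xyxy, img_w, img_h):
    """Return the share of the image area covered by the box (0..1)."""
    x1, y1, x2, y2 = box_xyxy
    return max(0.0, x2 - x1) * max(0.0, y2 - y1) / (img_w * img_h)


def is_small_detection(box_xyxy, img_w, img_h, min_fraction=0.03):
    """True if the box is below the area share where accuracy may degrade."""
    return box_area_fraction(box_xyxy, img_w, img_h) < min_fraction


# A 400x60 px box in a 1920x1080 image covers about 1.2% of the area
print(is_small_detection([1500, 1000, 1900, 1060], img_w=1920, img_h=1080))
```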

License

This model is released under the AGPL-3.0 License, consistent with the Ultralytics YOLO11 framework used for training.

If you use this model in a commercial product or networked service, you must either comply with AGPL-3.0 (open-source your application) or obtain a separate commercial license from Ultralytics for the underlying framework.
