HugoHE/m-hood

+---
+license: mit
+library_name: ultralytics
+tags:
+- object-detection
+- computer-vision
+- yolov10
+- faster-rcnn
+- pytorch
+- bdd100k
+- pascal-voc
+- kitti
+- autonomous-driving
+- hallucination-mitigation
+- out-of-distribution
+- BDD 100K
+- Pascal-VOC
+pipeline_tag: object-detection
+datasets:
+- bdd100k
+- pascal-voc
+- kitti
+widget:
+- src: https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/bounding-boxes-sample.png
+  example_title: "Sample Image"
+model-index:
+- name: m-hood
+  results:
+  - task:
+      type: object-detection
+    dataset:
+      type: multi-dataset
+      name: BDD 100K, Pascal VOC, KITTI
+    metrics:
+    - type: mean_average_precision
+      name: mAP
+      value: "TBD"
+---
+# M-Hood: Multi-Dataset Object Detection Model Collection
+**M-Hood** is a collection of object detection models spanning two architectures, three datasets, and two training strategies. The repository contains **YOLOv10** and **Faster R-CNN** models, each trained on one of the **BDD 100K**, **Pascal VOC**, or **KITTI** datasets.
+The collection includes both **vanilla models** (trained from scratch) and **fine-tuned models** specifically designed to **mitigate hallucination on out-of-distribution data**.
+## 🎯 Key Features
+- **Dual Architecture Support**: Both YOLOv10 and Faster R-CNN models
+- **Multi-Dataset Training**: BDD 100K, Pascal VOC, and KITTI datasets
+- **Hallucination Mitigation**: Fine-tuned models for robust out-of-distribution performance
+- **Real-world Applications**: Autonomous driving and general object detection
+## 📊 Model Performance Overview
+### YOLOv10 Models
+| Model | Dataset | Training Type | Size | Description | Download |
+|-------|---------|---------------|------|-------------|----------|
+| **yolov10-bdd-vanilla.pt** | BDD 100K | Vanilla | 62MB | Real-time detection for autonomous driving | [Download](./yolov10-bdd-vanilla.pt) |
+| **yolov10-voc-vanilla.pt** | Pascal VOC | Vanilla | 63MB | General purpose object detection | [Download](./yolov10-voc-vanilla.pt) |
+| **yolov10-kitti-vanilla.pt** | KITTI | Vanilla | 16MB | Lightweight autonomous driving detection | [Download](./yolov10-kitti-vanilla.pt) |
+| **yolov10-bdd-finetune.pt** | BDD 100K | Fine-tuned | 62MB | OOD-robust autonomous driving detection | [Download](./yolov10-bdd-finetune.pt) |
+| **yolov10-voc-finetune.pt** | Pascal VOC | Fine-tuned | 94MB | OOD-robust general object detection | [Download](./yolov10-voc-finetune.pt) |
+| **yolov10-kitti-finetune.pt** | KITTI | Fine-tuned | 52MB | OOD-robust autonomous driving detection | [Download](./yolov10-kitti-finetune.pt) |
+### Faster R-CNN Models
+| Model | Dataset | Training Type | Size | Description | Download |
+|-------|---------|---------------|------|-------------|----------|
+| **faster-rcnn-bdd-vanilla.pth** | BDD 100K | Vanilla | 315MB | High-accuracy autonomous driving detection | [Download](./faster-rcnn-bdd-vanilla.pth) |
+| **faster-rcnn-voc-vanilla.pth** | Pascal VOC | Vanilla | 315MB | High-accuracy general object detection | [Download](./faster-rcnn-voc-vanilla.pth) |
+| **faster-rcnn-kitti-vanilla.pth** | KITTI | Vanilla | 315MB | High-accuracy autonomous driving detection | [Download](./faster-rcnn-kitti-vanilla.pth) |
+| **faster-rcnn-bdd-finetune.pth** | BDD 100K | Fine-tuned | 158MB | OOD-robust high-accuracy detection | [Download](./faster-rcnn-bdd-finetune.pth) |
+| **faster-rcnn-voc-finetune.pth** | Pascal VOC | Fine-tuned | 158MB | OOD-robust high-accuracy detection | [Download](./faster-rcnn-voc-finetune.pth) |
+| **faster-rcnn-kitti-finetune.pth** | KITTI | Fine-tuned | 158MB | OOD-robust high-accuracy detection | [Download](./faster-rcnn-kitti-finetune.pth) |
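The file names in the two tables follow an `<architecture>-<dataset>-<training type>` pattern, so a checkpoint can be addressed programmatically. The sketch below is a minimal, hypothetical helper built on that observation: `checkpoint_name` and `download_checkpoint` are illustrative names, and the repository id `HugoHE/m-hood` is assumed from the page header rather than stated in the README. The download step relies on `hf_hub_download` from the `huggingface_hub` package.

```python
# File extension used by each architecture in the tables above.
ARCH_EXT = {"yolov10": "pt", "faster-rcnn": "pth"}
DATASETS = ("bdd", "voc", "kitti")
TRAINING_TYPES = ("vanilla", "finetune")


def checkpoint_name(arch, dataset, training):
    """Build a checkpoint file name such as 'yolov10-bdd-vanilla.pt'."""
    if arch not in ARCH_EXT:
        raise ValueError(f"unknown architecture: {arch!r}")
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset!r}")
    if training not in TRAINING_TYPES:
        raise ValueError(f"unknown training type: {training!r}")
    return f"{arch}-{dataset}-{training}.{ARCH_EXT[arch]}"


def download_checkpoint(arch, dataset, training, repo_id="HugoHE/m-hood"):
    """Download one checkpoint from the Hub and return its local path.

    The default repo_id is an assumption taken from the page header.
    """
    # Imported lazily so checkpoint_name works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(
        repo_id=repo_id,
        filename=checkpoint_name(arch, dataset, training),
    )
```

For example, `download_checkpoint("yolov10", "kitti", "finetune")` would return a local path that can be passed to `YOLO(...)`, assuming the file is published under that name.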
+## 🚀 Quick Start
+### YOLOv10 Usage
+```python
+from ultralytics import YOLO
+# Load a vanilla YOLOv10 model
+model = YOLO('yolov10-bdd-vanilla.pt')
+# Run inference
+results = model('path/to/image.jpg')
+# Process results
+for result in results:
+    boxes = result.boxes.xyxy   # (N, 4) boxes as (x1, y1, x2, y2) in pixels
+    scores = result.boxes.conf  # (N,) confidence scores
+    classes = result.boxes.cls  # (N,) class indices; result.names maps them to labels
+```
+### Faster R-CNN Usage
+```python
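+import torch
+from torchvision.io import ImageReadMode, read_image
+from torchvision.transforms.functional import convert_image_dtype
+# Load a Faster R-CNN model. Assumes the file stores the whole pickled model,
+# so PyTorch >= 2.6 needs weights_only=False; only unpickle files you trust.
+model = torch.load('faster-rcnn-bdd-vanilla.pth', map_location='cpu', weights_only=False)
+model.eval()
+# Assumes a torchvision-style detector: a list of float CHW tensors in [0, 1]
+image = read_image('path/to/image.jpg', mode=ImageReadMode.RGB)
+image_tensor = convert_image_dtype(image, torch.float)
+# Run inference
+with torch.no_grad():
+    predictions = model([image_tensor])
+# Process results
+boxes = predictions[0]['boxes']    # (N, 4) boxes as (x1, y1, x2, y2)
+scores = predictions[0]['scores']  # (N,) confidence scores
+labels = predictions[0]['labels']  # (N,) class indices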
+```
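The predictions from either architecture may still include low-confidence boxes. A minimal, framework-agnostic sketch of score filtering follows; `filter_detections` and the default threshold of 0.5 are illustrative assumptions, not values from this collection. It works on plain Python sequences, so tensors would first be converted with `.tolist()`.

```python
def filter_detections(boxes, scores, labels, min_score=0.5):
    """Keep detections scoring at least min_score, highest score first.

    boxes, scores and labels are parallel sequences, one entry per detection.
    Returns a list of (box, score, label) tuples.
    """
    if not (len(boxes) == len(scores) == len(labels)):
        raise ValueError("boxes, scores and labels must have the same length")
    kept = [
        (box, score, label)
        for box, score, label in zip(boxes, scores, labels)
        if score >= min_score
    ]
    # Sort by confidence so the most reliable detections come first.
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept
```

With the Faster R-CNN snippet, this could be called as `filter_detections(boxes.tolist(), scores.tolist(), labels.tolist())`.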
+## 🎯 Fine-tuning Objective
+The **fine-tuned models** in this collection have been specifically trained to **mitigate hallucination on out-of-distribution (OOD) data**. This means:
+- **Improved Robustness**: Better performance on images that differ from the training distribution
+- **Reduced False Positives**: Lower tendency to detect objects that aren't actually present
+- **Enhanced Reliability**: More trustworthy predictions in real-world deployment scenarios
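One way to make "hallucination" measurable is to run a detector on OOD images that contain none of its training classes, so that every confident detection is a false positive. The sketch below assumes that setup; `hallucination_stats`, its output keys, and the 0.25 threshold are hypothetical choices for illustration and are not the evaluation protocol used for these models.

```python
def hallucination_stats(scores_per_image, min_score=0.25):
    """Summarise false detections on images with no valid objects.

    scores_per_image holds one list of confidence scores per OOD image.
    Every score >= min_score is counted as a hallucinated detection.
    """
    if not scores_per_image:
        raise ValueError("at least one image is required")
    counts = [
        sum(1 for score in scores if score >= min_score)
        for scores in scores_per_image
    ]
    images_with_detections = sum(1 for count in counts if count > 0)
    return {
        # Fraction of OOD images with at least one confident detection.
        "image_rate": images_with_detections / len(counts),
        # Average number of confident detections per OOD image.
        "mean_detections": sum(counts) / len(counts),
    }
```

Comparing these numbers for a vanilla checkpoint and its fine-tuned counterpart on the same OOD images gives a rough indication of how much the fine-tuning reduces false positives.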
+## 📁 Dataset Information
+### BDD 100K (Berkeley DeepDrive)
+- **100,000+** driving images with diverse weather and lighting conditions
+- **Object Classes**: car, truck, bus, motorcycle, bicycle, person, traffic light, traffic sign, train, rider
+- **Application**: Autonomous driving scenarios
+### Pascal VOC (Visual Object Classes)
+- Standard benchmark dataset for object detection
+- **20 Object Classes**: aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, diningtable, dog, horse, motorbike, person, pottedplant, sheep, sofa, train, tvmonitor
+- **Application**: General computer vision applications
+### KITTI Object Detection
+- Real-world autonomous driving dataset
+- **Object Classes**: car, pedestrian, cyclist
+- **Application**: Autonomous driving with focus on urban scenarios
+## 🏗️ Architecture Comparison
+### YOLOv10 (Real-time Detection)
+- **Type**: Single-stage detector
+- **Speed**: High (real-time inference)
+- **Accuracy**: Good
+- **Use Case**: Real-time applications, edge deployment
+### Faster R-CNN (High-accuracy Detection)
+- **Type**: Two-stage detector
+- **Speed**: Moderate
+- **Accuracy**: High
+- **Use Case**: High-accuracy requirements, research applications
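The speed ratings above are qualitative, and actual latency depends on hardware and input size. A minimal timing sketch, using only the standard library, is shown below; `mean_latency_ms` and its default warm-up and run counts are illustrative assumptions.

```python
import time


def mean_latency_ms(predict, warmup=3, runs=20):
    """Return the mean wall-clock time of predict() in milliseconds.

    predict is any zero-argument callable that performs one inference.
    Warm-up calls are executed first and excluded from the measurement.
    """
    if runs <= 0:
        raise ValueError("runs must be positive")
    for _ in range(warmup):
        predict()
    start = time.perf_counter()
    for _ in range(runs):
        predict()
    elapsed = time.perf_counter() - start
    return elapsed * 1000.0 / runs
```

A call such as `mean_latency_ms(lambda: model('path/to/image.jpg'))` would time the YOLOv10 snippet. On a GPU, execution is asynchronous, so the callable should synchronize (for example with `torch.cuda.synchronize()`) before returning.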
+## 📈 Model Selection Guide
+| Use Case | Recommended Model | Reason |
+|----------|-------------------|---------|
+| **Real-time autonomous driving** | `yolov10-bdd-finetune.pt` | Fast + OOD robust + driving-specific |
+| **High-accuracy autonomous driving** | `faster-rcnn-bdd-finetune.pth` | High accuracy + OOD robust + driving-specific |
+| **General object detection (fast)** | `yolov10-voc-finetune.pt` | Fast + OOD robust + general purpose |
+| **General object detection (accurate)** | `faster-rcnn-voc-finetune.pth` | High accuracy + OOD robust + general purpose |
+| **Research/Baseline** | Any vanilla model | Standard training baseline |
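The first four rows of the table can be encoded as a small lookup, which may be convenient in scripts that pick a checkpoint from configuration. This is only a sketch: `recommend_model` and the `domain`/`priority` keys are hypothetical names, while the file names are taken from the table.

```python
# (domain, priority) -> recommended checkpoint, mirroring the table above.
RECOMMENDED = {
    ("driving", "speed"): "yolov10-bdd-finetune.pt",
    ("driving", "accuracy"): "faster-rcnn-bdd-finetune.pth",
    ("general", "speed"): "yolov10-voc-finetune.pt",
    ("general", "accuracy"): "faster-rcnn-voc-finetune.pth",
}


def recommend_model(domain, priority):
    """Return the recommended checkpoint file name for a use case."""
    try:
        return RECOMMENDED[(domain, priority)]
    except KeyError:
        valid = ", ".join(f"{d}/{p}" for d, p in sorted(RECOMMENDED))
        raise ValueError(
            f"no recommendation for {domain}/{priority}; expected one of: {valid}"
        ) from None
```

For baseline experiments, the vanilla counterpart of the returned file differs only in the `finetune` part of its name.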
+## 🔬 Research Applications
+This model collection is particularly useful for research in:
+- **Out-of-distribution detection**
+- **Domain adaptation**
+- **Robust object detection**
+- **Autonomous driving perception**
+- **Multi-dataset learning**
+## 📄 Citations
+If you use these models in your research, please cite:
+```bibtex
+@article{yolov10,
+  title={YOLOv10: Real-Time End-to-End Object Detection},
+  author={Wang, Ao and Chen, Hui and Liu, Lihao and Chen, Kai and Lin, Zijia and Han, Jungong and Ding, Guiguang},
+  journal={arXiv preprint arXiv:2405.14458},
+  year={2024}
+}
+@article{ren2015faster,
+  title={Faster {R-CNN}: Towards Real-Time Object Detection with Region Proposal Networks},
+  author={Ren, Shaoqing and He, Kaiming and Girshick, Ross and Sun, Jian},
+  journal={Advances in Neural Information Processing Systems},
+  volume={28},
+  year={2015}
+}
+```
+## 📜 License
+This model collection is released under the MIT License.
+## 🏷️ Keywords
+Object Detection, Computer Vision, YOLOv10, Faster R-CNN, BDD 100K, Pascal-VOC, KITTI, Autonomous Driving, Hallucination Mitigation, Out-of-Distribution, Deep Learning, PyTorch