Cervical Cancer Multimodal Classifier

Model Description

This multimodal model classifies cervical samples into seven classes by combining:

  • Visual features from histopathological images (Vision Transformer)
  • Morphological features from tabular data (20 hand-crafted features)

Model Architecture

┌─────────────────┐         ┌──────────────────┐
│  Histopath.     │         │ Tabular Features │
│   Image (BMP)   │         │   (20 features)  │
└────────┬────────┘         └────────┬─────────┘
         │                           │
         │                           │
         ▼                           ▼
   ┌──────────────┐           ┌────────────┐
   │  ViT-base    │           │  MLP       │
   │  (768 dims)  │           │  (64 dims) │
   └──────┬───────┘           └────┬───────┘
          │                        │
          └──────────┬─────────────┘
                     │
                     ▼
            ┌─────────────────┐
            │  Fusion Layer   │
            │  (512 -> 256)   │
            └────────┬────────┘
                     │
                     ▼
            ┌─────────────────┐
            │  Output (7)     │
            │  Classes        │
            └─────────────────┘

Supported Classes

  1. carcinoma_in_situ - Carcinoma in situ
  2. light_dysplastic - Light dysplastic
  3. moderate_dysplastic - Moderate dysplastic
  4. normal_columnar - Normal columnar
  5. normal_intermediate - Normal intermediate
  6. normal_superficiel - Normal superficial
  7. severe_dysplastic - Severe dysplastic
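
As a minimal sketch of how a 7-way output could be turned into one of the labels above, the snippet below applies a softmax to raw scores and picks the most likely class. It assumes the model's output indices follow the order of the list above, which this card does not confirm; `decode_prediction` is a hypothetical helper and the logits are made up.

```python
import math

# Class identifiers in the order listed above. That the model's output
# indices follow this order is an assumption, not stated in the card.
CLASS_NAMES = [
    "carcinoma_in_situ",
    "light_dysplastic",
    "moderate_dysplastic",
    "normal_columnar",
    "normal_intermediate",
    "normal_superficiel",
    "severe_dysplastic",
]

def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode_prediction(logits):
    """Return (class_name, probability) for the highest-scoring class."""
    if len(logits) != len(CLASS_NAMES):
        raise ValueError(f"expected {len(CLASS_NAMES)} logits, got {len(logits)}")
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASS_NAMES[best], probs[best]

# Made-up logits, for illustration only.
label, prob = decode_prediction([0.1, 2.3, 0.4, -1.0, 0.0, 0.2, 1.1])
print(label, round(prob, 3))
```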

Performance

Metric               Value
Test Accuracy        0.6594
Test F1-Score        0.6571
Weighted Precision   0.6558
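
For readers unfamiliar with support-weighted metrics, here is a small pure-Python sketch of how accuracy, weighted precision, and weighted F1 can be computed. The card does not say how the F1 score was averaged, so weighted averaging is an assumption here; the labels are toy values, not model outputs.

```python
from collections import Counter

def weighted_scores(y_true, y_pred):
    """Return (accuracy, weighted precision, weighted F1).

    Per-class scores are averaged with weights proportional to the
    number of true samples in each class (its support).
    """
    n = len(y_true)
    support = Counter(y_true)
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / n

    w_precision = 0.0
    w_f1 = 0.0
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        predicted = sum(p == label for p in y_pred)
        actual = support[label]
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        w_precision += (actual / n) * precision
        w_f1 += (actual / n) * f1
    return accuracy, w_precision, w_f1

# Toy labels for illustration only.
y_true = ["normal_columnar", "normal_columnar", "severe_dysplastic", "severe_dysplastic"]
y_pred = ["normal_columnar", "severe_dysplastic", "severe_dysplastic", "severe_dysplastic"]
acc, w_prec, w_f1 = weighted_scores(y_true, y_pred)
print(f"accuracy={acc:.4f} precision={w_prec:.4f} f1={w_f1:.4f}")
```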

Training Details

  • Dataset: Smear2005 (Herlev Colposcopy) [https://mde-lab.aegean.gr/index.php/downloads/]
  • Vision Backbone: google/vit-base-patch16-224
  • Training Epochs: up to 50, with early stopping (patience of 10 epochs)
  • Batch Size: 16
  • Learning Rate: 2e-5 (AdamW)
  • Scheduler: CosineAnnealingLR
  • Hardware: NVIDIA T4 GPU on Google Colab
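
To illustrate the learning-rate schedule named above, the sketch below evaluates the closed-form cosine annealing curve in plain Python. The starting rate of 2e-5 comes from the list; the period `t_max=50` and the floor `eta_min=0.0` are assumptions, since the card does not give the scheduler's arguments.

```python
import math

def cosine_annealing_lr(epoch, base_lr=2e-5, t_max=50, eta_min=0.0):
    """Closed-form cosine annealing: decays base_lr to eta_min over t_max epochs.

    t_max and eta_min are assumed values; the card only states the
    scheduler type and the initial learning rate.
    """
    cosine = math.cos(math.pi * epoch / t_max)
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + cosine)

for epoch in (0, 10, 25, 40, 50):
    print(epoch, f"{cosine_annealing_lr(epoch):.3e}")
```

Under these assumptions the rate starts at 2e-5, reaches half that value at epoch 25, and falls to zero at epoch 50. If training stops early, only the first part of the curve is used.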

num_epochs = 50
best_val_accuracy = 0.6376811594202898
patience = 10
patience_counter = 10
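
The logged values above, with `patience_counter` equal to `patience`, are consistent with a patience-based early-stopping rule. The sketch below shows one common form of that rule; it is not the author's training loop, and the validation accuracies are invented for illustration.

```python
def run_early_stopping(val_accuracies, patience=10):
    """Simulate patience-based early stopping on validation accuracy.

    Returns (best_val_accuracy, patience_counter, epochs_run).
    """
    best = float("-inf")
    counter = 0
    epochs_run = 0
    for acc in val_accuracies:
        epochs_run += 1
        if acc > best:
            best = acc
            counter = 0  # improvement: reset (a checkpoint would be saved here)
        else:
            counter += 1  # no improvement this epoch
            if counter >= patience:
                break  # stop once `patience` epochs pass without improvement
    return best, counter, epochs_run

# Invented history: improves for 3 epochs, then plateaus.
history = [0.50, 0.60, 0.64] + [0.63] * 12
best, counter, epochs_run = run_early_stopping(history, patience=10)
print(best, counter, epochs_run)
```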

Tabular Features

The model uses 20 morphological features measured on the nucleus and cytoplasm of each cell. The list below names 16 of the 20:

  • Nucleus Area: Kerne_A
  • Cytoplasm Area: Cyto_A
  • Nucleus-Cytoplasm Ratio: K/C
  • Y-coordinates: Kerne_Ycol, Cyto_Ycol
  • Morphological indices: KerneShort, KerneLong, KerneElong, KerneRund
  • Perimeter: KernePeri, CytoPeri
  • Size ratios: KerneMax, KerneMin, CytoMax, CytoMin
  • Position: KernePos

Features are standardized with scikit-learn's StandardScaler, fitted on training-set statistics only.
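
The normalization step can be sketched without scikit-learn as follows: per-column mean and standard deviation are computed on the training rows only, then reused unchanged for any new sample. `fit_scaler` and `transform` are hypothetical helper names, and the two-column rows are made-up values rather than real dataset statistics.

```python
import statistics

def fit_scaler(train_rows):
    """Compute per-column mean and population std from training rows only."""
    columns = list(zip(*train_rows))
    means = [statistics.fmean(col) for col in columns]
    # Like StandardScaler, use the population standard deviation (ddof=0)
    # and fall back to a scale of 1 for zero-variance columns.
    stds = [statistics.pstdev(col) or 1.0 for col in columns]
    return means, stds

def transform(row, means, stds):
    """Standardize one sample with statistics learned from the training set."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]

# Made-up two-column training data (e.g. an area and a ratio).
train_rows = [[800.0, 0.02], [1000.0, 0.04], [1200.0, 0.06]]
means, stds = fit_scaler(train_rows)

# A new sample is scaled with the training statistics, never its own.
z = transform([1200.0, 0.04], means, stds)
print([round(v, 4) for v in z])
```

At inference time the same fitted statistics have to be available, so they would normally be saved alongside the model weights.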

Usage

Installation

pip install torch transformers pillow scikit-learn numpy

Quick Start

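import torch
from PIL import Image
import numpy as np
from sklearn.preprocessing import StandardScaler

# Load model. This assumes the file stores the full pickled model object
# (not just a state_dict), so the model class must be importable.
# weights_only=False is required for that on PyTorch >= 2.6; only use it
# with files you trust.
model = torch.load('multimodal_cervical_model.pt', map_location='cpu', weights_only=False)
model.eval()  # disable dropout and other training-only behavior

# Your image and tabular data
image = Image.open('sample.BMP').convert('RGB')  # ViT expects 3-channel input
tabular_features = {
    'Kerne_A': 803.5,
    'Cyto_A': 27804.125,
    # ... 18 more features
}

# Predict. predict_multimodal is not defined in this snippet, and its
# remaining arguments are elided here.
predictions = predict_multimodal(image, tabular_features, ...)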
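
Because the tabular branch takes a fixed-length vector, a feature dictionary like the one above has to be converted into a list in the same column order used during training. This card does not state that order, so the helper below takes it as an argument; `to_feature_vector` is a hypothetical name, and the two-column order is for illustration only.

```python
def to_feature_vector(features, column_order):
    """Turn a {name: value} dict into a list ordered like the training columns.

    Raises ValueError if a feature is missing or unexpected, so a silent
    misalignment between names and positions cannot happen.
    """
    missing = [name for name in column_order if name not in features]
    extra = [name for name in features if name not in column_order]
    if missing or extra:
        raise ValueError(f"missing={missing}, unexpected={extra}")
    return [float(features[name]) for name in column_order]

# Illustrative two-column order; the real model expects all 20 features.
column_order = ["Kerne_A", "Cyto_A"]
vector = to_feature_vector({"Cyto_A": 27804.125, "Kerne_A": 803.5}, column_order)
print(vector)
```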

Advantages

✅ Multimodal Fusion: Combines spatial-visual features with quantitative morphological data
✅ Robustness: Combining modalities is intended to reduce overfitting relative to single-modality models (no single-modality baseline is reported here)
✅ Interpretability: The tabular features are human-interpretable (sizes, ratios, etc.)
✅ Scalability: More modalities can be added (ultrasound, genetic data, etc.)

Limitations

⚠️ Limited to 7 classes (specific dataset)
⚠️ Requires both image and tabular data for inference
⚠️ Image input must be histopathological cervical samples

Citation

If you use this model, please cite:

@misc{cervical_multimodal_2025,
  title = {Cervical Cancer Multimodal Classifier},
  author = {Sastelvio MANUEL},
  year = 2025,
  howpublished = {\url{https://huggingface.co/sastelvio/cervical-cancer-multimodal-vit}}
}

Disclaimer

⚠️ Medical Use Only Under Professional Supervision

This model is for research and educational purposes. It should NOT be used for clinical diagnosis without:

  • Validation by medical professionals
  • Proper regulatory approval
  • Thorough clinical testing
  • Integration with clinical workflows

Author

Sastelvio MANUEL
Portfolio: https://github.com/sastelvio

License

MIT License - See LICENSE file for details


Last updated: 19 December 2025
