MiniLM Hate Speech Detector
Fine-tuned MiniLM-L12 model for binary hate speech detection, optimized for browser deployment.
Model Description
This model classifies text as either hate speech or not hate speech (binary classification). It was fine-tuned on the HateXplain dataset and exported to ONNX format with INT8 quantization for efficient on-device inference in browser extensions.
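The card does not say which tool or quantization scheme produced the INT8 export, so the following is only a minimal sketch of the arithmetic behind INT8 weight quantization, assuming a symmetric per-tensor scheme. The function names `quantize_int8` and `dequantize_int8` are hypothetical and are not part of any export pipeline used for this model.

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization of floats to integer codes in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    # Guard against an all-zero tensor, where any scale would work.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(w / scale))) for w in weights]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate float values from the integer codes."""
    return [c * scale for c in codes]


# Illustrative weights only; these are not taken from the model.
weights = [0.5, -1.27, 0.0, 0.01]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
print(codes)
print(restored)
```

Each code fits in one byte instead of the four bytes of a float32 weight, which is where the size reduction comes from; the price is a rounding error of at most half the scale per weight.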
Training Data
- Dataset: HateXplain (~20K samples)
- Sources: Twitter, Gab
- Label Mapping:
  - hate (1): Hate speech
  - not_hate (0): Offensive + Normal content merged
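The label mapping above can be sketched as a small preprocessing step. This is an illustration under assumptions, not the training code: the class names `hatespeech`, `offensive`, and `normal`, the majority vote over annotators, and the helper names `majority_label` and `binary_target` are all assumed here, since the card does not describe how annotator labels were resolved.

```python
from collections import Counter

# Assumed string names for the three original classes, mapped to the binary target.
BINARY_LABEL = {"hatespeech": 1, "offensive": 0, "normal": 0}


def majority_label(annotator_labels):
    """Return the most common label, or None when there is no clear winner."""
    counts = Counter(annotator_labels).most_common()
    if not counts:
        return None
    # A tie for first place means the annotators did not agree on a label.
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def binary_target(annotator_labels):
    """Map one post's annotator labels to hate (1) / not_hate (0), or None if undecided."""
    label = majority_label(annotator_labels)
    return None if label is None else BINARY_LABEL[label]


print(binary_target(["hatespeech", "hatespeech", "offensive"]))
print(binary_target(["offensive", "normal", "offensive"]))
print(binary_target(["hatespeech", "offensive", "normal"]))
```

Posts for which `binary_target` returns `None` would have to be dropped or handled separately; which of those applies to this model is not stated in the card.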
Evaluation Results
| Metric | Score |
|---|---|
| F1 (Macro) | 0.8302 |
| Accuracy | 0.8567 |
| Precision | 0.8342 |
| Recall | 0.8266 |
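For readers who want to reproduce metrics of this kind, here is a minimal sketch of how macro-averaged scores are computed for a binary task. The card marks only F1 as macro-averaged; treating precision and recall the same way is an assumption of this sketch. The function name `binary_report` and the toy labels are hypothetical and unrelated to the actual test set.

```python
def binary_report(y_true, y_pred):
    """Compute accuracy plus macro-averaged precision, recall, and F1 over classes 0 and 1."""
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for cls in (0, 1):
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    return {
        "accuracy": sum(1 for t, p in pairs if t == p) / len(pairs),
        # Macro averaging weights both classes equally, regardless of class frequency.
        "precision_macro": sum(precisions) / 2,
        "recall_macro": sum(recalls) / 2,
        "f1_macro": sum(f1s) / 2,
    }


# Toy labels for illustration only (1 = hate, 0 = not_hate).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
report = binary_report(y_true, y_pred)
print(report)
```

Because the not_hate class merges two of the original categories, it is likely the larger class, and macro averaging keeps it from dominating the score the way plain accuracy would.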
Model Comparison (RQ3)
| Model | F1 | Size (Quantized) |
|---|---|---|
| MiniLM (this) | 0.830 | 32.6 MB |
| DistilBERT | 0.825 | 64 MB |
| TinyBERT | 0.816 | 15 MB |
Intended Use
- Primary: Browser extension for real-time hate speech detection
- Framework: Transformers.js for on-device inference
- Privacy: Inference runs entirely client-side; no text is sent to a server
Usage
With Transformers.js (Browser)
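import { pipeline } from '@huggingface/transformers';

// Load the ONNX model from the Hugging Face Hub and build the classifier
const classifier = await pipeline(
  'text-classification',
  'TaiwoOgun/minilm-hate-speech-onnx'
);

// Classify a single string; the result is an array of { label, score } objects
const result = await classifier('Some text to classify');
console.log(result);
// Example output: [{ label: 'hate', score: 0.85 }]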
With Python
from transformers import pipeline

# Load the model from the Hugging Face Hub and build the classifier
classifier = pipeline('text-classification', model='TaiwoOgun/minilm-hate-speech')
result = classifier('Some text to classify')
print(result)
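In practice the classifier output usually needs to be turned into a yes/no decision. The following is a minimal sketch of that step, assuming the pipeline returns a list of `{'label', 'score'}` dicts for a single input and that the positive label is named `hate`, as in the example output above. The helper `is_flagged` and the threshold values are hypothetical and not part of the model or of any extension code.

```python
def is_flagged(predictions, threshold=0.5, hate_label="hate"):
    """Decide whether a text should be flagged, given text-classification output.

    predictions: list of {'label': str, 'score': float} dicts for one input.
    threshold: minimum score for the hate label (illustrative default, not tuned).
    """
    if not predictions:
        return False
    top = predictions[0]
    return top["label"] == hate_label and top["score"] >= threshold


# Stand-in for a real pipeline result, so the sketch runs without downloading the model.
example = [{"label": "hate", "score": 0.85}]
print(is_flagged(example, threshold=0.8))
print(is_flagged(example, threshold=0.9))
```

Raising the threshold trades recall for precision, which may suit a browser extension where false positives are visible to the user. Any specific value should be chosen on held-out data.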
Limitations
- Language: English only
- Binary: Does not distinguish between types of hate speech
- Context: May miss context-dependent hate speech or sarcasm
- Dataset Bias: Trained on Twitter and Gab data; may not generalize to other platforms
Citation
If you use this model, please cite:
@misc{ogunbanwo2025hatespeech,
  author = {Ogunbanwo, Taiwo},
  title = {Real-Time Hate Speech Detection Browser Extension},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/TaiwoOgun/minilm-hate-speech-onnx}
}
License
MIT License