Llama-DrugDetector-8B-MFI
Model Description
Llama-DrugDetector-8B-MFI is a fine-tuned medical NLP model for detecting illicit drug use in clinical notes. It specializes in identifying:
- Methamphetamine (illicit use)
- Fentanyl (illicit use)
- Injection Drug Use (IDU/IVDU)
The model is trained with a multi-task objective and predicts two labels for each category:
- Illicit use detection (True/False/Unknown)
- Temporal classification (Current/Historical/Unknown/N/A)
Model Architecture
- Base Model: Meta Llama 3.1 8B
- Fine-tuning: LoRA (Low-Rank Adaptation)
- Rank: 16
- Alpha: 32
- Target modules: q_proj, v_proj, k_proj, o_proj
- Training Method: Multi-task instruction following
- Context Length: 2048 tokens
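As a rough sanity check on the adapter size, the LoRA settings above can be turned into a trainable-parameter estimate. This is only a sketch: the layer shapes are an assumption taken from the public Llama 3.1 8B configuration (hidden size 4096, 32 decoder layers, 8 key/value heads of dimension 128) and are not stated in this card. The scaling factor assumes the standard LoRA convention of alpha divided by rank.

```python
# Rough count of trainable LoRA parameters for the settings listed above.
# ASSUMPTION: layer shapes follow the public Llama 3.1 8B config
# (hidden size 4096, 32 decoder layers, 8 KV heads of dim 128).
RANK = 16
ALPHA = 32
HIDDEN = 4096
KV_DIM = 8 * 128
NUM_LAYERS = 32

# (in_features, out_features) of each targeted projection
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    # LoRA adds A (rank x in) and B (out x rank) next to each frozen weight.
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in TARGET_SHAPES.values())
total_trainable = per_layer * NUM_LAYERS
scaling = ALPHA / RANK  # standard LoRA scaling convention (assumed)

print(f"LoRA scaling (alpha / rank): {scaling}")
print(f"Trainable adapter parameters: {total_trainable:,}")
```

Under these assumptions the adapter holds about 13.6 million trainable parameters, a small fraction of the 8B frozen weights.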
Performance
Evaluated on 159 clinical note sentences:
| Drug | F1 Score | Precision | Recall |
|---|---|---|---|
| Methamphetamine | 0.931 | 0.915 | 0.947 |
| Fentanyl | 0.950 | 0.927 | 0.974 |
| Injection Drug Use | 0.923 | 0.913 | 0.933 |
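The F1 values in the table are consistent with the usual harmonic mean of precision and recall. The short check below recomputes them from the reported precision and recall; it uses only the numbers in the table and makes no claim about how the original evaluation script was written.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above
reported = {
    "Methamphetamine": (0.915, 0.947, 0.931),
    "Fentanyl": (0.927, 0.974, 0.950),
    "Injection Drug Use": (0.913, 0.933, 0.923),
}

for drug, (p, r, f1) in reported.items():
    print(f"{drug}: recomputed F1 = {f1_score(p, r):.3f} (reported {f1:.3f})")
```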
Comparison with Base Model
| Drug | Base F1 | Fine-tuned F1 | Improvement |
|---|---|---|---|
| Methamphetamine | 0.787 | 0.931 | +14.4pp |
| Fentanyl | 0.753 | 0.950 | +19.7pp |
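The "Improvement" column is an absolute difference in F1 expressed in percentage points (pp), not a relative change. A minimal sketch of that arithmetic, using only the values from the table:

```python
# F1 scores copied from the comparison table above
base_f1 = {"Methamphetamine": 0.787, "Fentanyl": 0.753}
tuned_f1 = {"Methamphetamine": 0.931, "Fentanyl": 0.950}


def gain_pp(base, tuned):
    """Absolute F1 gain in percentage points, rounded to one decimal."""
    return round((tuned - base) * 100, 1)


for drug in base_f1:
    print(f"{drug}: +{gain_pp(base_f1[drug], tuned_f1[drug])}pp")
```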
Usage
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch
# Load base model
base_model = "fabriceyhc/Llama-DrugDetector-8B"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)
# Load LoRA adapter
model = PeftModel.from_pretrained(model, "fabriceyhc/Llama-DrugDetector-8B-MFI")
# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("fabriceyhc/Llama-DrugDetector-8B-MFI")
# Create prompt
note_text = "Patient reports using meth daily for the past 2 weeks."
prompt = f"""### Task Description:
Please carefully review the following medical note and identify illicit drug use.
**CRITICAL RULES:**
1. **Positive drug test → ALWAYS ILLICIT** (unless in medication list)
2. **PMH/History of use → ILLICIT** (even if historical)
3. **Substance use disorder → ILLICIT**
4. **Patient self-reports (endorses, reports, admits) → ILLICIT**
5. **Prescribed/medical use → NOT ILLICIT**:
- Medication lists with dosages
- "Given", "administered" in medical context
- Pain control, procedural use (fentanyl)
- Prescription quantities
**Drugs to identify:**
- **Methamphetamine**: Illicit amphetamine use (not prescribed Adderall for ADHD)
- **Fentanyl**: Illicit fentanyl use (not prescribed patches/procedural use)
- **Injection Drug Use**: IV drug use (IVDU, IVDA), including IV heroin, cocaine, meth
**Temporal Classification** (for illicit cases only):
- **Current**: Present tense, recent use, positive test results, "POA" (present on arrival)
- **Historical**: Past tense, "history of", "former user", "in remission"
- **Unknown**: Timeframe unclear or ambiguous
### Desired Format:
Methamphetamine Illicit Use: <True/False/Unknown>
Fentanyl Illicit Use: <True/False/Unknown>
Injection Drug Use: <True/False/Unknown>
Methamphetamine Temporal Status: <Current/Historical/Unknown/N/A>
Fentanyl Temporal Status: <Current/Historical/Unknown/N/A>
Injection Drug Use Temporal Status: <Current/Historical/Unknown/N/A>
### The medical note to evaluate:
{note_text}
### Answer:
"""
# Generate
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id
)
# Decode
result = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(result)
Expected Output
Methamphetamine Illicit Use: True
Fentanyl Illicit Use: False
Injection Drug Use: False
Methamphetamine Temporal Status: Current
Fentanyl Temporal Status: N/A
Injection Drug Use Temporal Status: N/A
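For downstream use, the six-line answer usually needs to be turned into structured fields. The parser below is a minimal sketch, not part of the released model: the names `parse_prediction` and `needs_review` are hypothetical, and the consistency rule (a temporal status other than N/A only when illicit use is True) is an assumption inferred from the prompt's "for illicit cases only" wording.

```python
ILLICIT_VALUES = {"True", "False", "Unknown"}
TEMPORAL_VALUES = {"Current", "Historical", "Unknown", "N/A"}

# (illicit-use field, temporal field) pairs from the "Desired Format" section
FIELD_PAIRS = [
    ("Methamphetamine Illicit Use", "Methamphetamine Temporal Status"),
    ("Fentanyl Illicit Use", "Fentanyl Temporal Status"),
    ("Injection Drug Use", "Injection Drug Use Temporal Status"),
]

ALLOWED = {}
for illicit_field, temporal_field in FIELD_PAIRS:
    ALLOWED[illicit_field] = ILLICIT_VALUES
    ALLOWED[temporal_field] = TEMPORAL_VALUES


def parse_prediction(text):
    """Parse 'Field: value' lines; missing or invalid fields stay None."""
    parsed = {name: None for name in ALLOWED}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        key, value = key.strip(), value.strip()
        # Keep only the first valid occurrence of each known field.
        if sep and key in ALLOWED and value in ALLOWED[key] and parsed[key] is None:
            parsed[key] = value
    return parsed


def needs_review(parsed):
    """Flag outputs that are incomplete, uncertain, or inconsistent (assumed rule)."""
    for illicit_field, temporal_field in FIELD_PAIRS:
        illicit, temporal = parsed[illicit_field], parsed[temporal_field]
        if illicit is None or temporal is None or illicit == "Unknown":
            return True
        # Temporal status should be N/A exactly when illicit use is not True.
        if (illicit == "True") == (temporal == "N/A"):
            return True
    return False


example = """Methamphetamine Illicit Use: True
Fentanyl Illicit Use: False
Injection Drug Use: False
Methamphetamine Temporal Status: Current
Fentanyl Temporal Status: N/A
Injection Drug Use Temporal Status: N/A"""

fields = parse_prediction(example)
print(fields)
print("Needs human review:", needs_review(fields))
```

Routing flagged outputs to a human reviewer fits the intended use described later in this card, where predictions support rather than replace clinical judgment.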
Training Data
- Training Set: 113 annotated clinical note examples
- Validation Set: 23 examples
- Test Set: 23 examples (separate from evaluation above)
Data includes:
- Discharge summaries
- H&P (History & Physical) notes
- Progress notes
- Emergency department notes
Annotations cover diverse clinical scenarios:
- Positive drug tests (urine, blood)
- Patient self-reports
- Substance use disorder diagnoses
- Prescribed medications (negative examples)
- Medical procedures using fentanyl (negative examples)
- Historical vs current use
Training Details
- Epochs: 5
- Best Checkpoint: Epoch 3.98
- Best Validation Loss: 0.740
- Batch Size: 2 per device × 2 GPUs
- Gradient Accumulation: 2 steps
- Learning Rate: 2e-4
- Optimizer: AdamW, with learning-rate warmup
- Gradient Checkpointing: Enabled
- Mixed Precision: bfloat16
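The batch settings above imply an effective batch size of 8 examples per optimizer update. The sketch below works through that arithmetic; the update count per epoch is only an estimate, since it assumes the final partial batch is kept, and the exact figure depends on the training framework used.

```python
import math

# Values copied from the training details and data split above
PER_DEVICE_BATCH = 2
NUM_GPUS = 2
GRAD_ACCUM_STEPS = 2
TRAIN_EXAMPLES = 113
EPOCHS = 5

# Examples consumed per optimizer update
effective_batch = PER_DEVICE_BATCH * NUM_GPUS * GRAD_ACCUM_STEPS

# ASSUMPTION: the last, partially filled batch still triggers an update.
updates_per_epoch = math.ceil(TRAIN_EXAMPLES / effective_batch)
total_updates = updates_per_epoch * EPOCHS

print(f"Effective batch size: {effective_batch}")
print(f"Approx. optimizer updates per epoch: {updates_per_epoch}")
print(f"Approx. optimizer updates in total: {total_updates}")
```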
Limitations
- Recall Trade-off: Recall is 94.7% (methamphetamine) and 97.4% (fentanyl), so the fine-tuned model misses some illicit cases, whereas the base model reached perfect recall (at a lower F1, as shown in the comparison above)
- Domain-specific: Trained on specific clinical note formats and may not generalize to all medical documentation styles
- Drug Coverage: Limited to methamphetamine, fentanyl, and injection drug use. Does not separately detect other substances (cocaine, other opioids, cannabis, etc.)
- Context Window: 2048 tokens may truncate very long notes
- Unknown Predictions: May output "Unknown" for ambiguous cases (e.g., drug test ordered but results not available)
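Because the instruction template itself consumes part of the 2048-token window, the space left for the note is smaller than the context length. The helper below sketches a simple budget check. The function names are hypothetical, the example template length of 600 tokens is a made-up placeholder (measure it with the tokenizer in practice), and counting the 200 generated tokens against the window is an assumption.

```python
CONTEXT_LENGTH = 2048   # from the model architecture section
MAX_NEW_TOKENS = 200    # from the usage example


def note_token_budget(template_tokens,
                      context_length=CONTEXT_LENGTH,
                      max_new_tokens=MAX_NEW_TOKENS):
    """Tokens left for the note after the template and the generated answer."""
    return max(0, context_length - max_new_tokens - template_tokens)


def fits_context(note_tokens, template_tokens):
    """True if a note of `note_tokens` tokens fits without truncation."""
    return note_tokens <= note_token_budget(template_tokens)


# Hypothetical template length; replace with a measured token count.
TEMPLATE_TOKENS = 600

print("Note budget:", note_token_budget(TEMPLATE_TOKENS))
print("1000-token note fits:", fits_context(1000, TEMPLATE_TOKENS))
print("1500-token note fits:", fits_context(1500, TEMPLATE_TOKENS))
```

Notes that exceed the budget could be split into sections and scored separately, although how the model behaves on partial notes is not covered by the evaluation above.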
Intended Use
Primary Use Cases:
- Retrospective chart review for substance use research
- Clinical decision support (with human review)
- Population health screening
- Cohort identification for substance use studies
NOT Recommended For:
- Fully automated clinical decision-making
- Forensic or legal determination of substance use
- Real-time clinical alerts without human oversight
Ethical Considerations
- Bias: Model trained on data from specific healthcare systems; performance may vary across populations
- Privacy: Ensure proper de-identification before processing clinical notes
- False Negatives: On the evaluation set above, the model misses roughly 2.6% (fentanyl) to 6.7% (injection drug use) of illicit cases, which could impact clinical outcomes
- Stigma: Use predictions responsibly; substance use disorder is a medical condition requiring compassionate care
Citation
@misc{llama-drugdetector-mfi-2025,
author = {Harel-Canada, Fabrice},
title = {Llama-DrugDetector-8B-MFI: Multi-task Clinical Drug Detection},
year = {2025},
publisher = {HuggingFace},
url = {https://huggingface.co/fabriceyhc/Llama-DrugDetector-8B-MFI}
}
License
Llama 3.1 Community License Agreement
Contact
For questions or issues, please open an issue on the model repository.