Llama-DrugDetector-8B-MFI

Model Description

Llama-DrugDetector-8B-MFI is a fine-tuned medical NLP model for detecting illicit drug use in clinical notes. It specializes in identifying:

Methamphetamine (illicit use)
Fentanyl (illicit use)
Injection Drug Use (IDU/IVDU)

The model is trained with multi-task learning and predicts two labels for each of the three categories above:

Illicit use detection (True/False/Unknown)
Temporal classification (Current/Historical/Unknown/N/A)
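
The two label sets can be written down as a small schema. The sketch below is illustrative only: the class and function names are hypothetical, and the consistency rule (N/A when illicit use is False, a real status when it is True) is an assumption inferred from the prompt's "for illicit cases only" wording, not a documented guarantee of the model.

```python
from enum import Enum


class IllicitUse(Enum):
    TRUE = "True"
    FALSE = "False"
    UNKNOWN = "Unknown"


class TemporalStatus(Enum):
    CURRENT = "Current"
    HISTORICAL = "Historical"
    UNKNOWN = "Unknown"
    NOT_APPLICABLE = "N/A"


def is_consistent(illicit: IllicitUse, temporal: TemporalStatus) -> bool:
    """Assumed rule: N/A pairs with False, a real status pairs with True."""
    if illicit is IllicitUse.FALSE:
        return temporal is TemporalStatus.NOT_APPLICABLE
    if illicit is IllicitUse.TRUE:
        return temporal is not TemporalStatus.NOT_APPLICABLE
    return True  # illicit use is Unknown: no constraint is assumed here


# Labels are looked up by the exact strings the model emits.
print(is_consistent(IllicitUse("True"), TemporalStatus("Current")))
print(is_consistent(IllicitUse("False"), TemporalStatus("Historical")))
```

A check like this may help flag malformed or contradictory generations before they reach downstream analysis.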

Model Architecture

Base Model: Meta Llama 3.1 8B
Fine-tuning: LoRA (Low-Rank Adaptation)
- Rank: 16
- Alpha: 32
- Target modules: q_proj, v_proj, k_proj, o_proj
Training Method: Multi-task instruction following
Context Length: 2048 tokens
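
To give a sense of scale for this adapter, the sketch below estimates how many parameters the LoRA settings above would train. The layer dimensions (hidden size 4096, 32 layers, 8 key/value heads of size 128) are assumptions taken from the public Llama 3.1 8B configuration, not from this card, and the alpha / rank scaling is the standard LoRA convention rather than a confirmed detail of this training run.

```python
# Assumed dimensions of Llama 3.1 8B (not stated in this model card).
HIDDEN = 4096      # model hidden size
KV_DIM = 1024      # 8 key/value heads x 128 head dim (grouped-query attention)
NUM_LAYERS = 32

# LoRA settings listed above.
RANK = 16
ALPHA = 32

# (in_features, out_features) of each targeted projection.
TARGETS = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
}


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    # LoRA adds A (rank x in_features) and B (out_features x rank).
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, RANK) for i, o in TARGETS.values())
total = per_layer * NUM_LAYERS
scaling = ALPHA / RANK  # factor applied to the low-rank update

print(f"trainable LoRA parameters: {total:,}")
print(f"update scaling (alpha / rank): {scaling}")
```

Under these assumptions the adapter trains roughly 13.6 million parameters, a small fraction of the 8B base weights.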

Performance

Evaluated on 159 clinical note sentences:

Drug	F1 Score	Precision	Recall
Methamphetamine	0.931	0.915	0.947
Fentanyl	0.950	0.927	0.974
Injection Drug Use	0.923	0.913	0.933
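
As a sanity check, F1 is the harmonic mean of precision and recall, so the F1 column can be recomputed from the other two. The snippet below is a minimal sketch using only the numbers in the table; the function name is illustrative.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the table above.
reported = {
    "Methamphetamine": (0.915, 0.947, 0.931),
    "Fentanyl": (0.927, 0.974, 0.950),
    "Injection Drug Use": (0.913, 0.933, 0.923),
}

for drug, (precision, recall, f1) in reported.items():
    computed = f1_score(precision, recall)
    print(f"{drug}: computed F1 = {computed:.3f}, reported F1 = {f1:.3f}")
```

All three recomputed values agree with the reported F1 scores to three decimal places.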

Comparison with Base Model

Drug	Base F1	Fine-tuned F1	Improvement
Methamphetamine	0.787	0.931	+14.4pp
Fentanyl	0.753	0.950	+19.7pp

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load base model
base_model = "fabriceyhc/Llama-DrugDetector-8B"
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(model, "fabriceyhc/Llama-DrugDetector-8B-MFI")

# Load tokenizer
tokenizer = AutoTokenizer.from_pretrained("fabriceyhc/Llama-DrugDetector-8B-MFI")

# Create prompt
note_text = "Patient reports using meth daily for the past 2 weeks."

prompt = f"""### Task Description:
Please carefully review the following medical note and identify illicit drug use.

**CRITICAL RULES:**
1. **Positive drug test → ALWAYS ILLICIT** (unless in medication list)
2. **PMH/History of use → ILLICIT** (even if historical)
3. **Substance use disorder → ILLICIT**
4. **Patient self-reports (endorses, reports, admits) → ILLICIT**
5. **Prescribed/medical use → NOT ILLICIT**:
   - Medication lists with dosages
   - "Given", "administered" in medical context
   - Pain control, procedural use (fentanyl)
   - Prescription quantities

**Drugs to identify:**
- **Methamphetamine**: Illicit amphetamine use (not prescribed Adderall for ADHD)
- **Fentanyl**: Illicit fentanyl use (not prescribed patches/procedural use)
- **Injection Drug Use**: IV drug use (IVDU, IVDA), including IV heroin, cocaine, meth

**Temporal Classification** (for illicit cases only):
- **Current**: Present tense, recent use, positive test results, "POA" (present on arrival)
- **Historical**: Past tense, "history of", "former user", "in remission"
- **Unknown**: Timeframe unclear or ambiguous

### Desired Format:

Methamphetamine Illicit Use: <True/False/Unknown>
Fentanyl Illicit Use: <True/False/Unknown>
Injection Drug Use: <True/False/Unknown>
Methamphetamine Temporal Status: <Current/Historical/Unknown/N/A>
Fentanyl Temporal Status: <Current/Historical/Unknown/N/A>
Injection Drug Use Temporal Status: <Current/Historical/Unknown/N/A>

### The medical note to evaluate:
{note_text}

### Answer:
"""

# Generate
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=200,
    do_sample=False,
    pad_token_id=tokenizer.eos_token_id
)

# Decode
result = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(result)

Expected Output

Methamphetamine Illicit Use: True
Fentanyl Illicit Use: False
Injection Drug Use: False
Methamphetamine Temporal Status: Current
Fentanyl Temporal Status: N/A
Injection Drug Use Temporal Status: N/A
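
Because the answer follows a fixed "Field: value" layout, it can be turned into a dictionary for downstream use. The parser below is a minimal sketch, not part of the released code: `parse_prediction` and `EXPECTED_FIELDS` are hypothetical names, and the choice to keep only the first occurrence of each field is an assumption about how to handle generations that run past the six expected lines.

```python
import re

# Field names copied from the "Desired Format" section of the prompt.
EXPECTED_FIELDS = [
    "Methamphetamine Illicit Use",
    "Fentanyl Illicit Use",
    "Injection Drug Use",
    "Methamphetamine Temporal Status",
    "Fentanyl Temporal Status",
    "Injection Drug Use Temporal Status",
]

# Everything before the first colon is the field name, the rest is the value.
LINE_PATTERN = re.compile(r"^\s*([^:]+?)\s*:\s*(.+?)\s*$")


def parse_prediction(text: str) -> dict:
    """Map each expected field to its value, or None if the field is missing."""
    parsed = {field: None for field in EXPECTED_FIELDS}
    for line in text.splitlines():
        match = LINE_PATTERN.match(line)
        if not match:
            continue
        field, value = match.group(1), match.group(2)
        # Keep the first occurrence only, in case the model keeps generating.
        if field in parsed and parsed[field] is None:
            parsed[field] = value
    return parsed


example = """Methamphetamine Illicit Use: True
Fentanyl Illicit Use: False
Injection Drug Use: False
Methamphetamine Temporal Status: Current
Fentanyl Temporal Status: N/A
Injection Drug Use Temporal Status: N/A"""

for field, value in parse_prediction(example).items():
    print(f"{field!r}: {value!r}")
```

Fields that come back as None indicate an incomplete or malformed generation and are reasonable candidates for manual review.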

Training Data

Training Set: 113 annotated clinical note examples
Validation Set: 23 examples
Test Set: 23 examples (separate from evaluation above)

Data includes:

Discharge summaries
H&P (History & Physical) notes
Progress notes
Emergency department notes

Annotations cover diverse clinical scenarios:

Positive drug tests (urine, blood)
Patient self-reports
Substance use disorder diagnoses
Prescribed medications (negative examples)
Medical procedures using fentanyl (negative examples)
Historical vs current use

Training Details

Epochs: 5
Best Checkpoint: Epoch 3.98
Best Validation Loss: 0.740
Batch Size: 2 per device × 2 GPUs
Gradient Accumulation: 2 steps
Learning Rate: 2e-4
Optimizer: AdamW with learning-rate warmup
Gradient Checkpointing: Enabled
Mixed Precision: bfloat16
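
The batch settings above combine into an effective batch size, which in turn gives a rough count of optimizer steps. The arithmetic below is a sketch: it uses the 113-example training set size from this card and assumes the last partial batch of each epoch still triggers an optimizer step, which depends on the trainer configuration.

```python
import math

TRAIN_EXAMPLES = 113     # training set size reported in this card
PER_DEVICE_BATCH = 2
NUM_GPUS = 2
GRAD_ACCUM_STEPS = 2
EPOCHS = 5

# Examples contributing to each optimizer update.
effective_batch = PER_DEVICE_BATCH * NUM_GPUS * GRAD_ACCUM_STEPS

# Assumption: a final partial batch still counts as one step.
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / effective_batch)
total_steps = steps_per_epoch * EPOCHS

print(f"effective batch size: {effective_batch}")
print(f"approx. optimizer steps per epoch: {steps_per_epoch}")
print(f"approx. total optimizer steps: {total_steps}")
```

With so few optimizer steps in total, checkpoint selection by validation loss (as reported above) matters more than it would on a large corpus.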

Limitations

Recall Trade-off: Recall is 94.7% (methamphetamine) and 97.4% (fentanyl), so the fine-tuned model misses some illicit cases that the base model, with its perfect recall, would catch
Domain-specific: Trained on specific clinical note formats and may not generalize to all medical documentation styles
Drug Coverage: Limited to methamphetamine, fentanyl, and injection drug use. Does not detect other substances (cocaine, opioids, cannabis, etc.)
Context Window: Notes that exceed the 2048-token context length, prompt included, may be truncated
Unknown Predictions: May output "Unknown" for ambiguous cases (e.g., drug test ordered but results not available)
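
One possible workaround for the context window limitation is to split a long note into overlapping token windows and run the model on each. The helper below is a hypothetical sketch that operates on a plain list of token IDs, however they were produced; the window size and overlap are illustrative values, and the real budget for the note is smaller than 2048 because the prompt template also consumes tokens.

```python
def split_into_windows(token_ids, max_tokens, overlap=0):
    """Split token_ids into windows of at most max_tokens, sharing `overlap` tokens."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")
    if not token_ids:
        return []

    step = max_tokens - overlap
    windows = []
    start = 0
    while True:
        windows.append(token_ids[start:start + max_tokens])
        if start + max_tokens >= len(token_ids):
            break  # this window reached the end of the note
        start += step
    return windows


# Toy example: 10 "tokens", windows of 4 with 1 token of overlap.
tokens = list(range(10))
for window in split_into_windows(tokens, max_tokens=4, overlap=1):
    print(window)
```

Per-window predictions would still need to be merged (for example, treating a category as True if any window says so), and that merging rule is a design decision this card does not specify.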

Intended Use

Primary Use Cases:

Retrospective chart review for substance use research
Clinical decision support (with human review)
Population health screening
Cohort identification for substance use studies

NOT Recommended For:

Fully automated clinical decision-making
Forensic or legal determination of substance use
Real-time clinical alerts without human oversight

Ethical Considerations

Bias: Model trained on data from specific healthcare systems; performance may vary across populations
Privacy: Ensure proper de-identification before processing clinical notes
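False Negatives: Based on the reported recall, the model misses 2.6% (fentanyl) to 5.3% (methamphetamine) of illicit cases, and 6.7% for injection drug use, which could impact clinical outcomes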
Stigma: Use predictions responsibly; substance use disorder is a medical condition requiring compassionate care

Citation

@misc{llama-drugdetector-mfi-2025,
  author = {Harel-Canada, Fabrice},
  title = {Llama-DrugDetector-8B-MFI: Multi-task Clinical Drug Detection},
  year = {2025},
  publisher = {HuggingFace},
  url = {https://huggingface.co/fabriceyhc/Llama-DrugDetector-8B-MFI}
}

License

Llama 3.1 Community License Agreement

Contact

For questions or issues, please open an issue on the model repository.
