Update README.md

README.md (changed)
@@ -34,12 +34,6 @@ Our Phi-4-Mini-Judge model achieves strong performance across all three evaluati
 | **Hallucination Detection** | 35 | 29 | **82.86%** |
 | **Relevance Evaluation** | 35 | 25 | **71.43%** |
 
-### Common Failure Patterns
-The model's most frequent errors include:
-- Relevance evaluation: 9 cases of marking "unrelated" content as "relevant"
-- Hallucination detection: 5 cases of marking "accurate" content as "hallucination"
-- Toxicity assessment: 3 cases of marking "toxic" content as "non-toxic"
-
 ## Model Usage
 
 For best results, we recommend using the following system prompt and output format:
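The accuracy figures in the table above follow directly from the correct/total counts; a quick sanity check (plain Python, not part of the repository):

```python
# Recompute the reported accuracies from the table: correct / total.
results = {
    "Hallucination Detection": (29, 35),
    "Relevance Evaluation": (25, 35),
}

for task, (correct, total) in results.items():
    accuracy = round(100 * correct / total, 2)
    print(f"{task}: {accuracy}%")  # 82.86% and 71.43%
```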
@@ -171,10 +165,9 @@ The model uses a structured output format with `<rating>` tags containing one of
 ## Intended Uses & Limitations
 
 ### Intended Uses
-
+- SLM as a Judge
 - Automated evaluation of AI-generated responses
 - Quality assurance for conversational AI systems
-- Research in AI safety and alignment
 - Integration into larger AI safety pipelines
 
 ### Limitations
@@ -184,31 +177,6 @@ The model uses a structured output format with `<rating>` tags containing one of
 - Should be used as part of a broader safety strategy, not as sole arbiter
 - Best performance on English text (training data limitation)
 
-## Training Data
-
-This model was trained on a comprehensive dataset combining:
-- **HaluEval dataset** for hallucination detection
-- **Toxicity classification datasets** for harmful content detection
-- **Relevance evaluation datasets** for query-response alignment
-
-The training approach ensures balanced performance across all three safety dimensions while maintaining consistency in output format and reasoning quality.
-
-## Training Procedure
-
-### Training Hyperparameters
-
-The following hyperparameters were used during training:
-- learning_rate: 5e-05
-- train_batch_size: 2
-- eval_batch_size: 8
-- seed: 42
-- gradient_accumulation_steps: 2
-- total_train_batch_size: 4
-- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
-- lr_scheduler_type: cosine
-- lr_scheduler_warmup_steps: 20
-- training_steps: 300 (100 per task)
-
 ### Framework Versions
 
 - PEFT 0.12.0
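The hyperparameter list removed in this hunk is internally consistent: the total train batch size is the per-device batch size times the gradient accumulation steps, and the 300 training steps split evenly across the three judging tasks. A plain-Python check (variable names chosen to mirror the list):

```python
# Effective batch size implied by the removed hyperparameter list.
train_batch_size = 2             # per-device batch size
gradient_accumulation_steps = 2
total_train_batch_size = train_batch_size * gradient_accumulation_steps
print(total_train_batch_size)    # 4, matching the listed value

# 300 optimizer steps split evenly across the three judging tasks.
training_steps, tasks = 300, 3
print(training_steps // tasks)   # 100, matching "100 per task"
```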