h4c5's Collections
moderation-prompts
updated
mmathys/openai-moderation-api-evaluation
Viewer • Updated • 1.68k • 297 • 35
Viewer • Updated • 169k • 26k • 1.5k
WildGuard: Open One-Stop Moderation Tools for Safety Risks, Jailbreaks, and Refusals of LLMs
Paper • 2406.18495 • Published • 13
ShieldGemma: Generative AI Content Moderation Based on Gemma
Paper • 2407.21772 • Published • 14
Viewer • Updated • 1M • 6.24k • 768
PKU-Alignment/BeaverTails
Viewer • Updated • 364k • 9.87k • 76
AgentPublic/camembert-base-toxic-fr-user-prompts
Text Classification • 0.1B • Updated • 247 • 7
Viewer • Updated • 30.4k • 762 • 26
meta-llama/Llama-Guard-3-8B
Text Generation • 8B • Updated • 57.2k • 248
davanstrien/aart-ai-safety-dataset
Viewer • Updated • 3.27k • 44 • 2
Viewer • Updated • 520 • 6.09k • 50