Artifacts for paper "Controllable Safety Alignment: Inference-Time Adaptation to Diverse Safety Requirements" (https://arxiv.org/abs/2410.08968)
Jack Zhang
jackzhang
Recent Activity
updated a model about 7 hours ago
jackzhang/openlm_3b_201305_dolci_think_100k_pre2013_sft_full
published a model about 7 hours ago
jackzhang/openlm_3b_201305_dolci_think_100k_pre2013_sft_full
authored a paper 27 days ago
Jailbreak Distillation: Renewable Safety Benchmarking