AI & ML interests
None yet
Organizations
None yet
Models 14
yungshun317/llava1.5-7b-rlaif-v-dpo
Updated
yungshun317/qwen2.5-0.5B-prm-mathshepherd
Token Classification
• 0.5B • Updated
• 1
yungshun317/sft-qwen2.5-7b-qlora
Text Generation
• Updated
yungshun317/qwen2.5-32b-deberta-ultrafeedback-grpo-lora-ds
Updated
yungshun317/qwen2.5-7b-deberta-ultrafeedback-grpo-lora-ds-composite-reward
Updated
yungshun317/deberta-v3-large-format-guard-preference-distillation
0.4B • Updated
yungshun317/deberta-v3-large-preference-distillation
0.4B • Updated
yungshun317/deberta-v3-large-format-guard
0.4B • Updated
• 1
yungshun317/qwen2.5-7b-deberta-ultrafeedback-grpo-lora-ds
Updated
yungshun317/qwen2-0.5B-deberta-ultrafeedback-grpo
Text Generation
• 0.5B • Updated
• 1