nics-efc/CoLLM_Qwen3_0_6B
0.8B • Updated • 14
SALAD: Achieve High-Sparsity Attention via Efficient Linear Attention Tuning for Video Diffusion Transformer
Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models