bartowski/allura-forge_Llama-3.3-8B-Instruct-GGUF Text Generation • 8B • Updated 10 days ago • 8.84k • 22
Cerebras REAP Collection Sparse MoE models compressed using the REAP (Router-weighted Expert Activation Pruning) method • 19 items • Updated 21 days ago • 79
VTP Collection Towards Scalable Pre-training of Visual Tokenizers for Generation • 4 items • Updated 24 days ago • 39
Teacher Logits Collection Logits captured from large models to act as the teacher for distillation • 3 items • Updated 25 days ago • 7