GaLore: Advancing Large Model Training on Consumer-grade Hardware • Titus-von-Koeller, jiaweizhao, mdouglas, hiyouga, ybelkada, muellerzr, amyeroberts, smangrul, BenjaminB • Mar 20, 2024 • 32
Mixture of Experts Explained • osanseviero, lewtun, philschmid, smangrul, ybelkada, pcuenq • Dec 11, 2023 • 1.13k
Personal Copilot: Train Your Own Coding Assistant • smangrul, sayakpaul • Oct 27, 2023 • 79
Fine-tuning Llama 2 70B using PyTorch FSDP • smangrul, sgugger, lewtun, philschmid • Sep 13, 2023 • 32
The Falcon has landed in the Hugging Face ecosystem • lvwerra, ybelkada, smangrul, lewtun, olivierdehaene, pcuenq, philschmid, osanseviero • Jun 5, 2023 • 17
Making LLMs even more accessible with bitsandbytes, 4-bit quantization and QLoRA • ybelkada, timdettmers, artidoro, sgugger, smangrul • May 24, 2023 • 180
Fine-tuning 20B LLMs with RLHF on a 24GB consumer GPU • edbeeching, ybelkada, lvwerra, smangrul, lewtun, kashif • Mar 9, 2023 • 72
Parameter-Efficient Fine-Tuning using 🤗 PEFT • smangrul, sayakpaul • Feb 10, 2023 • 119
Accelerate Large Model Training using DeepSpeed • smangrul, sgugger • Jun 28, 2022 • 7
Accelerate Large Model Training using PyTorch Fully Sharded Data Parallel • smangrul, sgugger • May 2, 2022 • 9