Collections including paper arxiv:2307.06945

- A LoRA-Based Approach to Fine-Tuning LLMs for Educational Guidance in Resource-Constrained Settings
  Paper • 2504.15610 • Published • 1
- Train Small, Infer Large: Memory-Efficient LoRA Training for Large Language Models
  Paper • 2502.13533 • Published • 13
- LoRA-SP: Streamlined Partial Parameter Adaptation for Resource-Efficient Fine-Tuning of Large Language Models
  Paper • 2403.08822 • Published
- LoRA-Pro: Are Low-Rank Adapters Properly Optimized?
  Paper • 2407.18242 • Published

- Efficient Memory Management for Large Language Model Serving with PagedAttention
  Paper • 2309.06180 • Published • 27
- LM-Infinite: Simple On-the-Fly Length Generalization for Large Language Models
  Paper • 2308.16137 • Published • 40
- Scaling Transformer to 1M tokens and beyond with RMT
  Paper • 2304.11062 • Published • 3
- DeepSpeed Ulysses: System Optimizations for Enabling Training of Extreme Long Sequence Transformer Models
  Paper • 2309.14509 • Published • 19

- Sparse Autoencoders Find Highly Interpretable Features in Language Models
  Paper • 2309.08600 • Published • 14
- In-context Autoencoder for Context Compression in a Large Language Model
  Paper • 2307.06945 • Published • 28
- Self-slimmed Vision Transformer
  Paper • 2111.12624 • Published • 1
- MEMORY-VQ: Compression for Tractable Internet-Scale Memory
  Paper • 2308.14903 • Published • 1

- LoftQ: LoRA-Fine-Tuning-Aware Quantization for Large Language Models
  Paper • 2310.08659 • Published • 27
- QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
  Paper • 2309.14717 • Published • 45
- ModuLoRA: Finetuning 3-Bit LLMs on Consumer GPUs by Integrating with Modular Quantizers
  Paper • 2309.16119 • Published • 1
- LoRA ensembles for large language model fine-tuning
  Paper • 2310.00035 • Published • 2

- In-Context Learning Creates Task Vectors
  Paper • 2310.15916 • Published • 43
- When can transformers reason with abstract symbols?
  Paper • 2310.09753 • Published • 3
- Improving Length-Generalization in Transformers via Task Hinting
  Paper • 2310.00726 • Published • 1
- In-context Autoencoder for Context Compression in a Large Language Model
  Paper • 2307.06945 • Published • 28