NSNQuant: A Double Normalization Approach for Calibration-Free Low-Bit Vector Quantization of KV Cache • Paper 2505.18231 • Published May 23, 2025
Gaussian Weight Sampling for Scalable, Efficient and Stable Pseudo-Quantization Training • Paper 2505.11170 • Published May 16, 2025