A quantization setup used for GLM-4.7:

  • Weights: NVFP4
  • KV cache: BF16 (left unquantized)
  • Tooling: NVIDIA/Model-Optimizer
  • Deployment: TensorRT-LLM
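
To make the recipe concrete, here is a minimal pure-Python sketch of NVFP4-style block quantization, assuming the commonly described layout: weights stored as 4-bit E2M1 floats with one shared scale per small block (in real NVFP4 the block size is 16 and the scale is stored in FP8 E4M3; here the scale is kept as a plain float for illustration). The helper names are hypothetical, not part of Model-Optimizer's API.

```python
# Positive magnitudes representable by a 4-bit E2M1 float
# (the sign bit covers negatives; zero is included).
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_block_nvfp4(block):
    """Quantize one block of floats: choose a scale so the block's max
    magnitude maps to 6.0 (the largest E2M1 value), then round each
    element to the nearest representable E2M1 magnitude."""
    amax = max(abs(x) for x in block) or 1.0   # avoid divide-by-zero on all-zero blocks
    scale = amax / 6.0                         # stored as FP8 E4M3 in actual NVFP4

    def to_e2m1(x):
        mag = min(E2M1_GRID, key=lambda g: abs(abs(x) / scale - g))
        return mag if x >= 0 else -mag

    return scale, [to_e2m1(x) for x in block]

def dequantize_block(scale, codes):
    """Recover approximate values by rescaling the 4-bit codes."""
    return [scale * c for c in codes]
```

Values that already sit on the scaled E2M1 grid round-trip exactly; everything else lands on the nearest grid point, which is where the quantization error comes from. The KV cache in this setup skips this step entirely and stays in BF16.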
Downloads last month: 63

Safetensors
  • Model size: 177B params
  • Tensor types: BF16, F32, F8_E4M3, U8
Model tree for soundsgoodai/GLM-4.7-NVFP4-KV-cache-BF16

  • Base model: zai-org/GLM-4.7
  • This model is one of 42 quantized variants of the base model.