Qwen Image Edit ModelOpt FP8 SGLang Transformer
This repository contains an SGLang-ready ModelOpt FP8 transformer override for Qwen/Qwen-Image-Edit.
It only replaces the transformer weights; tokenizer, image encoder, scheduler, VAE, and other non-transformer components are loaded from the original base model.
The checkpoint is intended for SGLang Diffusion with the Qwen Image Edit FP8 support from sgl-project/sglang#23155.
Usage
Use your own input image, or download the validation input image from this repository:
huggingface-cli download BBuf/Qwen-Image-Edit-ModelOpt-FP8-SGLang \
  validation/assets/qwen_image_edit_input.png \
  --local-dir /tmp/qwen-image-edit-fp8

sglang generate \
  --backend=sglang \
  --model-id=Qwen-Image-Edit \
  --model-path Qwen/Qwen-Image-Edit \
  --transformer-path BBuf/Qwen-Image-Edit-ModelOpt-FP8-SGLang \
  --prompt "A clean product photo of a small ceramic teapot on a wooden table, soft daylight, sharp details." \
  --image-path /tmp/qwen-image-edit-fp8/validation/assets/qwen_image_edit_input.png \
  --width=512 \
  --height=512 \
  --num-inference-steps=8 \
  --guidance-scale=4.0 \
  --seed=42 \
  --num-gpus=1 \
  --dit-cpu-offload false \
  --dit-layerwise-offload false \
  --warmup \
  --save-output
H100 Validation Snapshot
Validation was run on a single H100 GPU (rank 0) with --backend=sglang. The FP8 image below comes from the fixed checkpoint, which keeps the validated set of numerically sensitive Qwen Image fallback tensors in BF16.
Artifacts:
- Validation tree: validation/
- Input image: validation/assets/qwen_image_edit_input.png
- BF16 command: validation/commands/bf16_qwen_image_edit_512_8_benchmark.sh
- FP8 command: validation/commands/fp8_fixed_qwen_image_edit_512_8_benchmark.sh
- Benchmark comparison: qwen_image_edit_bf16_vs_fp8_fixed_512_8_compare.md
Benchmark, warmup excluded:
| Metric | BF16 | FP8 fixed | Delta | Speedup |
|---|---|---|---|---|
| E2E latency | 6.792 s | 6.085 s | -0.707 s (-10.4%) | 1.12x |
| Denoising stage | 5.204 s | 4.524 s | -0.680 s (-13.1%) | 1.15x |
| Decoding stage | 154.77 ms | 121.06 ms | -33.72 ms (-21.8%) | 1.28x |
| Image encoding | 1.316 s | 1.328 s | +0.011 s (+0.9%) | 0.99x |
| Image VAE encoding | 100.62 ms | 94.93 ms | -5.69 ms (-5.7%) | 1.06x |
Notes:
- Validation prompt: A clean product photo of a small ceramic teapot on a wooden table, soft daylight, sharp details.
- Validation settings: 512x512, 8 inference steps, guidance_scale=4.0, seed=42, --dit-cpu-offload false, --dit-layerwise-offload false, --warmup.
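The Delta and Speedup columns of the benchmark table can be recomputed from the BF16 and FP8 columns. The following is a minimal sketch: the `timings` dictionary and the `compare` helper are illustrative names, not part of the validation scripts, and the millisecond rows are converted to seconds. Because the table was presumably derived from unrounded timings, the last digit of a recomputed delta may differ slightly from the published value.

```python
# Latencies in seconds as (BF16, FP8 fixed), copied from the benchmark table.
timings = {
    "E2E latency": (6.792, 6.085),
    "Denoising stage": (5.204, 4.524),
    "Decoding stage": (0.15477, 0.12106),
    "Image encoding": (1.316, 1.328),
    "Image VAE encoding": (0.10062, 0.09493),
}


def compare(bf16_s, fp8_s):
    """Return (delta in seconds, delta in percent, speedup) for one stage."""
    delta = fp8_s - bf16_s            # negative means FP8 is faster
    percent = 100.0 * delta / bf16_s  # relative to the BF16 baseline
    speedup = bf16_s / fp8_s          # above 1.0 means FP8 is faster
    return delta, percent, speedup


for stage, (bf16_s, fp8_s) in timings.items():
    delta, percent, speedup = compare(bf16_s, fp8_s)
    print(f"{stage:<20} {delta:+.3f} s ({percent:+.1f}%)  {speedup:.2f}x")
```

Speedup is defined here as BF16 time divided by FP8 time, so the image encoding row, where FP8 is marginally slower, comes out just below 1.0.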
Conversion Notes
The checkpoint was converted from an NVIDIA ModelOpt FP8 export with SGLang's build_modelopt_fp8_transformer tool.
Most linear weights are FP8. The validated fallback set keeps numerically sensitive tensors in BF16, including the Qwen Image image-MLP output projection family needed for normal image quality.
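As a rough illustration of this mixed-precision layout, the sketch below assigns a storage dtype to each tensor by matching its name against a fallback pattern list. The patterns and tensor names are hypothetical placeholders chosen for the example. They are not the validated fallback set, and the code does not reproduce the logic of build_modelopt_fp8_transformer.

```python
from fnmatch import fnmatch

# Hypothetical patterns for illustration only; the validated fallback set
# is defined by the conversion tool, not by this list.
BF16_FALLBACK_PATTERNS = ["*.img_mlp.out_proj.weight"]


def storage_dtype(name, patterns=BF16_FALLBACK_PATTERNS):
    """Return "bf16" if the tensor name matches a fallback pattern, else "fp8"."""
    return "bf16" if any(fnmatch(name, p) for p in patterns) else "fp8"


# Hypothetical tensor names standing in for real checkpoint keys.
names = [
    "blocks.0.attn.to_q.weight",
    "blocks.0.img_mlp.out_proj.weight",
]
plan = {name: storage_dtype(name) for name in names}

for name, dtype in plan.items():
    print(f"{name}: {dtype}")
```

Keeping the decision in a name-based lookup like this makes it easy to see which tensors would be left unquantized if a pattern were added or removed.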

