| library_name: transformers | |
| base_model: | |
| - Qwen/Qwen3-30B-A3B | |
| FP8-Dynamic quant to support Ampere cards. | |
| Use following vllm command to run on 2x 3090 | |
| ```bash | |
| vllm serve khajaphysist/Qwen3-30B-A3B-FP8-Dynamic --enable-reasoning --reasoning-parser deepseek_r1 \ | |
| -tp 2 --gpu-memory-utilization 0.99 --disable-log-requests --enforce-eager --max-num-seqs 15 | |
| ``` | |