khajaphysist's picture
Update README.md
e96bfd4 verified
metadata
library_name: transformers
base_model:
  - Qwen/Qwen3-30B-A3B

FP8-Dynamic quant to support Ampere cards.

Use following vllm command to run on 2x 3090

vllm serve khajaphysist/Qwen3-30B-A3B-FP8-Dynamic --enable-reasoning --reasoning-parser deepseek_r1 \
     -tp 2 --gpu-memory-utilization 0.99 --disable-log-requests --enforce-eager --max-num-seqs 15