Note that the MTP layers of this model are also PTPC-quantized.
Model Overview
- Model Architecture: DeepSeek-V3.2-Speciale
- Input: Text
- Output: Text
- Supported Hardware Microarchitecture: AMD MI350/MI355
- ROCm: 7.0
- Operating System(s): Linux
- Inference Engine: SGLang/vLLM
- Model Optimizer: AMD-Quark (V0.10)
- Weight quantization: Per-channel, FP8E4M3, Static
- Activation quantization: Per-token, FP8E4M3, Dynamic
- Calibration Dataset: Pile
This model was built from deepseek-ai/DeepSeek-V3.2-Speciale by applying AMD-Quark for FP8E4M3 PTPC (per-token activation, per-channel weight) quantization.
Model Quantization
The model was quantized from deepseek-ai/DeepSeek-V3.2-Speciale using AMD-Quark. Both weights and activations are quantized to FP8 (E4M3): weights per channel with static scales, activations per token with dynamic scales.
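To make the PTPC scheme concrete, the sketch below computes one absmax scale per row of a small matrix. For weights, a row stands for an output channel and the scales would be computed once offline (static); for activations, a row stands for a token and the same computation would run at inference time (dynamic). This is an illustration only, not AMD-Quark's implementation: the helper name `per_row_scales`, the sample values, and the use of 448 as the largest finite E4M3 value (the OCP FP8 variant) are assumptions.

```shell
# Hypothetical helper: read a whitespace-separated matrix on stdin and print
# one symmetric scale per row, scale = max(|x|) / fp8_max.
# fp8_max=448 assumes the OCP FP8 E4M3 format.
per_row_scales() {
  awk -v fp8_max=448 '{
    amax = 0
    for (i = 1; i <= NF; i++) {
      v = ($i < 0) ? -$i : $i      # absolute value of the element
      if (v > amax) amax = v       # running maximum for this row
    }
    printf "%.6f\n", amax / fp8_max
  }'
}

# Two rows (two channels, or two tokens) with made-up values.
printf '0.5 -2.24 1.0\n4.48 0.1 -0.2\n' | per_row_scales
```

Dividing each row by its scale maps the largest magnitude in that row onto the edge of the FP8 range, which is why a row with one large outlier does not affect the precision of the other rows.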
Accuracy
| Benchmark | DeepSeek-V3.2-Speciale | DeepSeek-V3.2-Speciale-ptpc (this model) |
| --- | --- | --- |
| gsm8k | 96.00 | 95.75 |
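The gap in the table can be put in relative terms with a short calculation. The sketch below is only arithmetic on the two reported gsm8k scores; the helper name `accuracy_recovery` is hypothetical.

```shell
# Hypothetical helper: absolute drop and relative recovery between a
# baseline score and a quantized-model score (both in percent).
accuracy_recovery() {
  awk -v b="$1" -v q="$2" 'BEGIN {
    printf "drop: %.2f points, recovery: %.2f%%\n", b - q, 100 * q / b
  }'
}

accuracy_recovery 96.00 95.75
# prints: drop: 0.25 points, recovery: 99.74%
```

If both scores were measured on a 400-question subset, as the `--limit 400` flag in the Reproduction section suggests, a 0.25-point difference corresponds to a single question.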
Reproduction
vllm version: 0.11.2.dev521+gad32e3e19.rocm710
aiter version: 0.1.6.post2.dev55+g59bd8ff2c
lm_eval version: 0.4.9.2
export VLLM_USE_V1=1
export SAFETENSORS_FAST_GPU=1
export VLLM_ROCM_USE_AITER=1
export VLLM_ROCM_USE_AITER_MOE=1
model_path="/model_path/deepseek-ai/DeepSeek-V3.2-Speciale-ptpc"
vllm serve $model_path \
--tensor-parallel-size 8 \
--data-parallel-size 1 \
--max-num-batched-tokens 32768 \
--trust-remote-code \
--no-enable-prefix-caching \
--disable-log-requests \
--kv-cache-dtype bfloat16 \
--gpu-memory-utilization 0.85 \
--compilation-config '{"cudagraph_mode": "FULL_AND_PIECEWISE"}' \
--block-size 1
lm_eval \
--model local-completions \
--tasks gsm8k \
--model_args model=/model_path/deepseek-ai/DeepSeek-V3.2-Speciale-ptpc,base_url=http://127.0.0.1:8000/v1/completions \
--batch_size auto \
--limit 400
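Loading a model of this size can take a while, so it may help to wait until the server answers before starting `lm_eval`. The following is a minimal sketch, assuming the server listens on the same host and port as the `base_url` above and exposes the OpenAI-compatible `/v1/models` route; the helper name `wait_for_server`, the retry count, and the delay are hypothetical choices, not part of vLLM or lm_eval.

```shell
# Hypothetical helper: poll the server until its model listing responds,
# or give up after a fixed number of attempts.
wait_for_server() {
  base_url="${1:-http://127.0.0.1:8000}"  # assumed host/port, as in base_url above
  retries="${2:-60}"                      # assumed retry budget
  delay="${3:-10}"                        # assumed seconds between attempts
  attempt=1
  while [ "$attempt" -le "$retries" ]; do
    # -f makes curl return non-zero on HTTP errors; --max-time bounds each try.
    if curl -sf --max-time 5 "${base_url}/v1/models" > /dev/null 2>&1; then
      echo "server ready after ${attempt} attempt(s)"
      return 0
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  echo "server not ready after ${retries} attempt(s)" >&2
  return 1
}
```

A typical use would be `wait_for_server http://127.0.0.1:8000 && lm_eval ...`, so the evaluation only starts once the endpoint is reachable.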
Deployment
This model can be deployed efficiently using the vLLM backend.
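Once a server started with `vllm serve` is running, it can be queried over its OpenAI-compatible HTTP API. The sketch below assumes the default address `http://127.0.0.1:8000` and that the served model name equals the path passed to `vllm serve`, as in the `lm_eval` call in the Reproduction section; the helper names, the prompt, and the `max_tokens` value are hypothetical.

```shell
# Hypothetical helper: build a JSON body for /v1/completions.
# Note: this simple printf does not escape quotes or backslashes in the prompt.
build_payload() {
  model="$1"
  prompt="$2"
  max_tokens="${3:-64}"   # assumed default generation length
  printf '{"model": "%s", "prompt": "%s", "max_tokens": %s, "temperature": 0}' \
    "$model" "$prompt" "$max_tokens"
}

# Hypothetical helper: POST one completion request to the server.
query_model() {
  payload="$1"
  base_url="${2:-http://127.0.0.1:8000}"  # assumed server address
  curl -s "${base_url}/v1/completions" \
    -H "Content-Type: application/json" \
    -d "$payload"
}
```

For example, `query_model "$(build_payload "$model_path" "What is 12 times 7?" 32)"` would send a single prompt and print the raw JSON response.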
License
Modifications Copyright(c) 2025 Advanced Micro Devices, Inc. All rights reserved.
Model tree for haoyang-amd/ts
- Base model: deepseek-ai/DeepSeek-V3.2-Exp-Base
- Finetuned: deepseek-ai/DeepSeek-V3.2-Speciale