NPU - QNN
A collection of leading models optimized for NPU deployment on Qualcomm Snapdragon (7 items).
qwen2.5-1.5b-instruct-onnx-qnn is an int4-quantized ONNX QNN build of Qwen2.5-1.5B-Instruct. It provides very fast inference and is optimized for AI PCs with a Qualcomm NPU.
It belongs to the latest release series from Qwen.