Qwen3-VL-2B GGUF
This is a GGUF conversion of Qwen/Qwen3-VL-2B-Instruct, a vision-language model, packaged for on-device inference with llama.cpp.
Model Details
| Property | Value |
|---|---|
| Original Model | Qwen3-VL-2B-Instruct |
| Parameters | 2 billion |
| Quantization | Q8_0 |
| Model Size | ~1.7 GB |
| Vision Encoder Size | ~424 MB (Q8_0) |
| Context Window | 8,192 tokens |
| Architecture | Qwen3-VL with native vision encoder |
Files
- Qwen3VL-2B-Instruct-Q8_0.gguf: Main language model
- mmproj-Qwen3VL-2B-Instruct-Q8_0.gguf: Vision encoder (mmproj)
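Both files are needed for image input. As a minimal sketch of fetching them, the snippet below builds direct-download URLs and pulls each file with curl. It assumes the repository id `jc-builds/qwen3vl-2b-gguf`, the `main` branch, and the usual Hugging Face `resolve` URL layout; the function names `file_url` and `download_model` are illustrative only.

```shell
# Assumptions: repository id, "main" branch, and the resolve/ URL layout.
# Verify all three on the model page before relying on this.
REPO_ID="jc-builds/qwen3vl-2b-gguf"
MODEL_FILE="Qwen3VL-2B-Instruct-Q8_0.gguf"
MMPROJ_FILE="mmproj-Qwen3VL-2B-Instruct-Q8_0.gguf"

# Print the direct-download URL for one file in the repository.
file_url() {
  printf 'https://huggingface.co/%s/resolve/main/%s\n' "$REPO_ID" "$1"
}

# Download the language model and the vision encoder into a directory
# (defaults to the current directory). Stops at the first failed download.
download_model() {
  local dest="${1:-.}"
  local f
  mkdir -p "$dest" || return 1
  for f in "$MODEL_FILE" "$MMPROJ_FILE"; do
    # --fail turns HTTP errors into a non-zero exit; -L follows redirects.
    curl --fail -L -o "$dest/$f" "$(file_url "$f")" || return 1
  done
}
```

Calling `download_model ./models` would place both files under `./models`; expect roughly 2.1 GB in total based on the sizes listed above.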
Intended Use
This model is optimized for:
- Mobile/Edge Deployment: Runs on iOS devices with 8 GB or more of RAM
- llama.cpp Integration: Compatible with llama.cpp vision features
- On-Device AI: Private, offline image understanding
Capabilities
- Image Captioning: Describe images in detail
- Visual Q&A: Answer questions about images
- Document OCR: Extract text from documents and photos
- Scene Understanding: Analyze complex visual scenes
- Output Quality: Competitive among 2B-parameter VLMs
Usage with llama.cpp
./llama-mtmd-cli -m Qwen3VL-2B-Instruct-Q8_0.gguf \
--mmproj mmproj-Qwen3VL-2B-Instruct-Q8_0.gguf \
--image your_image.jpg \
-p "Describe this image in detail"
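For repeated use, the invocation can be wrapped in a small shell function that checks the input files before starting the model. This is only a sketch: the function name `describe_image` and the `LLAMA_BIN`, `MODEL`, and `MMPROJ` variables are hypothetical, and the default binary name assumes a recent llama.cpp build that ships its multimodal CLI as `llama-mtmd-cli`. Set `LLAMA_BIN` to whatever your build provides.

```shell
# Assumption: the multimodal CLI from your llama.cpp build is on PATH.
# Override LLAMA_BIN if your build uses a different binary name or path.
LLAMA_BIN="${LLAMA_BIN:-llama-mtmd-cli}"

# describe_image IMAGE [PROMPT]
# Runs the model on one image, failing early if any input file is missing.
describe_image() {
  local image="$1"
  local prompt="${2:-Describe this image in detail}"
  local model="${MODEL:-Qwen3VL-2B-Instruct-Q8_0.gguf}"
  local mmproj="${MMPROJ:-mmproj-Qwen3VL-2B-Instruct-Q8_0.gguf}"
  local f
  for f in "$model" "$mmproj" "$image"; do
    if [ ! -f "$f" ]; then
      echo "missing file: $f" >&2
      return 1
    fi
  done
  # Same flags as the command above: model, vision encoder, image, prompt.
  "$LLAMA_BIN" -m "$model" --mmproj "$mmproj" --image "$image" -p "$prompt"
}
```

A call such as `describe_image receipt.jpg "Extract all text from this document"` would then cover the OCR use case without retyping the model paths.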
Prompt Format
<|im_start|>system
You are a helpful assistant.<|im_end|>
<|im_start|>user
<|vision_start|><|vision_end|>
{prompt}<|im_end|>
<|im_start|>assistant
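If you need to assemble this prompt yourself, for example when sending raw text to a completion endpoint, a sketch like the following fills in the template. The function name `build_prompt` is hypothetical, and many front ends apply the chat template stored in the GGUF file automatically, in which case manual assembly is unnecessary.

```shell
# build_prompt USER_PROMPT [SYSTEM_PROMPT]
# Prints the template above with the user prompt (and optionally a custom
# system prompt) substituted in. The image placeholder tokens are emitted
# exactly as shown in the template; the runtime is assumed to handle them.
build_prompt() {
  local user_prompt="$1"
  local system_prompt="${2:-You are a helpful assistant.}"
  printf '<|im_start|>system\n%s<|im_end|>\n' "$system_prompt"
  printf '<|im_start|>user\n<|vision_start|><|vision_end|>\n%s<|im_end|>\n' "$user_prompt"
  printf '<|im_start|>assistant\n'
}
```

Using `printf` with `%s` keeps any backslashes or percent signs in the user's text from being interpreted as format directives.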
License
This model inherits the Apache 2.0 license from the original Qwen3-VL model.
Attribution
- Original Model: Qwen3-VL-2B-Instruct by Qwen Team, Alibaba Cloud
- GGUF Conversion: jc-builds