Qwen3.5-9b-Sushi-Coder-RL-MLX

Lineage

Training

The upstream SFT model was trained with Unsloth on:

The RL stage was then run for coding with NousResearch/hermes-agent using NousResearch/atropos.

During that run, vLLM was patched with vllm-project/vllm PR #36395 ("fix(lora): add bounds checking for TP configurations") to address a LoRA tensor-parallel bounds issue.

Conversion

This repo contains an MLX export for Apple Silicon generated from the original Hugging Face safetensors checkpoint, not from the GGUF release.

  • Format: MLX
  • Quantization: 4-bit affine
  • Conversion stack: mlx-vlm
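The export above can be reproduced with mlx-vlm's convert entry point. A minimal sketch, with two assumptions: the upstream safetensors checkpoint lives at a hypothetical `bigatuna/Qwen3.5-9b-Sushi-Coder-RL` repo (the source repo name is not stated in this card), and your installed mlx-vlm version supports the `-q`/`--q-bits` quantization flags:

```shell
# Convert the original Hugging Face safetensors checkpoint to 4-bit MLX.
# --hf-path: source checkpoint (hypothetical repo name, see note above)
# --mlx-path: output directory for the MLX export
# -q --q-bits 4: enable 4-bit affine quantization
python -m mlx_vlm.convert \
  --hf-path bigatuna/Qwen3.5-9b-Sushi-Coder-RL \
  --mlx-path ./Qwen3.5-9b-Sushi-Coder-RL-MLX \
  -q --q-bits 4
```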

Files

  • model-00001-of-00002.safetensors
  • model-00002-of-00002.safetensors
  • model.safetensors.index.json
  • config.json
  • processor_config.json
  • tokenizer.json
  • tokenizer_config.json
  • generation_config.json
  • chat_template.jinja

Usage Note

This is an MLX multimodal export intended for Apple Silicon. Use it with mlx-vlm, not llama.cpp.

Quick Start

Install:

pip install -U mlx-vlm

Text generation:

python -m mlx_vlm.generate \
  --model bigatuna/Qwen3.5-9b-Sushi-Coder-RL-MLX \
  --prompt "Write a Python function that parses a CSV file into dataclasses." \
  --max-tokens 512

Image + text:

python -m mlx_vlm.generate \
  --model bigatuna/Qwen3.5-9b-Sushi-Coder-RL-MLX \
  --image /path/to/image.png \
  --prompt "Describe the bug in this screenshot and suggest a fix." \
  --max-tokens 512
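The same model can be driven from Python. A minimal sketch using mlx-vlm's Python API; the exact `load`/`generate` signatures have shifted between mlx-vlm releases, so treat this as an outline rather than a pinned recipe:

```python
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_config

model_path = "bigatuna/Qwen3.5-9b-Sushi-Coder-RL-MLX"

# Downloads (if needed) and loads the MLX weights plus the processor.
model, processor = load(model_path)
config = load_config(model_path)

images = ["/path/to/image.png"]  # use an empty list for text-only prompts
prompt = "Describe the bug in this screenshot and suggest a fix."

# Wrap the prompt with the model's chat template before generation.
formatted = apply_chat_template(processor, config, prompt, num_images=len(images))
output = generate(model, processor, formatted, images, max_tokens=512)
print(output)
```

For text-only use, pass `num_images=0` and an empty image list; the chat template is still required so the prompt matches what the model saw in training.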

Metadata

  • License: Apache-2.0
  • Architecture: Qwen 3.5
  • Format: MLX
  • Tags: mlx, mlx-vlm, apple-silicon, multimodal, code, rl