--- title: 3B Thinking (vLLM + Controller) emoji: 🏆 colorFrom: indigo colorTo: blue sdk: gradio sdk_version: 5.49.1 app_file: app.py pinned: true license: apache-2.0 --- This Space wraps `meta-llama/Llama-3.2-3B-Instruct` with a simple **controller**: brainstorm (high T) → critic (low T) → finalize (low T). **Setup** - Attach a GPU (T4 small is fine). - Add a Space **Secret** `HF_TOKEN` so the app can pull gated weights. **Notes** - Uses the tokenizer's chat template for correct formatting. - Private reasoning stays inside ``; only `` is shown to the user.