SVGThinker-7B

SVGThinker-7B is a text-to-SVG generation model introduced in SVGThinker: Instruction-Aligned and Reasoning-Driven Text-to-SVG Generation. It generates editable SVG code from natural-language descriptions, with a focus on compact icon-style vector graphics.

The model is fine-tuned from deepseek-ai/DeepSeek-R1-Distill-Qwen-7B and is released as BF16 sharded safetensors.

Intended Use

This model is intended for:

  • generating SVG icons from English text prompts
  • prototyping simple vector graphics
  • producing editable SVG markup rather than raster images
  • research on text-to-SVG and structured code generation

Generated SVG should be reviewed and sanitized before being rendered in production web pages or downstream applications.
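
As one way to act on the advice above, here is a minimal sanitizing sketch that uses only the Python standard library. The `sanitize_svg` helper and its `BLOCKED_TAGS` deny-list are hypothetical names chosen for illustration and are not part of the model release. The sketch removes script-capable elements, `on*` event-handler attributes, and `javascript:` links; it is not a complete defense, and a maintained sanitizer with a strict allow-list is likely a better choice in production.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
# Hypothetical deny-list for illustration; a production allow-list would be stricter.
BLOCKED_TAGS = {"script", "foreignObject", "iframe", "object", "embed"}


def local_name(name: str) -> str:
    """Drop an XML namespace prefix such as '{http://www.w3.org/2000/svg}'."""
    return name.rsplit("}", 1)[-1]


def sanitize_svg(svg_text: str) -> str:
    """Return svg_text with script-capable elements and attributes removed."""
    ET.register_namespace("", SVG_NS)  # serialize without 'ns0:' prefixes
    root = ET.fromstring(svg_text)  # raises ET.ParseError on malformed input

    # Pass 1: detach blocked elements (and everything nested inside them).
    for parent in list(root.iter()):
        for child in list(parent):
            if local_name(child.tag) in BLOCKED_TAGS:
                parent.remove(child)

    # Pass 2: strip event handlers and javascript: links from what remains.
    for element in root.iter():
        for attr, value in list(element.attrib.items()):
            name = local_name(attr).lower()
            is_handler = name.startswith("on")  # onload, onclick, ...
            is_js_link = name == "href" and value.strip().lower().startswith("javascript:")
            if is_handler or is_js_link:
                del element.attrib[attr]

    return ET.tostring(root, encoding="unicode")
```

Because `ET.fromstring` raises `ET.ParseError` on malformed markup, the same call also rejects output that is not well-formed XML.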

Quick Start

Install the dependencies:

pip install "transformers>=4.51.0" torch accelerate safetensors

Then load the model and generate an SVG:

import re
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Caffin/SVGThinker-7B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
    use_safetensors=True,
)

description = "A minimalist calendar icon with two black tabs and a bold checkmark below it."
prompt = "Review the given information below and generate a svg according to it.\n" + description

messages = [{"role": "user", "content": prompt}]
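inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,  # return input_ids and attention_mask together
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    output = model.generate(
        **inputs,  # pass the attention mask along with the token IDs
        max_new_tokens=4096,
        do_sample=True,
        temperature=0.8,
        top_p=0.6,
        repetition_penalty=1.05,
    )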

text = tokenizer.decode(output[0], skip_special_tokens=False)
match = re.search(r"<svg.*?</svg>", text, flags=re.DOTALL)
print(match.group(0) if match else text)

The model may emit reasoning text before the final SVG. For most applications, extract the <svg>...</svg> block before rendering.
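
A slightly more defensive extraction than a single regex search can be sketched as follows. The `extract_svg` helper is a hypothetical name, and the `</think>` delimiter is an assumption based on the reasoning format commonly used by DeepSeek-R1 distilled models; the card itself does not specify the delimiter, so adjust `think_end` to match the outputs you observe. The sketch discards everything up to the end of the reasoning text and returns the last complete SVG block, so that draft snippets inside the reasoning are not picked up by mistake.

```python
import re
from typing import Optional

# Non-greedy match of one complete <svg ...> ... </svg> block.
SVG_BLOCK = re.compile(r"<svg\b.*?</svg>", flags=re.DOTALL | re.IGNORECASE)


def extract_svg(text: str, think_end: str = "</think>") -> Optional[str]:
    """Return the last <svg>...</svg> block after the reasoning text, or None."""
    # Assumption: reasoning, if present, ends with `think_end`.
    # If the delimiter is absent, rsplit returns the whole text unchanged.
    answer = text.rsplit(think_end, 1)[-1]
    blocks = SVG_BLOCK.findall(answer)
    if not blocks:
        # Fall back to the full output in case the SVG precedes the delimiter.
        blocks = SVG_BLOCK.findall(text)
    return blocks[-1] if blocks else None
```

In the Quick Start script, `extract_svg(text)` could replace the `re.search` call; a `None` result signals that generation produced no complete SVG block, for example because `max_new_tokens` was reached first.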

Model Notes

SVGThinker is trained directly in SVG code space. The paper describes a sequential annotation pipeline that aligns natural-language descriptions with the step-by-step construction of SVG primitives, helping the model generate more editable SVG code.

For full training data, annotation, and evaluation details, see the paper.

Evaluation Snapshot

On the paper's 1,000 held-out text-to-SVG prompts, SVGThinker reports:

| Model | FID (lower is better) | CLIP (higher is better) | FID-CLIP (lower is better) | Primitive support |
| --- | --- | --- | --- | --- |
| SVGThinker-7B | 34.06 | 0.2765 | 21.08 | all |

Limitations

  • Outputs may be malformed, incomplete, or visually inconsistent with the prompt.
  • The model is best suited for simple to moderately complex icon-style graphics.
  • It may struggle with photorealistic scenes, dense layouts, and text-heavy SVGs.
  • SVG is executable markup in browser contexts; treat generated SVG as untrusted.
  • The model primarily targets English prompts.
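
Since outputs may be malformed or incomplete, a cheap structural check before rendering can catch the most obvious failures. The sketch below is a hypothetical helper, not part of the model release: it only verifies that the markup parses as XML, that the root element is `svg`, and that the drawing is not empty. It cannot tell whether the image matches the prompt.

```python
import xml.etree.ElementTree as ET
from typing import Tuple


def check_svg(svg_text: str) -> Tuple[bool, str]:
    """Return (ok, reason) for a basic structural check of generated SVG."""
    try:
        root = ET.fromstring(svg_text)
    except ET.ParseError as exc:
        # Truncated generations usually fail here (unclosed tags or attributes).
        return False, f"not well-formed XML: {exc}"
    # Compare the tag name without any '{namespace}' prefix.
    if root.tag.rsplit("}", 1)[-1] != "svg":
        return False, "root element is not <svg>"
    if len(root) == 0:
        return False, "SVG has no child elements"
    return True, "ok"
```

A caller might regenerate with a different sampling seed when the check fails, then sanitize whatever passes before it is rendered.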

Citation

@inproceedings{chen2025svgthinker,
  title = {SVGThinker: Instruction-Aligned and Reasoning-Driven Text-to-SVG Generation},
  author = {Chen, Hanqi and Zhao, Zhongyin and Chen, Ye and Liang, Zhujin and Ni, Bingbing},
  booktitle = {Proceedings of the 33rd ACM International Conference on Multimedia},
  year = {2025},
  publisher = {ACM},
  doi = {10.1145/3746027.3755392},
  url = {https://arxiv.org/abs/2509.24299}
}