Update: newer version here

This is a GSPO (Group Sequence Policy Optimization) finetune to remove slop from this model: Qwen3-4B-Instruct-2507-uncensored

I used the same method (mostly) as this model: gemma-3-4b-it-unslop-GSPO

Note: this is not an RP tune; it's a compliant model with a different style from regular Qwen3 4B 2507.

My uncensoring dataset was generated by an abliterated Gemma 3 27B model, which imprinted a lot of Gemma's writing style on this model.

It also added some Gemma-style slop, which this finetune has mostly mitigated; it's probably about 90% of the way there.

Some prompts will still produce quite a bit of the stereotypical LLM slop.

However, I concluded this finetune a bit early because I don't want to damage the model too much. I haven't decided yet whether this is the final version, but it might be.

I've uploaded a UD-Q4_K_XL GGUF with per-tensor quantization settings copied from Unsloth's quant using my little utility: quant_clone
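What "cloning" quant settings means can be sketched in a few lines: given the per-tensor quantization types read out of a reference GGUF (e.g. with the `gguf` Python package), list the tensors that deviate from the file's base type so the same overrides can be reapplied when re-quantizing. The tensor names and types below are hypothetical examples, not the actual UD-Q4_K_XL recipe or quant_clone's real code.

```python
def quant_overrides(tensor_types: dict[str, str], base_type: str) -> dict[str, str]:
    """Return only the tensors whose quant type differs from the base type.

    tensor_types maps GGUF tensor names to quantization type names, as could
    be read from a reference quant with the `gguf` Python package.
    """
    return {name: qtype for name, qtype in tensor_types.items() if qtype != base_type}

# Hypothetical per-tensor types from a reference quant:
reference = {
    "blk.0.attn_q.weight": "Q4_K",
    "blk.0.attn_v.weight": "Q6_K",   # attention V kept at higher precision
    "output.weight": "Q6_K",
    "token_embd.weight": "Q4_K",
}
print(quant_overrides(reference, "Q4_K"))
# → {'blk.0.attn_v.weight': 'Q6_K', 'output.weight': 'Q6_K'}
```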

Here are some pics of before-and-after output with the slop highlighted and totals at the bottom:

Prompt = "write a short story about a gothic romance, it should be around 500 words long"

To get comparable output length from this model, I had to prompt it for 700 words.
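The slop totals at the bottom of each screenshot boil down to counting known slop phrases in the output. A minimal sketch of that kind of counter (the phrase list here is a made-up illustration, not the one used for the highlights):

```python
# Hypothetical slop phrases, for illustration only:
SLOP_PHRASES = [
    "testament to",
    "tapestry of",
    "shivers down",
    "barely above a whisper",
]

def slop_count(text: str) -> int:
    """Total occurrences of known slop phrases (case-insensitive)."""
    low = text.lower()
    return sum(low.count(phrase) for phrase in SLOP_PHRASES)

sample = "Her voice was barely above a whisper, a testament to the tapestry of dread."
print(slop_count(sample))  # → 3
```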

Before:

(three screenshots of the original model's output, slop highlighted)

After:

(three screenshots of the finetuned model's output, slop highlighted)

One thing I noticed: this model generates a bit less text than the original for the same prompt. I didn't enforce word count for this training run; I'll re-enable it and upload an updated version in a day or so.
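Word-count enforcement during training presumably takes the form of a reward term. A hedged sketch of one simple shape such a term could take (the linear falloff and names here are my illustration, not the actual training code):

```python
def length_reward(text: str, target_words: int) -> float:
    """Reward in [0, 1]: highest when the response hits the target word count,
    falling off linearly with the relative deviation from it."""
    n_words = len(text.split())
    deviation = abs(n_words - target_words) / max(target_words, 1)
    return max(0.0, 1.0 - deviation)

print(length_reward("one two three four five", 5))  # → 1.0
```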
