Gemma3NPC-1b-Q4-GGUF
Q4 GGUF version of Gemma3NPC-1b-float16.
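One minimal way to try a GGUF quantization locally is with llama.cpp's CLI, which can pull the file straight from the Hub; this is a sketch, assuming you have llama.cpp installed, and is not an official instruction from this card.

```shell
# Sketch: download the Q4 GGUF from the Hub and start an interactive chat.
llama-cli -hf chimbiwide/Gemma3NPC-1b-Q4-GGUF --conversation
```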
A new attempt at training Gemma3NPC.
TensorBoard data is available!
It's been a while since the last Gemma3NPC model release; in the meantime, we were working on some other models such as GemmaThink.
Now we are back with the newest Gemma3NPC-1b, trained on our RolePlay-NPCv2 dataset.
Training Parameters
We trained this model as a rank-32 LoRA adapter for two epochs over RolePlay-NPCv2 on an 80GB A100 in Google Colab. For this run, we used a learning rate of 2e-5, a batch size of 8 with 4 gradient accumulation steps, and gradient clipping of 1.0. A cosine learning rate scheduler with a 150-step warmup was used.
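The schedule above (linear warmup to the peak learning rate, then cosine decay) can be sketched in plain Python; the decay-to-zero endpoint is an assumption based on the common default, not something stated in this card.

```python
import math

def lr_at_step(step, total_steps, base_lr=2e-5, warmup_steps=150):
    """Linear warmup to base_lr over warmup_steps, then cosine decay to 0."""
    if step < warmup_steps:
        return base_lr * step / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Peak LR is reached exactly at the end of warmup:
print(lr_at_step(150, total_steps=2000))  # 2e-05
```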
Check out our training notebook here.
Changes & Performance
With this new 1b model, we used much more aggressive training parameters and added some NSFW data to experiment with the results. We noticed a few really interesting behaviors:
- There seems to be some sign of "reasoning"
- The model is less likely to break out of character
- Something left for users to explore for themselves; remember to provide a roleplaying prompt first!
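Since the card asks for a roleplaying prompt up front, here is a hypothetical chat structure showing what that might look like; the character and exact system-prompt wording are illustrative assumptions, not taken from the model card.

```python
# Hypothetical roleplay setup: a system message defines the NPC persona
# before any user turn, as the card recommends.
messages = [
    {
        "role": "system",
        "content": (
            "You are Mira, a blacksmith NPC in a fantasy village. "
            "Stay in character and answer in first person."
        ),
    },
    {"role": "user", "content": "Do you have any iron swords for sale?"},
]
```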
Future Work
Now, we will focus on further improving Gemma3NPC, and not just through training parameters.
- Better data (most of our data is old and needs an update), either collected or synthetically generated.
- Better and new models: expanding beyond the Gemma3 model family, our next goal is a Qwen3-based model.
- Adding GRPO into the training loop.
These improvements serve our ultimate goal of creating a small agentic NPC model with good RP quality and tool calling for dynamic in-game interactions.
We also plan to create some sort of Unity game demo; it's on its way.
Model tree for chimbiwide/Gemma3NPC-1b-Q4-GGUF
Base model
google/gemma-3-1b-pt
