Writing Reward Models
Specialized reward models for rating writing.
This is a Bradley-Terry reward model for creative writing, trained on a Reddit Writing Prompts dataset further curated from "LitBench: A Benchmark and Dataset for Reliable Evaluation of Creative Writing".
Compared to ConicCat/Lamp-P-Writing-Quality-RM, this model focuses more on holistic creative writing than on professional writing skill. It also inherits many Reddit Writing Prompts (RWP) preferences, notably for creative or novel plots.
I would test this on the LitBench RM eval if it were actually publicly accessible...
from transformers import AutoModelForSequenceClassification, AutoTokenizer
model = AutoModelForSequenceClassification.from_pretrained("ConicCat/Litbench-Creative-Writing-RM-3B", torch_dtype="bfloat16")
tokenizer = AutoTokenizer.from_pretrained("ConicCat/Litbench-Creative-Writing-RM-3B")
prompt = "User:\n[WP] {prompt}\n\nAssistant:\n{story}" # No multi-turn support sadly; the [WP] is needed.
writing_prompt = "Aliens are afraid to invade Earth. Not because of humans but because our solar system is a nest for 8 Guardians/Leviathans."
creative_response = """It had been tried before, always ending in failure.
The world, small and blue, stood out as a conspicuous failure to convert the last holdout of sentient life in the galaxy.
Missionary invasions had worked everywhere else. In all other cases, soldiers of the church brought the staff and the beam, the truth and the light, the core of value and the matrix of eternity. There was resistance in some cases, true. But in the end always success. Always.
But these... ..."humans" they called themselves... ...were especially beloved by their protectors. Sometimes worshipped as a pantheon, sometimes as a unity, but always there. Even when they lost their myths and their faith, the leviathans stood in the shadows jealously defending this one pocket of space.
A few attempts had come close. The greatest of all even tried once, sending his own begotten son, but he would not return. At least, not any time soon. The grip of the eight was far too tight.
Earthlings had a talent for duplicity and hate, vanity and rage, cruelty and oppression unmatched by any other creature in the galaxy. They were the only things really like themselves that the leviathans had ever found.
Monsters have their favorites too. And no one was going to touch this world without their permission."""
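# Put the inputs on the same device as the model. The model is loaded on CPU above; call model.to("cuda:0") first to score on a GPU.
tokenized_text = tokenizer(prompt.format(prompt=writing_prompt, story=creative_response), return_tensors="pt").to(model.device)
print(model(**tokenized_text).logits[0][0].item()) # Reward score; higher means the story is preferred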
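The printed score is a raw logit, so it is easiest to read relative to another story scored on the same prompt. A minimal sketch, assuming the standard Bradley-Terry formulation in which the preference probability is the sigmoid of the reward difference. The helper name `preference_probability` and the two scores are hypothetical and not part of the model's API:

```python
import math

def preference_probability(reward_a: float, reward_b: float) -> float:
    """Bradley-Terry probability that story A is preferred over story B."""
    # Sigmoid of the reward difference, branched to avoid overflow for large gaps.
    diff = reward_a - reward_b
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)

# Hypothetical scores, standing in for two calls to the reward model
# on two different stories written for the same prompt.
score_story_a = 1.5
score_story_b = 0.5

p = preference_probability(score_story_a, score_story_b)
print(f"P(A preferred over B) = {p:.3f}")
```

Only the difference between scores matters under this formulation, so comparing scores across different prompts may not be meaningful.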