🤗 Project website | 🔗 LinkedIn
## Introduction to ROLEPL-AI
ROLEPL-AI is an Erasmus+ project for soft skills training in education, especially in technical and vocational training for the tourism industry. The project aims to explore the potential of Artificial Intelligence as an interactive agent for educational exercises, allowing students to train their soft skills at their own pace, remotely, and in a stress-free environment.
Various experiments with European students were conducted in 2024 and 2025 [1]. These experiments took place in Teemew, a virtual environment (commonly called a "metaverse") in which the participants were able to interact with AI-powered characters in order to train their soft skills.
## The ROLEPL-AI models
The characters inside the aforementioned simulation were powered by models from the Qwen2.5 large language model series, fine-tuned on a custom role-playing dataset to enhance their role-playing capabilities. This followed a survey of the then-existing best open-source models, which concluded that none of them would be suitable as-is for a realistic interaction [2].
The custom fine-tuning dataset notably included parts of the Beyond Dialogue dataset, as well as cherry-picked and annotated samples from the PRODIGy dataset (built on the Cornell Movie-Dialogs Corpus) and the DailyDialog dataset.
While the training objectives were closely tied to the specific use cases of the ROLEPL-AI project, we believe that the role-playing capabilities of the models were enhanced as a whole, especially with regard to human-likeness in speech. The models were trained solely in English.
We release the weights of the final fine-tuned model in three sizes: ROLEPL-AI-v2-Qwen2.5-7B, ROLEPL-AI-v2-Qwen2.5-32B and ROLEPL-AI-v2-Qwen2.5-72B.
## How to use the models
The ROLEPL-AI models were trained with a non-conventional prompt template, designed to be more robust to jailbreak attempts and to ensure that the models rarely break out of the role-play setting, even when prompted to do so. While they remain compatible with the generic ChatML conversational template that the Qwen2.5 series also uses, we recommend using our code snippet to get the most out of the models.
First, install the latest version of Transformers:
```shell
pip install transformers
```
Then make sure you process the conversation history as described below:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Inceptive/ROLEPL-AI-v2-Qwen2.5-7B"

# Load the tokenizer and the model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)

# Prepare the system prompt.
# Introduce the AI as an author/writer on whatever topic you wish to
# role-play on (professional/fantasy setting etc.).
system_prompt = "You are a professional writer, writing educational content for students in tourism management."

# Prepare the role-play setting and append the history.
# You may put your custom role-playing instructions and setting here as long
# as the general structure stays the same. Always keep the
# "Write the next answer from [CHARACTER]" instruction, and include the whole
# conversation history within the user part of the prompt.
def do_prompt(message_history):
    history_text = "\n".join(message_history)
    return f"""<|im_start|>system
{system_prompt}
<|im_end|>
<|im_start|>user
[Dialogue completion]
Mia Nielsen, an electrical engineer at WindTech Innovators, is attending the Career Connect Expo job fair as an exhibitor. She's setting up the booth with her colleagues and preparing materials for the first visitors.
Just as Mia is about to print more brochures, her portable printer won't connect to her laptop. She checks the cables, but everything seems fine. The fair is about to start, and she doesn't have time to troubleshoot. Annoyed, she asks a fair assistant for help.
Write the next answer from Mia Nielsen. You may end your answer with one of the following emoticons based on the emotion of the character: (😠🤢😨😄😢😲).
{history_text}
<|im_end|>
<|im_start|>assistant
"""

user_name = "Lucas Wright"
assistant_answer = "[Mia Nielsen]: Excuse me, do you have a moment?"
history = [assistant_answer]

while True:
    print(assistant_answer)
    user_input = input(f"[{user_name}]: ")
    if user_input in ("quit", "exit"):
        break
    history.append(f"[{user_name}]: {user_input}")
    final_input = do_prompt(history)
    # Tokenize and move the inputs to the same device as the model
    input_ids = tokenizer(final_input, return_tensors="pt").to(model.device)
    outputs = model.generate(**input_ids, max_new_tokens=256)
    out = tokenizer.decode(outputs[0])
    # Keep only the newly generated assistant turn
    assistant_answer = out.split("<|im_start|>assistant")[1].split("<|im_end|>")[0].strip()
    history.append(assistant_answer)
```
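For reuse across scenarios, the prompt structure above can be factored into a small standalone helper. This is only a sketch reproducing the template shown in the snippet; the function and parameter names (`build_roleplay_prompt`, `scenario`, `character`, `emoticons`) are our own and not part of the ROLEPL-AI codebase:

```python
def build_roleplay_prompt(system_prompt, scenario, character, history, emoticons=None):
    """Assemble a prompt following the ROLEPL-AI template.

    system_prompt: introduces the AI as an author/writer on the chosen topic
    scenario: free-text description of the role-play setting
    character: name used in the mandatory "Write the next answer from ..." line
    history: list of "[Name]: utterance" strings (the whole conversation so far)
    emoticons: optional string of allowed emotion emoticons
    """
    instruction = f"Write the next answer from {character}."
    if emoticons:
        instruction += (" You may end your answer with one of the following "
                        f"emoticons based on the emotion of the character: ({emoticons}).")
    history_text = "\n".join(history)
    return (f"<|im_start|>system\n{system_prompt}\n<|im_end|>\n"
            f"<|im_start|>user\n[Dialogue completion]\n{scenario}\n"
            f"{instruction}\n{history_text}\n<|im_end|>\n"
            f"<|im_start|>assistant\n")

prompt = build_roleplay_prompt(
    "You are a professional writer.",
    "Two colleagues meet at a job fair.",
    "Mia Nielsen",
    ["[Mia Nielsen]: Excuse me, do you have a moment?"],
)
```

The resulting string can be passed directly to the tokenizer, exactly as `final_input` is in the loop above.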
## References
- [1] ROLEPL-AI: Experimentation results
- [2] ROLEPL-AI: Analysis and comparison of existing AI technology
## License Agreement
The weights are licensed under the CC BY-NC-SA 4.0 License Agreement.