u-10bei/sft_alfworld_trajectory_dataset_v5
Viewer • Updated • 2.5k • 742
How to use Chattso-GPT/adv-sft-v2 with PEFT:
Task type is invalid.
This repository provides a LoRA adapter fine-tuned from Qwen/Qwen3-4B-Instruct-2507 using LoRA + Unsloth.
This repository contains LoRA adapter weights only. The base model must be loaded separately.
This adapter is trained to improve multi-turn agent task performance on both ALFWorld (household tasks) and DBBench (database operations).
Loss is applied to all assistant turns in the multi-turn trajectory, enabling the model to learn environment observation, action selection, tool use, SQL construction, and recovery from errors.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch
base = "Qwen/Qwen3-4B-Instruct-2507"
adapter = "Chattso-GPT/adv-sft-v2"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
base,
torch_dtype=torch.float16,
device_map="auto",
)
model = PeftModel.from_pretrained(model, adapter)
Training data:
Dataset License: MIT License. Compliance: Users must comply with the MIT license and the base model terms of use.
Base model
Qwen/Qwen3-4B-Instruct-2507