Qwen2.5-7B-Instruct: Agent SFT

Fine-tuned model for AgentBench tasks (DB Bench + ALFWorld).

Base Model

  • Model: Qwen/Qwen2.5-7B-Instruct
  • Max sequence length (training): 3584

Training Configuration

  • Method: SFT with high-rank LoRA (adapters merged into the base weights)
  • LoRA rank: 128
  • LoRA alpha: 256
  • LoRA dropout: 0
  • Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
  • Learning rate: 2e-05
  • Epochs: 2
  • Batch size: 1 (x8 gradient accumulation)
  • Warmup ratio: 0.1
  • LR scheduler: cosine
  • Weight decay: 0.05
  • Max grad norm: 1.0
  • Precision: bf16
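Read as numbers, the configuration above implies a LoRA scaling factor of alpha / r = 256 / 128 = 2 and an effective batch size of 1 × 8 = 8. A minimal sketch (variable names are illustrative, not taken from the training code):

```python
# Illustrative summary of the hyperparameters listed above.
lora_config = {
    "r": 128,
    "alpha": 256,
    "dropout": 0.0,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}

# LoRA scales the adapter update by alpha / r.
lora_scaling = lora_config["alpha"] / lora_config["r"]  # 2.0

# Effective batch size = per-device batch x gradient accumulation steps.
per_device_batch = 1
grad_accum_steps = 8
effective_batch = per_device_batch * grad_accum_steps  # 8
```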

Training Data

  • data/processed/db_bench_raw.jsonl
  • data/processed/alfworld_augmented.jsonl
  • data/synthetic/db_bench/trajectories-ranking-other.jsonl

Usage

This model can be served with vLLM:

vllm serve <model_path> --max-model-len 8192 --gpu-memory-utilization 0.95
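Once served, the model answers on vLLM's OpenAI-compatible endpoint (POST /v1/chat/completions). A sketch of building a request body; the helper name and the example prompt are illustrative:

```python
import json


def build_chat_request(model, user_message, max_tokens=512, temperature=0.0):
    """Body for POST /v1/chat/completions on a vLLM OpenAI-compatible server."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
        "max_tokens": max_tokens,
        "temperature": temperature,
    }


payload = build_chat_request("<model_path>", "List all tables in the database.")
body = json.dumps(payload)  # send as the JSON request body
```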

License

Apache-2.0 (following the base model license)
