# Qwen2.5-7B-Instruct - Agent SFT
Fine-tuned model for AgentBench tasks (DB Bench + ALFWorld).
## Base Model
- Model: Qwen/Qwen2.5-7B-Instruct
- Max sequence length (training): 3584
## Training Configuration
- Method: SFT with High-Rank LoRA (merged)
- LoRA rank: 128
- LoRA alpha: 256
- LoRA dropout: 0
- Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
- Learning rate: 2e-05
- Epochs: 2
- Batch size: 1 (x8 gradient accumulation)
- Warmup ratio: 0.1
- LR scheduler: cosine
- Weight decay: 0.05
- Max grad norm: 1.0
- Precision: bf16
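The adapter hyperparameters above can be expressed as a peft-style config sketch. The dict keys below follow `peft.LoraConfig` naming conventions as an assumption; the actual training script is not included in this repository. Note that with alpha = 2 × rank, the adapter updates are scaled by a factor of 2, and the per-device batch size of 1 with 8 accumulation steps yields an effective batch size of 8.

```python
# Hypothetical reconstruction of the LoRA hyperparameters listed above,
# using peft-style key names (an assumption; the training script is not shown).
lora_config = {
    "r": 128,
    "lora_alpha": 256,
    "lora_dropout": 0.0,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}

# LoRA scales each adapter update by alpha / r.
scaling = lora_config["lora_alpha"] / lora_config["r"]  # 2.0

# Per-device batch size 1 with 8 gradient-accumulation steps
# gives an effective batch size of 8 per device.
effective_batch_size = 1 * 8

print(scaling, effective_batch_size)
```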
## Training Data
- data/processed/db_bench_raw.jsonl
- data/processed/alfworld_augmented.jsonl
- data/synthetic/db_bench/trajectories-ranking-other.jsonl
## Usage

This model can be served with vLLM:

```bash
vllm serve <model_path> --max-model-len 8192 --gpu-memory-utilization 0.95
```
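`vllm serve` exposes an OpenAI-compatible HTTP API (by default at `http://localhost:8000/v1`). The sketch below only builds a chat-completion request payload; the model name and prompt are placeholders, and actually sending it requires a running server.

```python
import json

# Build an OpenAI-compatible chat-completion payload for the served model.
# "<model_path>" is the same placeholder as in the serve command above.
payload = {
    "model": "<model_path>",
    "messages": [
        {"role": "user", "content": "List all tables in the database."}
    ],
    "max_tokens": 512,
    "temperature": 0.0,
}

body = json.dumps(payload)
# With a server running, this could be POSTed to
# http://localhost:8000/v1/chat/completions, e.g. via requests.post(...).
print(len(body))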
## License
Apache-2.0 (following the base model license)