OptimAI 7B (v1)

A fine-tuned Qwen2.5-Math-7B specialized for operations research and optimal control problems. Given a natural-language problem, OptimAI formulates it mathematically, solves it, and explains connections between optimization theory and optimal control.

What it does

  • Linear and integer programming (diet, blending, transportation, assignment, knapsack)
  • Network flow (max flow, shortest path, min cost flow)
  • Inventory problems (newsvendor, EOQ, multi-period)
  • Basic queuing (M/M/c)
  • LQR / Riccati optimal control
  • KKT conditions and LP duality
  • Explicit bridges between OR and optimal control
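As a reference point for the LP family above, here is a tiny two-product profit-maximization problem solved with SciPy's `linprog`. The coefficients are illustrative only (not drawn from the training data):

```python
from scipy.optimize import linprog

# Maximize 40*A + 30*B subject to resource limits.
# linprog minimizes, so the objective is negated.
c = [-40, -30]
A_ub = [[1, 2],   # machine hours: A uses 1h, B uses 2h, 40h available
        [3, 1]]   # labor hours:   A uses 3h, B uses 1h, 60h available
b_ub = [40, 60]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x)     # optimal production plan: [16. 12.]
print(-res.fun)  # optimal profit: 1000.0
```

This is the kind of ground truth the verification app (below) compares against.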

Training

  • Base: Qwen/Qwen2.5-Math-7B
  • SFT: LoRA (r=64, alpha=128) on ~1.6k synthetic OR/control problems, with the base model loaded in 4-bit (bitsandbytes NF4)
  • DPO: ~200 preference pairs, beta=0.1, learning rate 5e-7
  • The LoRA adapter was merged into the base weights for this release.

A larger v2 (trained on ~16k SFT + ~1.2k DPO) is in progress.
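For readers unfamiliar with the DPO stage, its per-pair objective can be sketched in a few lines of plain Python (beta=0.1 as used in training; the log-probability values below are purely illustrative):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Per-pair DPO loss: -log sigmoid(beta * (policy margin - reference margin))."""
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# When the policy prefers the chosen answer more strongly than the
# reference model does, the loss falls below log(2) (its value at zero margin):
print(dpo_loss(-10.0, -14.0, -12.0, -13.0))
```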

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("billwang37/optim-ai-7b-v1")
model = AutoModelForCausalLM.from_pretrained(
    "billwang37/optim-ai-7b-v1",
    torch_dtype="auto",
    device_map="auto",
)

system = ("You are OptimAI, an expert operations research and optimal control assistant. "
          "You formulate optimization problems mathematically, solve them, and explain "
          "connections between optimal control theory and operations research.")
problem = "A factory makes 2 products. A yields $40 profit/unit, B yields $30..."

messages = [{"role": "system", "content": system},
            {"role": "user", "content": problem}]
text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tok(text, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=600, do_sample=False)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))

Limitations

  • Trained on synthetic data. Out-of-distribution problems may produce plausible-looking but incorrect formulations.
  • Strong on the problem classes listed above; experimental on stochastic programming, robust optimization, PDE-constrained optimization, HJB, and bilevel programming.
  • Research demo, not professional advice. Always verify solutions with a real solver.
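On the optimal-control side, one way to verify an LQR answer with a real solver is to check it against SciPy's Riccati solver. The system matrices below (a double integrator) are illustrative:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, 0.0]])  # double integrator dynamics
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)  # Riccati solution
K = np.linalg.solve(R, B.T @ P)       # optimal gain, u = -Kx

# P should satisfy the CARE: A'P + PA - P B R^{-1} B'P + Q = 0
residual = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
print(K)                       # [[1. 1.73205081]]
print(np.abs(residual).max())  # ~0
```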

Verification app

A Gradio demo that runs OptimAI alongside CVXPY and SciPy solvers to cross-check its answers is available at https://huggingface.co/spaces/billwang37/optim-ai

Citation

@misc{wang2026optimai,
  author  = {Bill Wang},
  title   = {OptimAI 7B: A fine-tuned LLM for operations research and optimal control},
  year    = {2026},
  url     = {https://huggingface.co/billwang37/optim-ai-7b-v1},
}

Author: Bill Wang, University of Oklahoma.
