# OptimAI 7B (v1)
A fine-tuned Qwen2.5-Math-7B specialized for operations research and optimal control problems. Given a natural-language problem, OptimAI formulates it mathematically, solves it, and explains connections between optimization theory and optimal control.
## What it does
- Linear and integer programming (diet, blending, transportation, assignment, knapsack)
- Network flow (max flow, shortest path, min cost flow)
- Inventory problems (newsvendor, EOQ, multi-period)
- Basic queuing (M/M/c)
- LQR / Riccati optimal control
- KKT conditions and LP duality
- Explicit bridges between OR and optimal control
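On the optimal control side, the model's LQR answers can be spot-checked against SciPy's Riccati solver. A minimal sketch for a double-integrator plant (the matrices below are illustrative, not taken from the model's training data):

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Double-integrator plant (illustrative values): x1' = x2, x2' = u
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)          # state cost
R = np.array([[1.0]])  # control cost

# Solve the CARE  A'P + PA - P B R^{-1} B' P + Q = 0, then u = -Kx
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)
print(K)  # optimal feedback gain, here [[1, sqrt(3)]]
```

For this plant the closed-form gain is K = [1, √3], which makes it a convenient sanity check for the model's symbolic derivations.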
## Training
- Base: Qwen/Qwen2.5-Math-7B
- SFT: LoRA (r=64, alpha=128) on ~1.6k synthetic OR/control problems, 4-bit base (bitsandbytes NF4)
- DPO: ~200 preference pairs with beta=0.1, learning rate 5e-7
- The LoRA adapter was merged into the base weights for this release.
A larger v2 (trained on ~16k SFT + ~1.2k DPO) is in progress.
## Usage
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("billwang37/optim-ai-7b-v1")
model = AutoModelForCausalLM.from_pretrained(
    "billwang37/optim-ai-7b-v1",
    torch_dtype="auto",
    device_map="auto",
)

system = (
    "You are OptimAI, an expert operations research and optimal control assistant. "
    "You formulate optimization problems mathematically, solve them, and explain "
    "connections between optimal control theory and operations research."
)
problem = "A factory makes 2 products. A yields $40 profit/unit, B yields $30..."

messages = [
    {"role": "system", "content": system},
    {"role": "user", "content": problem},
]
text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tok(text, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=600, do_sample=False)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
## Limitations
- Trained on synthetic data. Out-of-distribution problems may produce plausible-looking but incorrect formulations.
- Strong on the problem classes listed above; experimental on stochastic programming, robust optimization, PDE-constrained optimization, HJB, and bilevel programming.
- Research demo, not professional advice. Always verify solutions with a real solver.
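For LP-class outputs, that verification is cheap with `scipy.optimize.linprog`. A minimal sketch on an illustrative product-mix instance (the numbers below are made up for the example, not produced by the model):

```python
from scipy.optimize import linprog

# Illustrative product-mix LP (made-up data):
# maximize 40*xA + 30*xB  s.t.  xA + xB <= 100,  2*xA + xB <= 160,  x >= 0
res = linprog(
    c=[-40, -30],            # linprog minimizes, so negate the profits
    A_ub=[[1, 1], [2, 1]],
    b_ub=[100, 160],
    bounds=[(0, None), (0, None)],
    method="highs",
)
print(res.x, -res.fun)  # optimum at (60, 40) with profit 3600
```

If the model's reported optimum disagrees with the solver's, trust the solver and inspect the model's formulation for a wrong sign, bound, or constraint.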
## Verification app
A Gradio demo that runs OptimAI alongside CVXPY and scipy solvers to cross-check its answers is available at https://huggingface.co/spaces/billwang37/optim-ai.
## Citation

```bibtex
@misc{wang2026optimai,
  author = {Bill Wang},
  title  = {OptimAI 7B: A fine-tuned LLM for operations research and optimal control},
  year   = {2026},
  url    = {https://huggingface.co/billwang37/optim-ai-7b-v1},
}
```
Author: Bill Wang, University of Oklahoma.