OptimAI 7B (v1)

A fine-tuned Qwen2.5-Math-7B specialized for operations research and optimal control problems. Given a natural-language problem, OptimAI formulates it mathematically, solves it, and explains connections between optimization theory and optimal control.

What it does

  • Linear and integer programming (diet, blending, transportation, assignment, knapsack)
  • Network flow (max flow, shortest path, min cost flow)
  • Inventory problems (newsvendor, EOQ, multi-period)
  • Basic queuing (M/M/c)
  • LQR / Riccati optimal control
  • KKT conditions and LP duality
  • Explicit bridges between OR and optimal control
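As a reference point for the LP family above, here is a tiny two-product profit-maximization problem solved with SciPy's `linprog`. The coefficients are illustrative only (not drawn from the training data):

```python
from scipy.optimize import linprog

# Maximize 40*A + 30*B subject to resource limits.
# linprog minimizes, so the objective is negated.
c = [-40, -30]
A_ub = [[1, 2],   # machine hours: A uses 1h, B uses 2h, 40h available
        [3, 1]]   # labor hours:   A uses 3h, B uses 1h, 60h available
b_ub = [40, 60]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x)     # optimal production plan: [16. 12.]
print(-res.fun)  # optimal profit: 1000.0
```

This is the kind of ground truth the verification app (below) compares against.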

Training

  • Base: Qwen/Qwen2.5-Math-7B
  • SFT: LoRA (r=64, alpha=128) on ~1.6k synthetic OR/control problems, with the base model loaded in 4-bit (bitsandbytes NF4)
  • DPO: ~200 preference pairs, beta=0.1, learning rate 5e-7
  • The LoRA adapter was merged into the base weights for this release.

A larger v2 (trained on ~16k SFT + ~1.2k DPO) is in progress.
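For readers unfamiliar with the DPO stage, its per-pair objective can be sketched in a few lines of plain Python (beta=0.1 as used in training; the log-probability values below are purely illustrative):

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Per-pair DPO loss: -log sigmoid(beta * (policy margin - reference margin))."""
    margin = (logp_chosen - ref_logp_chosen) - (logp_rejected - ref_logp_rejected)
    return -math.log(1.0 / (1.0 + math.exp(-beta * margin)))

# When the policy prefers the chosen answer more strongly than the
# reference model does, the loss falls below log(2) (its value at zero margin):
print(dpo_loss(-10.0, -14.0, -12.0, -13.0))
```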

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("billwang37/optim-ai-7b-v1")
model = AutoModelForCausalLM.from_pretrained(
    "billwang37/optim-ai-7b-v1",
    torch_dtype="auto",
    device_map="auto",
)

system = ("You are OptimAI, an expert operations research and optimal control assistant. "
          "You formulate optimization problems mathematically, solve them, and explain "
          "connections between optimal control theory and operations research.")
problem = "A factory makes 2 products. A yields $40 profit/unit, B yields $30..."

messages = [{"role": "system", "content": system},
            {"role": "user", "content": problem}]
text = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tok(text, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=600, do_sample=False)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))

Limitations

  • Trained on synthetic data. Out-of-distribution problems may produce plausible-looking but incorrect formulations.
  • Strong on the problem classes listed above; experimental on stochastic programming, robust optimization, PDE-constrained optimization, HJB, and bilevel programming.
  • Research demo, not professional advice. Always verify solutions with a real solver.
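On the optimal-control side, one way to verify an LQR answer with a real solver is to check it against SciPy's Riccati solver. The system matrices below (a double integrator) are illustrative:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

A = np.array([[0.0, 1.0], [0.0, 0.0]])  # double integrator dynamics
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)  # Riccati solution
K = np.linalg.solve(R, B.T @ P)       # optimal gain, u = -Kx

# P should satisfy the CARE: A'P + PA - P B R^{-1} B'P + Q = 0
residual = A.T @ P + P @ A - P @ B @ np.linalg.solve(R, B.T @ P) + Q
print(K)                       # [[1. 1.73205081]]
print(np.abs(residual).max())  # ~0
```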

Verification app

A Gradio demo that runs OptimAI alongside CVXPY and SciPy solvers to cross-check its answers is available at https://huggingface.co/spaces/billwang37/optim-ai

Citation

@misc{wang2026optimai,
  author  = {Bill Wang},
  title   = {OptimAI 7B: A fine-tuned LLM for operations research and optimal control},
  year    = {2026},
  url     = {https://huggingface.co/billwang37/optim-ai-7b-v1},
}

Author: Bill Wang, University of Oklahoma.
