Please update llama.cpp to see improved performance!

#7
by danielhanchen - opened

Hey guys, please update llama.cpp to use the latest updates from 2 days ago. According to many people's reports and our own tests, you should see large improvements in Devstral 2 and other models for use cases like tool calling. Looping should also be reduced.

We'll be reconverting today and all should be reuploaded by tomorrow.

See these 2 pull requests and issues:
https://github.com/ggml-org/llama.cpp/pull/17945
https://github.com/ggml-org/llama.cpp/issues/17980


How can we tell that it's been updated?

just testing some arbitrary PHP code with Devstral 2 Small, Roo code and updated llama.cpp, works great so far (also better than Qwen3 Coder and Devstral 2507)

tested in roo code for a tetris game, it worked, no errors

What backend do you use? I can't get Roo Code to work; it tells me the model doesn't support tool calling...
Tried koboldcpp with jinja and jinja for tools enabled:

The model returned no assistant messages. This may indicate an issue with the API or the model's output. - koboldcpp over OpenAI Compatible

Date/time: 2026-01-10T17:03:04.836Z
Extension version: 3.39.2
Provider: ollama
Model: MODELFILE

The model provided text/reasoning but did not call any of the required tools. This usually indicates the model misunderstood the task or is having difficulty determining which tool to use. The model has been automatically prompted to retry with proper tool usage. - koboldcpp over ollama

and ollama afterwards with the same file and:

PARAMETER num_ctx 65536
PARAMETER temperature 0.7
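For anyone else trying this with ollama: the two PARAMETER lines above go into a Modelfile together with a FROM line pointing at the GGUF. A minimal sketch (the file path and model name below are placeholders, not the actual quant used in this thread):

```
# Modelfile — FROM path is a placeholder for whichever GGUF quant you downloaded
FROM ./devstral-2-small.gguf
PARAMETER num_ctx 65536
PARAMETER temperature 0.7
```

Then `ollama create devstral2 -f Modelfile` registers the model and `ollama run devstral2` runs it.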

If you're using ollama, which quant did you use, and did you install it through ollama itself?

If you're using llama.cpp, can you tell me how? I've never served a model with it.
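llama.cpp ships an OpenAI-compatible HTTP server (`llama-server`), so serving a GGUF is a single command. A minimal sketch, assuming you've already built or installed the llama.cpp binaries; the model filename below is a placeholder for your own quant:

```shell
# Serve a local GGUF with an OpenAI-compatible API on port 8080.
# --jinja enables the model's embedded chat template (needed for tool calling);
# -c sets the context window size.
llama-server -m ./devstral-2-small.gguf -c 65536 --jinja --port 8080
```

Then point Roo Code (or any OpenAI-compatible client) at http://localhost:8080/v1.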

I barely use llama.cpp/ollama, only tabbyapi/koboldcpp.
