Unable to get docker model package to package with --chat-template #894

@lwsrbrts

Description

I'm trying to create a new model which has thinking disabled. My original issue, which I assumed a packaged model would resolve, is that using Open WebUI to run Qwen 3.5 9B on a remote GPU workstation results in it "thinking", no matter what. The Open WebUI option to disable thinking doesn't have any effect on the model.

The context size setting in Open WebUI likewise has no effect on the model's context. Since I'd had success extending the context size to 40960 using docker model package, my thought was that I could do the same and include a new chat template which specifically disables thinking.

Using Docker's native model called qwen3.5:9B-UD-Q4_K_XL, I've tried the following commands, which all result in the following error:

Failed to package model
package model: failed to load packaged model: loading model archive: load failed with status 500 Internal Server Error: error while loading model: write manifest: missing blob "sha256:a427b36b6993cda8fd07037a8ac96b06be12be49dedecd01e0c6cff13276ade8" for manifest - refusing to write unless all blobs exist

Note that adding or removing single or double quotes around the paths has absolutely no effect on the outcome, so I've not included those examples. I've also tried both CRLF (Windows) and LF (Unix) line endings in the chat template, and saved it with both .txt and .jinja extensions (not that it should make any difference):

Package with --from and a chat template - doesn't work.

docker model package --from ai/qwen3.5:9B-UD-Q4_K_XL --chat-template D:\Docker\qwen3-nothinking.txt --context-size 40960 lwsrbrts/qwen35-instruct:9b-40k

I also resorted to downloading the Unsloth GGUF and the mmproj from Hugging Face, still using my own chat template (adding nothing more than {%- set enable_thinking = false %} at line 3), but I get the exact same error about a missing blob. (Side note: Unsloth are adamant that thinking is disabled in the 9B model, but no matter what, it still thinks when accessed from Open WebUI.)
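For anyone wanting to reproduce the template edit, it can be scripted. This is a minimal sketch, not the exact method used above: the function names `insert_disable_thinking` and `write_patched_template` are hypothetical, and it assumes the template is UTF-8 text. It inserts the override so that it becomes line 3 and writes the result with LF line endings only.

```python
# Hypothetical helper names; only the inserted Jinja line comes from the issue text.
DISABLE_LINE = "{%- set enable_thinking = false %}"


def insert_disable_thinking(template_text: str, line_number: int = 3) -> str:
    """Return the template with DISABLE_LINE inserted as line `line_number` (1-indexed)."""
    # splitlines() handles both CRLF and LF input.
    lines = template_text.splitlines()
    # Clamp so a very short template still gets the line appended at the end.
    index = min(max(line_number - 1, 0), len(lines))
    lines.insert(index, DISABLE_LINE)
    # Re-join with LF only, so the output has consistent Unix line endings.
    return "\n".join(lines) + "\n"


def write_patched_template(src_path: str, dst_path: str) -> None:
    """Read a template file, insert the override, and write it back out with LF endings."""
    with open(src_path, "r", encoding="utf-8") as src:
        text = src.read()
    # newline="\n" stops Python from translating LF to CRLF on Windows.
    with open(dst_path, "w", encoding="utf-8", newline="\n") as dst:
        dst.write(insert_disable_thinking(text))
```

Calling `write_patched_template` with the original template path and a new output path produces a file that can then be passed to --chat-template as in the commands here.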

Package from gguf, mmproj and a chat template - doesn't work.

docker model package --gguf D:\Docker\Qwen3.5-9B-UD-Q4_K_XL.gguf --chat-template D:\Docker\qwen3-nothinking.jinja --mmproj D:\Docker\mmproj-BF16.gguf --context-size 40960 lwsrbrts/qwen35-instruct:9b-40k

The GGUF package gives this output. All looks absolutely fine right up until the end (note that the Transferred info is updated when loading the GGUF and it does count up to 5.6GB, then it does the mmproj, which is 879MB):

PS D:\Docker> docker model package --gguf D:\Docker\Qwen3.5-9B-UD-Q4_K_XL.gguf --chat-template D:\Docker\qwen3-nothinking.jinja --mmproj D:\Docker\mmproj-BF16.gguf --context-size 40960 lwsrbrts/qwen35-instruct:9b-40k
Adding GGUF file from "D:\\Docker\\Qwen3.5-9B-UD-Q4_K_XL.gguf"
Setting context size 40960
Adding chat template file from "D:\\Docker\\qwen3-nothinking.jinja"
Adding multimodal projector file from "D:\\Docker\\mmproj-BF16.gguf"
Loading model to Model Runner...
Transferred: 879.01 MB
Failed to package model
package model: failed to load packaged model: loading model archive: load failed with status 500 Internal Server Error: error while loading model: write manifest: missing blob "sha256:a427b36b6993cda8fd07037a8ac96b06be12be49dedecd01e0c6cff13276ade8" for manifest - refusing to write unless all blobs exist

As mentioned, simply changing the context size works as expected and creates the model variant, so all eyes are on the chat template, but I have no indication of what is wrong with it. The file is clearly not "missing" as the error suggests, nor is there anything wrong with the line endings, the name of the file, its location, or anything else I can think of.
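One check that might narrow this down is to hash the input files locally and compare them with the digest in the error. This is only a sketch, and it rests on an assumption I have not confirmed: that docker model package stores each file as a blob whose digest is the SHA-256 of the raw file bytes. The helper names `file_digest` and `find_matching` are hypothetical.

```python
import hashlib

# Digest copied from the error message above.
MISSING_BLOB = "sha256:a427b36b6993cda8fd07037a8ac96b06be12be49dedecd01e0c6cff13276ade8"


def file_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """Return 'sha256:<hex>' for the raw bytes of the file at `path`."""
    hasher = hashlib.sha256()
    # Read in chunks so multi-gigabyte GGUF files do not need to fit in memory.
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return "sha256:" + hasher.hexdigest()


def find_matching(paths, target: str = MISSING_BLOB):
    """Return the subset of `paths` whose digest equals `target`."""
    return [path for path in paths if file_digest(path) == target]
```

Passing the chat template, GGUF and mmproj paths to `find_matching` returns whichever file hashes to the missing digest. If the chat template is the match, the file is being read but its blob is not being written; if nothing matches, either the assumption above is wrong or the blob is something generated during packaging.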
