Misc. bug: --grammar-file does nothing with llama-server, but API calls that pass a "grammar" field work fine #21262

@9iqdispatcher

Name and Version

llama-server --version
ggml_cuda_init: found 1 CUDA devices (Total VRAM: 8150 MiB):
Device 0: NVIDIA GeForce RTX 5060 Laptop GPU, compute capability 12.0, VMM: yes, VRAM: 8150 MiB
version: 8616 (ced5734)
built with MSVC 19.50.35728.0 for Windows AMD64

Operating systems

Windows

Which llama.cpp modules do you know to be affected?

llama-server

Command line

llama-server --grammar-file grammar.gbnf -m qwen_qwen3.5-0.8b-q8_0.gguf

Problem description & steps to reproduce

The server ignores the grammar file passed via the --grammar-file command-line flag, but honors API requests that pass a "grammar" field.
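Since the per-request path still works, the grammar can be sent inline with each request until the flag is honored again. A minimal sketch of that, assuming the server is listening on the default http://127.0.0.1:8080 and using the "grammar" field of the /completion endpoint. The helper names, the n_predict value, and the example prompt are illustrative, not part of the report.

```python
import json
import urllib.request


def build_completion_payload(prompt, grammar_text, n_predict=64):
    """Build a /completion request body that carries the GBNF grammar inline."""
    return {"prompt": prompt, "n_predict": n_predict, "grammar": grammar_text}


def post_completion(payload, base_url="http://127.0.0.1:8080"):
    """POST the payload to a running llama-server and return the parsed JSON reply.

    base_url is an assumption (default host and port); adjust it to your setup.
    """
    req = urllib.request.Request(
        base_url + "/completion",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

To use it, read grammar.gbnf into a string, pass it as grammar_text, and call post_completion(build_completion_payload("Answer yes or no:", grammar)). If the output follows the grammar here but not when only --grammar-file is given, that matches the behavior described above.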

Possibly caused by commit 5e54d51, as it removed defaults.sampling.grammar from the initialization process (the grammar is now default-initialized to an empty string) and seems to depend on the grammar field having been sent through the API.
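The suspected regression can be illustrated with a toy model. This is not llama.cpp code: the function names and the dictionary-based request are hypothetical, and the sketch only contrasts falling back to the command-line default with falling back to an empty string.

```python
def resolve_grammar_expected(request, cli_default_grammar):
    """Expected: use the request's grammar if sent, else the --grammar-file default."""
    return request.get("grammar", cli_default_grammar)


def resolve_grammar_suspected(request, cli_default_grammar):
    """Suspected: the command-line default is never consulted, so the fallback is empty."""
    return request.get("grammar", "")
```

In this model both functions agree whenever the request carries a "grammar" field, which would explain why API calls work while the flag alone has no effect.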

First Bad Commit

No response

Relevant log output

No response
