Commit 73cfedc

localai-bot and mudler authored
fix: tool-call JSON leaks into content with stream+tools on tokenizer-template models (#10052) (#10057)
* fix(grammars): honor properties_order entry at index 0

  The JSON-schema-to-GBNF property sort used `aOrder != 0 && bOrder != 0` as its "is this key ordered?" guard. That treats index 0 (the first key listed in properties_order) as unset, so `properties_order: name,arguments` fell back to alphabetical ordering and still emitted "arguments" before "name".

  Use presence in the order map instead: listed keys sort by their index and ahead of unlisted keys, which keep a stable alphabetical order. This makes the documented `properties_order: name,arguments` actually produce name-first tool-call JSON.

  Relates to #10052.

  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
  Assisted-by: Claude:claude-opus-4-8 [Claude Code]

* fix(functions): defer tool grammar to the backend when the tokenizer template owns templating (#10052)

  When use_tokenizer_template delegates templating to the backend (llama.cpp), the backend also owns tool-call grammar generation and parsing. LocalAI was still generating its own GBNF grammar and sending it down. With a grammar present, llama.cpp does not hand the tools to its template, so its native peg/json tool parser never engages: it streams the grammar-constrained tool-call JSON back as plain content instead of emitting tool_calls. In streaming mode the JSON object leaked into the content field, and the Go-side incremental detector never gated content because the LocalAI-generated grammar emitted "arguments" before "name".

  The GGUF auto-import path already couples use_tokenizer_template with grammar.disable, but that block is skipped when a template is already configured, so gallery and hand-written configs (e.g. qwen3) that set the tokenizer template directly never got the paired grammar.disable.

  - SetDefaults now enforces the coupling for every config: when use_tokenizer_template is set, grammar generation is disabled and tools flow to the backend's native (name-first) pipeline. This also fixes already-installed models without editing each config.
  - Set function.grammar.disable in the shared gallery/qwen3.yaml, which is the base config referenced by every qwen3 gallery entry.

  Verified end to end against qwen3-4b with stream:true + tools: content no longer carries the tool-call JSON, reasoning is classified separately, and tool calls stream as proper name-first tool_calls deltas.

  Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
  Assisted-by: Claude:claude-opus-4-8 [Claude Code]

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
1 parent b982c97 commit 73cfedc

5 files changed

Lines changed: 119 additions & 4 deletions

core/config/model_config.go

Lines changed: 11 additions & 0 deletions
@@ -732,6 +732,17 @@ func (cfg *ModelConfig) SetDefaults(opts ...ConfigLoaderOption) {
 		cfg.Proxy.Mode = ProxyModePassthrough
 	}

+	// When templating is delegated to the backend (use_tokenizer_template),
+	// the backend also owns tool-call grammar generation and parsing. Sending
+	// a LocalAI-generated grammar alongside overrides the backend's native
+	// (name-first) tool pipeline and makes it stream the tool-call JSON back as
+	// plain content (issue #10052). The GGUF auto-import path already couples
+	// these two flags; enforce it here so gallery and hand-written configs that
+	// set use_tokenizer_template directly stay consistent.
+	if cfg.TemplateConfig.UseTokenizerTemplate {
+		cfg.FunctionsConfig.GrammarConfig.NoGrammar = true
+	}
+
 	// Apply model-family-specific inference defaults before generic fallbacks.
 	// This ensures gallery-installed and runtime-loaded models get optimal parameters.
 	ApplyInferenceDefaults(cfg, cfg.Name, cfg.Model)
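The coupling added to SetDefaults can be exercised in isolation. The following is a minimal sketch, not LocalAI's actual code: the struct types are hypothetical, trimmed-down stand-ins that model only the two fields the hunk touches, and `applyTemplateCoupling` is an illustrative name for the new `if` block.

```go
package main

import "fmt"

// Hypothetical stand-ins for the LocalAI config types. Only the two fields
// referenced by the hunk above are modelled here.
type TemplateConfig struct{ UseTokenizerTemplate bool }
type GrammarConfig struct{ NoGrammar bool }
type FunctionsConfig struct{ GrammarConfig GrammarConfig }

type ModelConfig struct {
	TemplateConfig  TemplateConfig
	FunctionsConfig FunctionsConfig
}

// applyTemplateCoupling mirrors the new block in SetDefaults: when the
// backend owns templating, Go-side grammar generation is switched off.
// The function name is an assumption made for this sketch.
func applyTemplateCoupling(cfg *ModelConfig) {
	if cfg.TemplateConfig.UseTokenizerTemplate {
		cfg.FunctionsConfig.GrammarConfig.NoGrammar = true
	}
}

func main() {
	backendTemplated := &ModelConfig{
		TemplateConfig: TemplateConfig{UseTokenizerTemplate: true},
	}
	goTemplated := &ModelConfig{}

	applyTemplateCoupling(backendTemplated)
	applyTemplateCoupling(goTemplated)

	fmt.Println(backendTemplated.FunctionsConfig.GrammarConfig.NoGrammar) // true
	fmt.Println(goTemplated.FunctionsConfig.GrammarConfig.NoGrammar)      // false
}
```

As written in the hunk, the coupling is one-way: it only ever sets NoGrammar to true, and a config that does not use the tokenizer template is left untouched.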

core/config/model_config_test.go

Lines changed: 29 additions & 0 deletions
@@ -471,4 +471,33 @@ concurrency_groups:
 			Expect(configs[0].GetConcurrencyGroups()).To(Equal([]string{"vram-heavy", "120b"}))
 		})
 	})
+
+	// When templating is delegated to the backend (use_tokenizer_template),
+	// the backend also owns tool-call grammar generation and parsing. A
+	// LocalAI-generated grammar sent alongside would override the backend's
+	// native (name-first) tool pipeline and make it stream the tool-call JSON
+	// back as plain content (issue #10052). SetDefaults must therefore couple
+	// the two: tokenizer template implies grammar generation is disabled.
+	Context("use_tokenizer_template couples with grammar disable (issue #10052)", func() {
+		It("disables Go grammar generation when the tokenizer template is used", func() {
+			cfg := &ModelConfig{
+				TemplateConfig: TemplateConfig{UseTokenizerTemplate: true},
+			}
+			Expect(cfg.FunctionsConfig.GrammarConfig.NoGrammar).To(BeFalse())
+
+			cfg.SetDefaults()
+
+			Expect(cfg.FunctionsConfig.GrammarConfig.NoGrammar).To(BeTrue(),
+				"use_tokenizer_template must imply grammar.disable so tools go to the backend's native pipeline")
+		})
+
+		It("leaves grammar generation enabled when the tokenizer template is not used", func() {
+			cfg := &ModelConfig{}
+
+			cfg.SetDefaults()
+
+			Expect(cfg.FunctionsConfig.GrammarConfig.NoGrammar).To(BeFalse(),
+				"models that template in Go still rely on the Go-generated grammar")
+		})
+	})
 })

gallery/qwen3.yaml

Lines changed: 7 additions & 0 deletions
@@ -17,6 +17,13 @@ config_file: |
   # "pure content" PEG parser that leaks reasoning tags into content.
   options:
   - use_jinja:true
+  # With use_tokenizer_template the backend (llama.cpp) owns tool-call
+  # grammar generation and parsing too. Disabling LocalAI's own grammar lets
+  # llama.cpp's native name-first tool pipeline run; otherwise the generated
+  # grammar overrides it and the tool-call JSON leaks into content (#10052).
+  function:
+    grammar:
+      disable: true
   template:
     use_tokenizer_template: true
 name: qwen3

pkg/functions/grammars/json_schema.go

Lines changed: 14 additions & 4 deletions
@@ -155,12 +155,22 @@ func (sc *JSONSchemaConverter) visit(schema map[string]any, name string, rootSch
 			propName   string
 			propSchema map[string]any
 		}) int {
-			aOrder := propOrder[a.propName]
-			bOrder := propOrder[b.propName]
-			if aOrder != 0 && bOrder != 0 {
+			// Use presence in the order map (not a non-zero sentinel) so that
+			// the first listed key (index 0) is honored. Keys present in
+			// properties_order sort by their index and ahead of any key that
+			// isn't listed; unlisted keys keep a stable alphabetical order.
+			aOrder, aOK := propOrder[a.propName]
+			bOrder, bOK := propOrder[b.propName]
+			switch {
+			case aOK && bOK:
 				return cmp.Compare(aOrder, bOrder)
+			case aOK:
+				return -1
+			case bOK:
+				return 1
+			default:
+				return cmp.Compare(a.propName, b.propName)
 			}
-			return cmp.Compare(a.propName, b.propName)
 		})

 		var rule strings.Builder

pkg/functions/grammars/json_schema_test.go

Lines changed: 58 additions & 0 deletions
@@ -547,3 +547,61 @@ realvalue
 		})
 	})
 })
+
+var _ = Describe("JSON schema property ordering (issue #10052)", func() {
+	// A function-call shaped schema. The grammar must honor the configured
+	// properties_order. Before the fix, the sort guard `aOrder != 0 && bOrder != 0`
+	// treated the first listed key (index 0) as "unset" and fell back to
+	// alphabetical order, so "arguments" was emitted before "name" even when
+	// properties_order put name first.
+	const schema = `{
+		"type": "object",
+		"properties": {
+			"name": {"type": "string"},
+			"arguments": {"type": "object", "properties": {"cmd": {"type": "string"}}}
+		}
+	}`
+
+	// keyIndex finds the position of an object-key literal (escaped as \"key\"
+	// in GBNF), which only appears where the key is emitted in the rule, not
+	// in derived rule names like root-name.
+	keyIndex := func(grammar, key string) int {
+		return strings.Index(grammar, `\"`+key+`\"`)
+	}
+
+	It("honors properties_order with name listed first (index 0)", func() {
+		grammar, err := NewJSONSchemaConverter("name,arguments").GrammarFromBytes([]byte(schema))
+		Expect(err).To(BeNil())
+		ni := keyIndex(grammar, "name")
+		ai := keyIndex(grammar, "arguments")
+		Expect(ni).To(BeNumerically(">=", 0))
+		Expect(ai).To(BeNumerically(">=", 0))
+		Expect(ni).To(BeNumerically("<", ai),
+			"properties_order lists name first, so the grammar must emit \"name\" before \"arguments\"")
+	})
+
+	It("keeps alphabetical order when properties_order is empty", func() {
+		grammar, err := NewJSONSchemaConverter("").GrammarFromBytes([]byte(schema))
+		Expect(err).To(BeNil())
+		// No explicit order: keys fall back to alphabetical, so "arguments"
+		// precedes "name". This is the documented default and must not change.
+		Expect(keyIndex(grammar, "arguments")).To(BeNumerically("<", keyIndex(grammar, "name")))
+	})
+
+	It("sorts keys present in properties_order ahead of unlisted keys", func() {
+		const schemaWithExtra = `{
+			"type": "object",
+			"properties": {
+				"name": {"type": "string"},
+				"arguments": {"type": "object", "properties": {"cmd": {"type": "string"}}},
+				"aaa_unlisted": {"type": "string"}
+			}
+		}`
+		// "aaa_unlisted" is alphabetically first but not in the order list, so
+		// it must still come after the listed name/arguments keys.
+		grammar, err := NewJSONSchemaConverter("name,arguments").GrammarFromBytes([]byte(schemaWithExtra))
+		Expect(err).To(BeNil())
+		Expect(keyIndex(grammar, "name")).To(BeNumerically("<", keyIndex(grammar, "arguments")))
+		Expect(keyIndex(grammar, "arguments")).To(BeNumerically("<", keyIndex(grammar, "aaa_unlisted")))
+	})
+})
