
Commit dabd63c

docs: update constrained grammars with vLLM structured output support
Update the compatibility notice to include vLLM alongside llama.cpp. Add a vLLM-specific section with examples for all three supported methods: json_schema, json_object, and grammar (via xgrammar). Ref: #6857
1 parent a411f74 commit dabd63c

File tree

1 file changed: +58 −1 lines changed

docs/content/features/constrained_grammars.md

Lines changed: 58 additions & 1 deletion
@@ -10,7 +10,11 @@ url = "/features/constrained_grammars/"
 
 The `chat` endpoint supports the `grammar` parameter, which allows users to specify a grammar in Backus-Naur Form (BNF). This feature enables the Large Language Model (LLM) to generate outputs adhering to a user-defined schema, such as `JSON`, `YAML`, or any other format that can be defined using BNF. For more details about BNF, see [Backus-Naur Form on Wikipedia](https://en.wikipedia.org/wiki/Backus%E2%80%93Naur_form).
 
 {{% notice note %}}
-**Compatibility Notice:** This feature is only supported by models that use the [llama.cpp](https://github.com/ggerganov/llama.cpp) backend. For a complete list of compatible models, refer to the [Model Compatibility]({{%relref "reference/compatibility-table" %}}) page. For technical details, see the related pull requests: [PR #1773](https://github.com/ggerganov/llama.cpp/pull/1773) and [PR #1887](https://github.com/ggerganov/llama.cpp/pull/1887).
+**Compatibility Notice:** Grammar and structured output support is available for the following backends:
+- **llama.cpp** — supports the `grammar` parameter (GBNF syntax) and `response_format` with `json_schema`/`json_object`
+- **vLLM** — supports the `grammar` parameter (via xgrammar), `response_format` with `json_schema` (native JSON schema enforcement), and `json_object`
+
+For a complete list of compatible models, refer to the [Model Compatibility]({{%relref "reference/compatibility-table" %}}) page.
 {{% /notice %}}
 
 ## Setup
@@ -66,6 +70,59 @@ For more complex grammars, you can define multi-line BNF rules. The grammar pars
 - Character classes (`[a-z]`)
 - String literals (`"text"`)
 
+## vLLM Backend
+
+The vLLM backend supports structured output via three methods:
+
+### JSON Schema (recommended)
+
+Use the OpenAI-compatible `response_format` parameter with `json_schema` to enforce a specific JSON structure:
+
+```bash
+curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
+  "model": "my-vllm-model",
+  "messages": [{"role": "user", "content": "Generate a person object"}],
+  "response_format": {
+    "type": "json_schema",
+    "json_schema": {
+      "name": "person",
+      "schema": {
+        "type": "object",
+        "properties": {
+          "name": {"type": "string"},
+          "age": {"type": "integer"}
+        },
+        "required": ["name", "age"]
+      }
+    }
+  }
+}'
+```
+
+### JSON Object
+
+Force the model to output valid JSON (without a specific schema):
+
+```bash
+curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
+  "model": "my-vllm-model",
+  "messages": [{"role": "user", "content": "Generate a person as JSON"}],
+  "response_format": {"type": "json_object"}
+}'
+```
+
+### Grammar
+
+The `grammar` parameter also works with vLLM via xgrammar:
+
+```bash
+curl http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
+  "model": "my-vllm-model",
+  "messages": [{"role": "user", "content": "Do you like apples?"}],
+  "grammar": "root ::= (\"yes\" | \"no\")"
+}'
+```
+
 ## Related Features
 
 - [OpenAI Functions]({{%relref "features/openai-functions" %}}) - Function calling with structured outputs
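The `json_schema` request added in this commit can also be sanity-checked offline. A minimal Python sketch using only the standard library: it rebuilds the docs' request payload and checks a sample reply against the schema's `required` fields. The `conforms` helper and the sample reply are illustrative additions, not part of the commit or of any vLLM/LocalAI API.

```python
import json

# Request body mirroring the docs' json_schema example (person schema).
payload = {
    "model": "my-vllm-model",
    "messages": [{"role": "user", "content": "Generate a person object"}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "person",
            "schema": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "age": {"type": "integer"},
                },
                "required": ["name", "age"],
            },
        },
    },
}

def conforms(doc: dict, schema: dict) -> bool:
    """Shallow check: every required key is present with its declared type.

    Illustrative helper only; a real validator (e.g. a JSON Schema library)
    would also handle nesting, formats, and additional constraints.
    """
    type_map = {"string": str, "integer": int, "object": dict}
    props = schema.get("properties", {})
    return all(
        key in doc and isinstance(doc[key], type_map[props[key]["type"]])
        for key in schema.get("required", [])
    )

schema = payload["response_format"]["json_schema"]["schema"]

# Illustrative model output: what a schema-constrained reply should look like.
reply = json.loads('{"name": "Ada", "age": 36}')
print(conforms(reply, schema))              # -> True
print(conforms({"name": "Ada"}, schema))    # -> False (missing "age")
```

With schema enforcement active on the server, a well-formed reply should always pass such a check; the check is mainly useful when falling back to `json_object`, which guarantees valid JSON but not a particular shape.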

0 commit comments