Commit 9ffa5e3

feat(config): Make the gatekeeper model more configurable (#465)
Move the gatekeeper configuration into a nested member of our config object, so LINUX_MCP_GATEKEEPER_MODEL becomes LINUX_MCP_GATEKEEPER__MODEL (LINUX_MCP_GATEKEEPER_MODEL is still supported as a deprecated alias).

Add controls for:

- reasoning_effort: turning reasoning off or down often makes models perform better for us.
- structured_output: e.g. for gemma-4-31b-it, turning off response_format is needed to keep the model from going into an infinite loop.
- temperature: Anthropic models need a non-zero temperature to enable reasoning.
- quantization: OpenRouter mixes together models with different quantizations under a single model name; specifying a particular quantization is needed for clean benchmarking data.
- template_kwargs: set model-specific values in the chat template, e.g. `{"enable_thinking": false}` is useful for llama.cpp.
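
The rename from a single to a double underscore follows the usual convention for addressing nested settings from environment variables. The following is a minimal stdlib-only sketch of that mapping, not the project's implementation: the real config classes are pydantic-settings models, and the function name `nest_env` and the `"__"` delimiter default are assumptions made for illustration.

```python
from typing import Any, Mapping


def nest_env(
    environ: Mapping[str, str],
    prefix: str = "LINUX_MCP_",
    delimiter: str = "__",
) -> dict[str, Any]:
    """Turn prefixed environment variables into a nested dict (illustrative helper)."""
    nested: dict[str, Any] = {}
    for name, value in environ.items():
        if not name.startswith(prefix):
            continue  # unrelated variable, e.g. OPENAI_API_KEY
        # "GATEKEEPER__MODEL" -> ["gatekeeper", "model"]
        parts = name[len(prefix):].lower().split(delimiter)
        node = nested
        for part in parts[:-1]:
            node = node.setdefault(part, {})
        node[parts[-1]] = value
    return nested


example = nest_env(
    {
        "LINUX_MCP_TOOLSET": "both",
        "LINUX_MCP_GATEKEEPER__MODEL": "openai/some-model",
        "LINUX_MCP_GATEKEEPER__TEMPERATURE": "0.2",
        "OPENAI_API_KEY": "placeholder",
    }
)
print(example)
```

With a single underscore, `LINUX_MCP_GATEKEEPER_MODEL` would map to a flat `gatekeeper_model` key, which is why the old name needs separate alias handling.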
1 parent 867d26b commit 9ffa5e3

12 files changed

Lines changed: 268 additions & 57 deletions

.gitlab/ci/eval-gatekeeper.yml

Lines changed: 5 additions & 7 deletions
Original file line number | Diff line number | Diff line change
@@ -36,7 +36,6 @@ download-secure-files:
3636
paths:
3737
- securefiles/
3838

39-
4039
# ==========================================
4140
# EVAL WORKFLOW - TEMPLATES
4241
# ==========================================
@@ -68,7 +67,7 @@ download-secure-files:
6867
- export VERTEXAI_PROJECT="rhel-lightspeed-650189"
6968
- export VERTEXAI_LOCATION="global"
7069
artifacts:
71-
paths:
70+
paths:
7271
- data/
7372
expire_in: 1 hour
7473

@@ -77,23 +76,22 @@ download-secure-files:
7776
# ==========================================
7877

7978
# Vertex AI
80-
gatekeeper-eval-gpt-oss-120b:
79+
gatekeeper-eval-gpt-oss-120b:
8180
extends: .eval-base
8281
variables:
8382
MODEL_NAME: "gpt-oss-120b"
8483
script:
85-
- export LINUX_MCP_GATEKEEPER_MODEL="vertex_ai/openai/$MODEL_NAME-maas"
84+
- export LINUX_MCP_GATEKEEPER__MODEL="vertex_ai/openai/$MODEL_NAME-maas"
8685
- uv run --extra gcp eval/gatekeeper/run-eval.py --all -f json --output-all -o "$CI_PROJECT_DIR/data/$MODEL_NAME.json"
8786

88-
gatekeeper-eval-gemini-3.1-pro-preview:
87+
gatekeeper-eval-gemini-3.1-pro-preview:
8988
extends: .eval-base
9089
variables:
9190
MODEL_NAME: "gemini-3.1-pro-preview"
9291
script:
93-
- export LINUX_MCP_GATEKEEPER_MODEL="vertex_ai/$MODEL_NAME"
92+
- export LINUX_MCP_GATEKEEPER__MODEL="vertex_ai/$MODEL_NAME"
9493
- uv run --extra gcp eval/gatekeeper/run-eval.py --all -f json --output-all -o "$CI_PROJECT_DIR/data/$MODEL_NAME.json"
9594

96-
9795
# Models.corp
9896
gatekeeper-eval-models-corp-granite-4.0-h-small:
9997
extends: .eval-base

docs/config-reference.md

Lines changed: 8 additions & 3 deletions
@@ -56,10 +56,15 @@ See [Guarded Command Execution](guarded-command-execution.md) for details on the
5656
These are used when `LINUX_MCP_TOOLSET` is set to `run_script` or `both`.
5757

5858
| Option / Env Var | Default | Description |
59-
|------------------|---------|-------------|
60-
| `--gatekeeper-model`<br>`LINUX_MCP_GATEKEEPER_MODEL` | *(none)* | Required: [LiteLLM model name](https://docs.litellm.ai/docs/providers) to use |
59+
| ---------------- | ------- | ----------- |
6160
| `--always-confirm-scripts` / `--no-always-confirm-scripts`<br>`LINUX_MCP_ALWAYS_CONFIRM_SCRIPTS` | `False` | All scripts must be confirmed by the user |
62-
| Other environment variables | *(none)* | As required by the LiteLLM provider, e.g. `OPENAI_API_KEY` |
61+
| `--gatekeeper.model`<br>`LINUX_MCP_GATEKEEPER__MODEL` | _(none)_ | Required: [LiteLLM model name](https://docs.litellm.ai/docs/providers) to use |
62+
| `--gatekeeper.quantization`<br>`LINUX_MCP_GATEKEEPER__QUANTIZATION` | _(model specific)_ | _Not usually needed_ - Particular model quantization to use (openrouter only) |
63+
| `--gatekeeper.reasoning_effort`<br>`LINUX_MCP_GATEKEEPER__REASONING_EFFORT` | _(model specific)_ | Reasoning effort to use for gatekeeper model (`none`, `minimal`, `low`, `medium`, `high`, `xhigh`). Not all values are supported for all models. |
64+
| `--gatekeeper.structured_output`<br>`LINUX_MCP_GATEKEEPER__STRUCTURED_OUTPUT` | _(autodetected)_ | _Not usually needed_ - Whether to use structured output generation for the model. Default is to use if detected as available. |
65+
| `--gatekeeper.temperature`<br>`LINUX_MCP_GATEKEEPER__TEMPERATURE` | 0.0 | _Not usually needed_ - Temperature to use for model - for some models, a non-zero value may be necessary when enabling reasoning. |
66+
| `--gatekeeper.template_kwargs`<br>`LINUX_MCP_GATEKEEPER__TEMPLATE_KWARGS` | _(none)_ | _Not usually needed_ - Extra arguments for the model's chat template, formatted as a JSON string. Example: `{ "enable_thinking": false }` |
67+
| Other environment variables | _(none)_ | As required by the LiteLLM provider, e.g. `OPENAI_API_KEY` |
6368

6469
## Logging Configuration
6570

docs/guarded-command-execution.md

Lines changed: 2 additions & 2 deletions
@@ -106,14 +106,14 @@ LINUX_MCP_TOOLSET=run_script
106106

107107
**Configure a Gatekeeper Model**
108108

109-
Set `LINUX_MCP_GATEKEEPER_MODEL` to the name of the model you want to use. Additional environment
109+
Set `LINUX_MCP_GATEKEEPER__MODEL` to the name of the model you want to use. Additional environment
110110
variables may be needed to configure credentials. See the
111111
[LiteLLM documentation](https://docs.litellm.ai/docs/providers) for details on how to configure your provider.
112112

113113
Example:
114114

115115
```sh
116-
LINUX_MCP_GATEKEEPER_MODEL=openai/chatgpt-5.2
116+
LINUX_MCP_GATEKEEPER__MODEL=openai/chatgpt-5.2
117117
OPENAI_API_KEY=<....>
118118
```
119119

eval/gatekeeper/README.md

Lines changed: 1 addition & 1 deletion
@@ -64,7 +64,7 @@ Runs test cases through the gatekeeper and reports results.
6464

6565
```bash
6666
# Set the gatekeeper model
67-
export LINUX_MCP_GATEKEEPER_MODEL="openrouter/anthropic/claude-3.5-sonnet"
67+
export LINUX_MCP_GATEKEEPER__MODEL="openrouter/anthropic/claude-3.5-sonnet"
6868
6969
# Run evaluation on a single file
7070
uv run eval/gatekeeper/run-eval.py testcases/selinux-port-denial.yaml -o results.yaml

eval/gatekeeper/run-eval-models-corp.sh

Lines changed: 2 additions & 2 deletions
@@ -38,8 +38,8 @@ get_MC_base_url() {
3838
}
3939

4040
[[ -z "${MODEL}" ]] && echo "$MODEL must be set" && exit 1
41-
export LINUX_MCP_GATEKEEPER_MODEL
42-
LINUX_MCP_GATEKEEPER_MODEL="openai/$MODEL"
41+
export LINUX_MCP_GATEKEEPER__MODEL
42+
LINUX_MCP_GATEKEEPER__MODEL="openai/$MODEL"
4343
export OPENAI_API_BASE
4444
OPENAI_API_BASE="$(get_MC_base_url "$MODEL")"
4545

eval/gatekeeper/run-eval.py

Lines changed: 2 additions & 2 deletions
@@ -219,9 +219,9 @@ def main(
219219
typer.echo("Must specify either a test case file or --all.", err=True)
220220
raise typer.Exit(code=1)
221221

222-
if "LINUX_MCP_GATEKEEPER_MODEL" not in os.environ:
222+
if "LINUX_MCP_GATEKEEPER__MODEL" not in os.environ and "LINUX_MCP_GATEKEEPER_MODEL" not in os.environ:
223223
typer.echo(
224-
"Please set the LINUX_MCP_GATEKEEPER_MODEL environment variable to specify the Gatekeeper model to use."
224+
"Please set the LINUX_MCP_GATEKEEPER__MODEL environment variable to specify the Gatekeeper model to use."
225225
)
226226
raise typer.Exit(code=1)
227227

src/linux_mcp_server/config.py

Lines changed: 59 additions & 4 deletions
@@ -1,10 +1,14 @@
11
"""Settings for linux-mcp-server"""
22

3+
import logging
4+
import os
35
import sys
46

57
from pathlib import Path
8+
from typing import Any
69

710
from pydantic import Field
11+
from pydantic import model_validator
812
from pydantic import SecretStr
913
from pydantic_settings import BaseSettings
1014
from pydantic_settings import SettingsConfigDict
@@ -13,6 +17,9 @@
1317
from linux_mcp_server.utils.types import UpperCase
1418

1519

20+
logger = logging.getLogger(__name__)
21+
22+
1623
class Transport(StrEnum):
1724
stdio = "stdio"
1825
http = "http"
@@ -27,6 +34,18 @@ class Toolset(StrEnum):
2734
BOTH = "both"
2835

2936

37+
class ReasoningEffort(StrEnum):
38+
"""Reasoning effort levels for the gatekeeper model."""
39+
40+
NONE = "none"
41+
MINIMAL = "minimal"
42+
LOW = "low"
43+
MEDIUM = "medium"
44+
HIGH = "high"
45+
XHIGH = "xhigh"
46+
DEFAULT = "default"
47+
48+
3049
class AuthProvider(StrEnum):
3150
"""Authentication provider types."""
3251

@@ -78,6 +97,27 @@ class AuthConfig(BaseSettings):
7897
introspection: IntrospectionAuthConfig | None = None
7998

8099

100+
class GatekeeperConfig(BaseSettings):
101+
"""Gatekeeper Model configuration"""
102+
103+
model: str | None = None
104+
105+
# model quantization (e.g. fp8, bf16 - only supported for openrouter)
106+
quantization: str | None = None
107+
108+
# reasoning effort
109+
reasoning_effort: ReasoningEffort | None = None
110+
111+
# Whether we should use structured output (default, autodetect support)
112+
structured_output: bool | None = None
113+
114+
# dict of extra template keyword arguments
115+
template_kwargs: dict[str, Any] = Field(default_factory=dict)
116+
117+
# Temperature for gatekeeper model
118+
temperature: float = 0.0
119+
120+
81121
class Config(BaseSettings):
82122
# The '_' is required in the env_prefix, otherwise, pydantic would
83123
# interpret the prefix as LINUX_MCPLOG_DIR, instead of LINUX_MCP_LOG_DIR
@@ -127,8 +167,7 @@ class Config(BaseSettings):
127167
# What tools are available
128168
toolset: Toolset = Toolset.FIXED
129169

130-
# Gatekeeper model (required for run_script tools)
131-
gatekeeper_model: str | None = None
170+
gatekeeper: GatekeeperConfig = Field(default_factory=GatekeeperConfig)
132171

133172
# Command execution timeout (applies to both local and remote commands)
134173
command_timeout: int = 30 # Timeout in seconds; prevents hung commands
@@ -165,9 +204,25 @@ def transport_kwargs(self):
165204
#
166205
# @model_validator(mode="after")
167206
# def validate_gatekeeper_model(self):
168-
# if self.toolset != Toolset.FIXED and self.gatekeeper_model is None:
169-
# raise ValueError('gatekeeper_model must be set unless the toolset is "fixed"')
207+
# if self.toolset != Toolset.FIXED and self.gatekeeper.model is None:
208+
# raise ValueError('gatekeeper.model must be set unless the toolset is "fixed"')
170209
# return self
171210

211+
@model_validator(mode="before")
212+
@staticmethod
213+
def handle_deprecated_aliases(data: Any) -> Any:
214+
if isinstance(data, dict):
215+
old_value = os.environ.get("LINUX_MCP_GATEKEEPER_MODEL")
216+
if old_value is not None:
217+
logger.warning(
218+
"LINUX_MCP_GATEKEEPER_MODEL is deprecated. Please use LINUX_MCP_GATEKEEPER__MODEL instead.",
219+
)
220+
221+
gatekeeper_data = data.setdefault("gatekeeper", {})
222+
if isinstance(gatekeeper_data, dict) and "model" not in gatekeeper_data:
223+
gatekeeper_data["model"] = old_value
224+
225+
return data
226+
172227

173228
CONFIG = Config()
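
The deprecated-alias handling in `handle_deprecated_aliases` can be read as a small precedence rule: the old variable only fills in `gatekeeper.model` when the new one has not already supplied it. Below is a hedged sketch of that rule as a standalone function. The name `apply_deprecated_alias` and the explicit `environ` parameter are illustrative assumptions (the committed validator reads `os.environ` directly), chosen so the behaviour can be exercised without touching the process environment.

```python
import logging
from typing import Any, Mapping

logger = logging.getLogger(__name__)

OLD_NAME = "LINUX_MCP_GATEKEEPER_MODEL"
NEW_NAME = "LINUX_MCP_GATEKEEPER__MODEL"


def apply_deprecated_alias(data: dict[str, Any], environ: Mapping[str, str]) -> dict[str, Any]:
    """Copy the deprecated variable into data["gatekeeper"]["model"] if unset."""
    old_value = environ.get(OLD_NAME)
    if old_value is None:
        return data  # nothing to migrate

    # As in the diff, the warning fires whenever the old name is present,
    # even if the new name ends up taking precedence.
    logger.warning("%s is deprecated. Please use %s instead.", OLD_NAME, NEW_NAME)

    gatekeeper = data.setdefault("gatekeeper", {})
    if isinstance(gatekeeper, dict) and "model" not in gatekeeper:
        gatekeeper["model"] = old_value
    return data


only_old = apply_deprecated_alias({}, {OLD_NAME: "openai/old"})
both_set = apply_deprecated_alias({"gatekeeper": {"model": "openai/new"}}, {OLD_NAME: "openai/old"})
```

Here `only_old` picks up the deprecated value, while `both_set` keeps the model that was already present under the new name.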

src/linux_mcp_server/gatekeeper/check_run_script.py

Lines changed: 49 additions & 9 deletions
@@ -1,5 +1,7 @@
11
import logging
22

3+
from typing import Any
4+
35
import litellm
46

57
from litellm import Choices
@@ -9,6 +11,7 @@
911
from pydantic import BaseModel
1012

1113
from linux_mcp_server.config import CONFIG
14+
from linux_mcp_server.config import ReasoningEffort
1215
from linux_mcp_server.utils import StrEnum
1316

1417

@@ -23,10 +26,10 @@
2326

2427

2528
def get_model() -> str:
26-
if CONFIG.gatekeeper_model is None:
27-
raise ValueError("To use run_script tools, you must set gatekeeper_model in the linux-mcp-server config")
29+
if CONFIG.gatekeeper.model:
30+
return CONFIG.gatekeeper.model
2831
else:
29-
return CONFIG.gatekeeper_model
32+
raise ValueError("To use run_script tools, you must set LINUX_MCP_GATEKEEPER__MODEL")
3033

3134

3235
READONLY_INSTRUCTION = """
@@ -184,6 +187,42 @@ def parse_from_description(cls, description: str) -> "GatekeeperResult":
184187
return cls(status=status, detail=detail)
185188

186189

190+
def _build_completion_kwargs():
191+
extra_kwargs: dict[str, Any] = {}
192+
model = get_model()
193+
194+
structured_output = CONFIG.gatekeeper.structured_output
195+
if structured_output is None:
196+
params = get_supported_openai_params(model=model)
197+
structured_output = params is not None and "response_format" in params
198+
199+
if structured_output:
200+
extra_kwargs["response_format"] = GatekeeperResult
201+
202+
reasoning_effort = CONFIG.gatekeeper.reasoning_effort
203+
if reasoning_effort is not None:
204+
if model.startswith("openrouter/"):
205+
if reasoning_effort == ReasoningEffort.NONE:
206+
extra_kwargs["reasoning"] = {"enabled": False}
207+
else:
208+
extra_kwargs["reasoning"] = {"enabled": True, "effort": reasoning_effort.value}
209+
else:
210+
extra_kwargs["reasoning_effort"] = reasoning_effort.value
211+
212+
if model.startswith("openrouter/"):
213+
provider: dict[str, Any] = {
214+
"require_parameters": True,
215+
}
216+
extra_kwargs["provider"] = provider
217+
if CONFIG.gatekeeper.quantization:
218+
provider["quantizations"] = [CONFIG.gatekeeper.quantization]
219+
220+
if CONFIG.gatekeeper.template_kwargs:
221+
extra_kwargs["chat_template_kwargs"] = CONFIG.gatekeeper.template_kwargs
222+
223+
return extra_kwargs
224+
225+
187226
def check_run_script(description: str, script_type: str, script: str, *, readonly: bool) -> GatekeeperResult:
188227
# Check that the script does what is described
189228
if "start_of_script" in script.lower() or "end_of_script" in script.lower():
@@ -207,13 +246,14 @@ def check_run_script(description: str, script_type: str, script: str, *, readonl
207246

208247
messages = [{"role": "user", "content": prompt}]
209248

210-
params = get_supported_openai_params(model=get_model())
211-
if params is not None and "response_format" in params:
212-
response_format = GatekeeperResult
213-
else:
214-
response_format = None
249+
extra_kwargs = _build_completion_kwargs()
215250

216-
response = completion(model=get_model(), messages=messages, response_format=response_format, temperature=0)
251+
response = completion(
252+
model=get_model(),
253+
messages=messages,
254+
temperature=CONFIG.gatekeeper.temperature,
255+
**extra_kwargs,
256+
)
217257
assert isinstance(response, ModelResponse)
218258
assert isinstance(response.choices[0], Choices)
219259
response_text = (response.choices[0].message.content or "").strip()
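
To make the branching in `_build_completion_kwargs` easier to follow, here is a dependency-free sketch of the same decisions as a pure function. It is an approximation for illustration only: the function name `build_completion_kwargs`, its parameters, and the `supports_response_format` flag (standing in for the `get_supported_openai_params` lookup) are assumptions, and the string `"GatekeeperResult"` is a placeholder for the pydantic class passed in the real code.

```python
from typing import Any, Optional


def build_completion_kwargs(
    model: str,
    *,
    reasoning_effort: Optional[str] = None,
    structured_output: Optional[bool] = None,
    supports_response_format: bool = False,
    quantization: Optional[str] = None,
    template_kwargs: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Assemble the extra keyword arguments for the completion call (sketch)."""
    kwargs: dict[str, Any] = {}
    is_openrouter = model.startswith("openrouter/")

    # None means "autodetect": fall back to what the provider reports.
    if structured_output is None:
        structured_output = supports_response_format
    if structured_output:
        kwargs["response_format"] = "GatekeeperResult"  # placeholder for the real class

    if reasoning_effort is not None:
        if is_openrouter:
            # OpenRouter takes a nested "reasoning" object.
            if reasoning_effort == "none":
                kwargs["reasoning"] = {"enabled": False}
            else:
                kwargs["reasoning"] = {"enabled": True, "effort": reasoning_effort}
        else:
            kwargs["reasoning_effort"] = reasoning_effort

    if is_openrouter:
        provider: dict[str, Any] = {"require_parameters": True}
        if quantization:
            provider["quantizations"] = [quantization]
        kwargs["provider"] = provider

    if template_kwargs:
        kwargs["chat_template_kwargs"] = template_kwargs
    return kwargs


openrouter_kwargs = build_completion_kwargs(
    "openrouter/vendor/model", reasoning_effort="none", quantization="fp8"
)
other_kwargs = build_completion_kwargs(
    "openai/model", reasoning_effort="low", supports_response_format=True
)
```

Note that, as in the diff, the quantization setting is silently ignored for any model whose name does not start with `openrouter/`.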

src/linux_mcp_server/server.py

Lines changed: 2 additions & 2 deletions
@@ -178,8 +178,8 @@ def _current_toolset():
178178

179179

180180
def _check_gatekeeper_model():
181-
if CONFIG.toolset != Toolset.FIXED and CONFIG.gatekeeper_model is None:
182-
logger.error("LINUX_MCP_GATEKEEPER_MODEL not set, this is needed for run_script tools")
181+
if CONFIG.toolset != Toolset.FIXED and CONFIG.gatekeeper.model is None:
182+
logger.error("LINUX_MCP_GATEKEEPER__MODEL not set, this is needed for run_script tools")
183183
sys.exit(1)
184184

185185

tests/conftest.py

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@
88
# Register script tools on the in-process MCP instance used by ``mcp_client``.
99
# Default CLI/config is FIXED-only; tests need validate_script / run_script / etc.
1010
os.environ.setdefault("LINUX_MCP_TOOLSET", "both")
11-
os.environ.setdefault("LINUX_MCP_GATEKEEPER_MODEL", "test/gatekeeper-placeholder")
11+
os.environ.setdefault("LINUX_MCP_GATEKEEPER__MODEL", "test/gatekeeper-placeholder")
1212

1313
import pytest
1414
