fix: propagate tool validation errors in gateway refresh response (#5290)#5317
Conversation
|
Hi @MohanLaksh Few comments here
|
Lang-Akshay
left a comment
There was a problem hiding this comment.
Thanks for the PR @MohanLaksh .
The core fix is correct and surgically targeted.
) The _refresh_gateway_tools_resources_prompts method was discarding the 5th return value from _initialize_gateway() (validation_errors), causing GatewayRefreshResponse.validation_errors to always be empty. - Capture validation_errors instead of discarding with _ - Populate result['validation_errors'] after successful init - Add tests covering manual refresh, auto-refresh, and API response Closes #5290 Signed-off-by: Mohan Lakshmaiah <mohan.economist@gmail.com>
4fc2764 to
c669cff
Compare
msureshkumar88
left a comment
There was a problem hiding this comment.
The core fix is correct and well-scoped — capturing the 5th return value from _initialize_gateway() and propagating it to result["validation_errors"] is exactly the right two-line change. Good that three test paths are covered (service layer, auto-refresh, and route handler).
A few things worth tracking as follow-ups (none blocking):
- Missing warning log on validation errors: The exception path at L5816 already logs a
logger.warning, but the success path with non-emptyvalidation_errorsis silent. Operators have no log signal that tools were silently skipped without polling the API. Worth addingif validation_errors: logger.warning(...)after the assignment. - Test coverage: All three new tests use a single-item error list and mock
_initialize_gatewayentirely. A test with multiple errors and at least one that exercises_validate_tools()directly (e.g., tool name > 128 chars) would harden this against regressions in the validation logic itself. - Schema docstring (
schemas.py:3963):"List of validation errors encountered"doesn't document the string format or what triggers errors. Minor, but useful for API consumers. - Unrelated formatting changes at lines 3752, 4928, 5568–5596 in
gateway_service.pyadd diff noise — worth isolating in a separate chore commit next time.
Build s390x is stuck in QUEUED (infrastructure issue, not code) — may need a re-run to unblock merge. Otherwise this is ready.
msureshkumar88
left a comment
There was a problem hiding this comment.
E2E verification — real running gateway, real upstream MCP server
Ran this against two real, live mcp-context-forge instances (not mocks/unit tests) — one on this PR's commit (c669cff2b), one on its parent commit (8e9c049e9, i.e. pre-fix) — each talking to a real upstream MCP stdio server (bridged to SSE via mcpgateway.translate) that advertises one valid tool and one tool whose name is 129 characters (1 over the 128-char MCP-spec limit mcp-context-forge enforces).
Script: scripts/verify-pr-5317-e2e.sh (added for this verification, not committed)
Config used (exported, not the repo .env):
DATABASE_URL=sqlite:///<scratch>/e2e_pr5317.db
AUTH_REQUIRED=true, MCPGATEWAY_ADMIN_API_ENABLED=true, PLUGINS_ENABLED=false
SSRF_ALLOW_LOCALHOST=true, SSRF_ALLOW_PRIVATE_NETWORKS=true # needed for gateway → 127.0.0.1 upstream
JWT_SECRET_KEY=<test-only secret>
Commands run:
python -m mcpgateway.translate --stdio "uv run python mock_upstream_server.py" --expose-sse --port 44552
uv run uvicorn mcpgateway.main:app --host 127.0.0.1 --port 44551
python -m mcpgateway.utils.create_jwt_token --username admin@example.com --exp 10 --secret ...
curl -X POST /gateways # register gateway against the mock upstream
curl -X POST /gateways/{id}/tools/refresh # trigger the code path this PR changes
curl /tools?gateway_id={id} # sanity check the valid tool still landedResult on pre-fix commit 8e9c049e9 (reproduces #5290):
Server logs show validation genuinely ran and failed the tool —
ERROR - Validation failed for tool '<129-char name>': Tool name exceeds MCP spec limit of 128 characters (got 129)
WARNING - Tool validation completed with 1 error(s). Successfully validated 1 tool(s).
but the API response silently drops it:
"validationErrors": []Result on this PR's commit c669cff2b:
Same scenario, same server-side validation log lines, but now the API response correctly surfaces it:
"validationErrors": ["<129-char name>: Tool name exceeds MCP spec limit of 128 characters (got 129)"]valid_echo_tool is still registered in both cases (partial-success behavior unaffected).
Unit tests referenced in the PR description (all green):
pytest tests/unit/mcpgateway/services/test_gateway_service.py -k TestRefreshGatewayToolsResourcesPrompts -x → 9/9
pytest tests/unit/mcpgateway/services/test_gateway_auto_refresh.py -k TestAutoRefreshGatewayToolsResourcesPrompts -x → 9/9
pytest tests/unit/mcpgateway/test_main_extended.py -k test_refresh_gateway_tools -x → 2/2
This is a clean, minimal fix — confirmed end-to-end against a real gateway process and a real upstream MCP server, not just mocked unit tests. Approving.
🐛 Bug-fix PR
📌 Summary
Tool validation errors collected during gateway refresh were silently discarded — the
GatewayRefreshResponse.validation_errorsfield always returned an empty list, even when tools failed validation (e.g., name too long, schema too deeply nested).🔁 Reproduction Steps
Fixes #5290
POST /gateways/{id}/tools/refreshresponse.validation_errorsis always[]even when tools were skipped🐞 Root Cause
In
_refresh_gateway_tools_resources_prompts()atmcpgateway/services/gateway_service.py:5822, the 5th return value (validation_errors) from_initialize_gateway()was discarded with_:The
resultdict was initialized with"validation_errors": []at line 5740, but was never populated with the actual errors from_validate_tools().💡 Fix Description
Production fix (2 lines):
_→validation_errorsto capture the return valueresult["validation_errors"] = validation_errorsafter theexceptblock to populate the result on the success path📏 Reviewability
triage🧪 Verification
make ruffmake blackmake isortmake pre-commitpytest tests/unit/mcpgateway/services/test_gateway_service.py -k TestRefreshGatewayToolsResourcesPrompts -xpytest tests/unit/mcpgateway/services/test_gateway_auto_refresh.py -k TestAutoRefreshGatewayToolsResourcesPrompts -xpytest tests/unit/mcpgateway/test_main_extended.py -k test_refresh_gateway_tools -xTest Coverage
Three new tests validate the fix:
test_gateway_service.py::test_validation_errors_propagated— Mocks_initialize_gatewayto return non-empty validation errors and assertsresult["validation_errors"]contains themtest_gateway_auto_refresh.py::test_refresh_propagates_validation_errors— Same pattern viapatch.objectfor the auto-refresh pathtest_main_extended.py::test_refresh_gateway_tools_success_and_errors— Extended to assert both empty and non-emptyvalidation_errorsin the API response📐 MCP Compliance (if relevant)
✅ Checklist
make black isort pre-commit)