From c9dd1449c5dc566eb7b1a54def1aea36e83785fb Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?=E2=80=9CShauna=20Diaz=E2=80=9D?= Date: Tue, 14 Apr 2026 13:38:38 -0400 Subject: [PATCH] OSDOCS-18945-2: adds 2nd half troubleshooting MCP gateway --- .../proc-mcp-gateway-register-mcp-server.adoc | 5 + modules/proc-mcp-gateway-ts-authn-issues.adoc | 165 ++++++++++++++ modules/proc-mcp-gateway-ts-authz-issues.adoc | 128 +++++++++++ ...gateway-ts-ext-mcp-server-auth-issues.adoc | 69 ++++++ ...eway-ts-ext-mcp-server-connect-issues.adoc | 98 +++++++++ ...oc-mcp-gateway-ts-extension-not-ready.adoc | 68 ++++++ ...roc-mcp-gateway-ts-on-prem-mcp-server.adoc | 201 ++++++++++++++++++ ...-ext-mcp-server-mcpserverregistration.adoc | 5 + .../mcp-gateway-troubleshooting.adoc | 15 +- 9 files changed, 753 insertions(+), 1 deletion(-) create mode 100644 modules/proc-mcp-gateway-ts-authn-issues.adoc create mode 100644 modules/proc-mcp-gateway-ts-authz-issues.adoc create mode 100644 modules/proc-mcp-gateway-ts-ext-mcp-server-auth-issues.adoc create mode 100644 modules/proc-mcp-gateway-ts-ext-mcp-server-connect-issues.adoc create mode 100644 modules/proc-mcp-gateway-ts-extension-not-ready.adoc create mode 100644 modules/proc-mcp-gateway-ts-on-prem-mcp-server.adoc diff --git a/modules/proc-mcp-gateway-register-mcp-server.adoc b/modules/proc-mcp-gateway-register-mcp-server.adoc index 0823d94871da..128c709655d9 100644 --- a/modules/proc-mcp-gateway-register-mcp-server.adoc +++ b/modules/proc-mcp-gateway-register-mcp-server.adoc @@ -93,6 +93,11 @@ spec: * Replace the `spec.targetRef.namespace:` field value with the namespace where your `HTTPRoute` CR is applied. In this example, `__` is used. * Replace the `credentialRef.name:` field value with the name of your `Secret` CR. In this example, `__` is used. You can omit this parameter if your MCP server does not require authentication or authorization. 
* For more information about these parameters, see "Understanding the `MCPServerRegistration` custom resource." ++ +[IMPORTANT] +==== +A `toolPrefix` value cannot include spaces or special characters. +==== . Apply the CR by running the following command: + diff --git a/modules/proc-mcp-gateway-ts-authn-issues.adoc b/modules/proc-mcp-gateway-ts-authn-issues.adoc new file mode 100644 index 000000000000..312f5b82e572 --- /dev/null +++ b/modules/proc-mcp-gateway-ts-authn-issues.adoc @@ -0,0 +1,165 @@ +// Module included in the following assemblies: +// +// *mcp_gateway_config/mcp-gateway-troubleshooting.adoc + +:_mod-docs-content-type: PROCEDURE +[id="proc-mcp-gateway-ts-authn-issues_{context}"] += Troubleshooting {mcpg} authentication issues + +[role="_abstract"] +Authentication errors can happen in a variety of ways, including silent failures, broken sessions, and tool-access denials, depending on your backend MCP server setup. You can troubleshoot common problems by checking your connections and custom resource (CR) configurations. + +.Prerequisites + +* You installed {mcpg}. +* You installed {prodname}. +* You installed the {oc-first}. +* You configured a `Gateway` object. +* You configured an `HTTPRoute` object for the gateway. +* You registered an MCP server. +* You created a `Secret` CR for authentication. + +.Procedure + +. When your clients cannot discover OAuth configuration, discovery is not working. Use the following steps to troubleshoot this situation: + +.. Retrieve the JSON object that lists the security requirements for your backend MCP server by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ curl http://__/_<.well-known/oauth-protected-resource>_ +---- ++ +* Replace `__` with the name of the host for your MCP server. +* Replace `_<.well-known/oauth-protected-resource>_` with the reserved path and JSON file that describes the OAuth 2.0 security requirements for the MCP server. 
++ +.Example output +[source,text] +---- +{ + "resource": "https://mcp.example.com", + "authorization_servers": [ + "https://auth.provider.com/oauth2/default" + ], + "scopes_supported": ["mcp:tools", "mcp:resources"], + "bearer_methods_supported": ["header"], + "resource_documentation": "https://docs.example.com/mcp-help" +} +---- + +.. Check that your associated `HTTPRoute` CR includes a path for your `/.well-known/oauth-protected-resource` by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc get httproute __ -o jsonpath='{.spec.rules[*].matches[*].path.value}' +---- ++ +Replace `__` with the associated `HTTPRoute` CR. + +.. Check the specific `AuthPolicy` CR configuration by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc describe authpolicy __ -n __ +---- + +.. Check that you excluded all `/.well-known` paths from your `AuthPolicy` CR by trying to reach the endpoint without any credentials by using the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ curl -o /dev/null -s -w "%{http_code}\n" http://__/_<.well-known/oauth-protected-resource>_ +---- ++ +* Replace `__` with the name of the host for your MCP server. +* Replace `_<.well-known/oauth-protected-resource>_` with the reserved path and JSON file that describes the OAuth 2.0 security requirements for the MCP server. ++ +[NOTE] +==== +The following codes are examples of possible outputs: + +* `200`: Means that the exclusion exists and matches. +* `401`: Means that your `AuthPolicy` CR is still demanding a token for this path. The exclusion is either not present or not working. +* `404`: The exclusion might be present and working, but the `HTTPRoute` CR does not point to that path to a valid backend. +==== + +.. Optional. 
Check all MCP broker component environment variables by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc get deployment __ -n __ -o yaml | grep -A 10 env +---- +* Replace `__` with the name of your {mcpg} deployment. +* Replace `__` with the namespace where your {mcpg} deployment is applied. +//Q: do we actually make Deployment or DeploymentConfig objects for MCP gateway? If yes, when? + +.. Check that the `OAUTH_*` environment variables are set on your MCP broker component. ++ +[source,terminal,subs="+quotes"] +---- +$ oc set env deployment/__ --list +---- ++ +Replace `__` with the name of your {mcpg} deployment. + +.. Verify that the MCP broker component pod restarted after any environment variable changes. + +. If your valid tokens are being rejected with `401` errors, your JWT token validation is failing. Use the following steps to troubleshoot this situation: + +.. List the `AuthPolicy` CRs by running the following command: ++ +[source,terminal] +---- +$ oc get authpolicy -A +---- + +.. Check the specific `AuthPolicy` CR configuration by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc describe authpolicy __ -n __ +---- + +.. Verify that the `issuerUrl` in the `AuthPolicy` CR matches your identity provider's `realm`. + +.. Check the Authorino Operator logs by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc logs -n kuadrant-system -l authorino-resource=authorino +---- + +.. Decode JWT to verify claims by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ echo "__" | cut -d. -f2 | base64 -d | jq +---- ++ +Replace `__` with your token. + +.. Ensure that your issuer URL is reachable from the cluster by using the `cluster-local` service name. + +.. Check the token expiration time by examining the `exp` claim. + +.. Verify the audience, if required, by using an `aud` claim. + +.. 
Ensure that the token includes all required claims such as `groups`, `email`, and so on. + +. If your `401 Unauthorized` responses do not include OAuth discovery information, the `WWW-Authenticate` header is missing. This usually means that your `AuthPolicy` CR is not properly configured. Use the following steps to troubleshoot this situation: + +.. Isolate the failure point by using verbose output which lists the TLS handshake, the HTTP headers, and the server response code by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ curl -v http://__/mcp \ + -H "Content-Type: application/json" \ + -d '{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}' +---- ++ +Replace `__` with the hostname for your backend MCP server. + +.. Verify that your `AuthPolicy` CR `spec.response.unauthenticated.headers:` list includes `response.unauthenticated.headers.WWW-Authenticate`. + +.. Check that your `AuthPolicy` CR `response.unauthenticated.headers.WWW-Authenticate.value` includes the correct `metadata`. + +.. Ensure that your `AuthPolicy` CR is applied to the correct `Gateway` object and listener. diff --git a/modules/proc-mcp-gateway-ts-authz-issues.adoc b/modules/proc-mcp-gateway-ts-authz-issues.adoc new file mode 100644 index 000000000000..c4843c66c50b --- /dev/null +++ b/modules/proc-mcp-gateway-ts-authz-issues.adoc @@ -0,0 +1,128 @@ +// Module included in the following assemblies: +// +// *mcp_gateway_config/mcp-gateway-troubleshooting.adoc + +:_mod-docs-content-type: PROCEDURE +[id="proc-mcp-gateway-ts-authz-issues_{context}"] += Troubleshooting {mcpg} authorization issues + +[role="_abstract"] +Authorization errors can happen in a variety of ways, including an authenticated user getting `403` errors for all tool calls, authorization checks not enforced, or authorization failing with `CEL` evaluation errors. You can troubleshoot these problems by checking your configurations and troubleshooting `CEL`. + +.Prerequisites + +* You installed {mcpg}. 
+* You installed {prodname}. +* You installed the {oc-first}. +* You configured a `Gateway` object. +* You configured an `HTTPRoute` object for the gateway. +* You registered an MCP server. +* You created a `Secret` CR for authentication. + +.Procedure + +. Check your `AuthPolicy` custom resource (CR) authorization rules by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc get authpolicy __ -n __ -o yaml | grep -A 20 authorization +---- ++ +* Replace `__` with the name of your `AuthPolicy` CR. +* Replace `__ `with the namespace where your `AuthPolicy` CR is applied. + +. Check the Authorino Operator logs for CEL evaluation by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc logs -n kuadrant-system -l authorino-resource=authorino | grep -i authz +---- + +. Ensure that the Authorino Operator can communicate with your identity server. + +. Verify that your JWT token includes `resource_access[server-name].roles` claims. + +. When your authorization checks are not enforced, first check your `AuthPolicy` CR status by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc describe authpolicy __ -n __ +---- ++ +* Replace `__` with the name of your `AuthPolicy` CR. +* Replace `__ `with the namespace where your `AuthPolicy` CR is applied. + +. Next, verify your `AuthPolicy` CR targets the correct resource by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc get authpolicy __ -n __ -o yaml | grep -A 5 targetRef +---- ++ +* Replace `__` with the name of your `AuthPolicy` CR. +* Replace `__ `with the namespace where your `AuthPolicy` CR is applied. + +. 
Ensure that your `AuthPolicy` CR `targetRef` matches your `Gateway` object name and namespace by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ echo "--- AuthPolicy Targets ---" && \ +oc get authpolicy -n __ -o jsonpath='{range .items[*]}{.metadata.name}{"\t targets -> \t"}{.spec.targetRef.kind}{"/"}{.spec.targetRef.name}{"\n"}{end}' && \ +echo "--- Actual Gateways ---" && \ +oc get gateway -n __ -o custom-columns=NAME:.metadata.name +---- ++ +Replace `__` with the name of your {mcpg} deployment. + +. If your `AuthPolicy` CR and your `Gateway` object are in different namespaces, you must either move the `AuthPolicy` CR into the same namespace as the Gateway object, or target the `HTTPRoute` CR instead. + +. Check that your `AuthPolicy` CR `sectionName` matches your `Gateway` object's listener name. ++ +[source,terminal,subs="+quotes"] +---- +$ oc describe authpolicy __ -n __ +---- ++ +Replace `__` with the name of your `AuthPolicy` CR. +Replace `__` with the name of your {mcpg} deployment. + +. Examine the `Status` block for an entry about your listener. If the `sectionName` is wrong, the policy shows `"Accepted"`, but the policy does not affect the intended traffic path. + +. Check that Kuadrant Operator is working by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc get pods -n kuadrant-system +---- ++ +.Example output +[source,text] +---- +NAME READY STATUS RESTARTS AGE +authorino-78c5679f94-abc12 1/1 Running 0 5d +dns-operator-controller-manager-5d4789f6-x1y2z 1/1 Running 0 5d +kuadrant-operator-controller-manager-8495bc4d-98765 1/1 Running 0 5d +limitador-67f89bc5d4-z9w8v 1/1 Running 0 5d +---- ++ +* If the `authorino-*` pod shows `CrashLoopBackOff`, it either cannot reach your OIDC issuer or has an invalid configuration. 
+* If the `kuadrant-operator-controller-manager-*` pod is down, any changes you make to your `AuthPolicy` CR cannot be applied to the Gateway object because the controller pod reconciles your `AuthPolicy` CR. + +. Remedy pod issues as required. + +. Check the Authorino Operator logs for `CEL` errors by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc logs -n kuadrant-system -l authorino-resource=authorino | grep -i cel +---- + +. Verify the CEL syntax in your authorization `rules`. + +. Check that referenced fields exist, such as `auth.identity.groups`. + +. Ensure that the `metadata` `source` is accessible and returns the expected structure. + +. Test `CEL` expression syntax using online validators. + +. Add persistent logging to understand the `CEL` evaluation context. diff --git a/modules/proc-mcp-gateway-ts-ext-mcp-server-auth-issues.adoc b/modules/proc-mcp-gateway-ts-ext-mcp-server-auth-issues.adoc new file mode 100644 index 000000000000..e2dd4a579414 --- /dev/null +++ b/modules/proc-mcp-gateway-ts-ext-mcp-server-auth-issues.adoc @@ -0,0 +1,69 @@ +// Module included in the following assemblies: +// +// *mcp_gateway_config/mcp-gateway-troubleshooting.adoc + +:_mod-docs-content-type: PROCEDURE +[id="proc-mcp-gateway-ts-ext-mcp-server-auth-issues_{context}"] += Troubleshooting external MCP server authentication issues + +[role="_abstract"] +If your registered external MCP server fails authentication and returns `401` or `403` errors, troubleshoot by checking the credentials and the logs. + +.Prerequisites + +* You installed {mcpg}. +* You installed {prodname}. +* You installed the {oc-first}. +* You configured a `Gateway` object. +* You configured an `HTTPRoute` object for the gateway. +* You registered an external MCP server. +* You created a `Secret` CR for authentication. + +.Procedure + +. 
Check that a `Secret` custom resource (CR) exists and that it has the correct `label` by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc get secret __ -n __ --show-labels +---- ++ +* Replace `__` with the name of your `Secret` CR. +* Replace `__` with the namespace where your `Secret` CR is applied. + +. Verify the `Secret` CR contents by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc get secret __ -n __ -o yaml +---- ++ +* Replace `__` with the name of your `Secret` CR. +* Replace `__` with the namespace where your `Secret` CR is applied. + +. Check the `MCPServerRegistration` CR `credentialRef` parameter value by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc get mcpsr __ -n __ -o yaml | grep -A 3 credentialRef +---- ++ +* Replace `__` with the name of your `MCPServerRegistration` CR. +* Replace `__` with the namespace where your `MCPServerRegistration` CR is applied. + +. Ensure that the `Secret` CR has the label `mcp.kuadrant.io/secret: "true"`. + +. Verify that the `Secret` CR data key matches the `credentialRef.key` in the `MCPServerRegistration` CR. + +. Check the credential format. + +. Verify that the credential you are using has the necessary permissions for the external service. + +. Check the MCP gateway broker component logs for credential errors by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc logs -n __ deployment/mcp-gateway | grep -i auth +---- ++ +Replace `__` with the namespace where your MCP gateway deployment is applied. 
diff --git a/modules/proc-mcp-gateway-ts-ext-mcp-server-connect-issues.adoc b/modules/proc-mcp-gateway-ts-ext-mcp-server-connect-issues.adoc new file mode 100644 index 000000000000..b29f1c43cf9a --- /dev/null +++ b/modules/proc-mcp-gateway-ts-ext-mcp-server-connect-issues.adoc @@ -0,0 +1,98 @@ +// Module included in the following assemblies: +// +// *mcp_gateway_config/mcp-gateway-troubleshooting.adoc + +:_mod-docs-content-type: PROCEDURE +[id="proc-mcp-gateway-ts-ext-mcp-server-connect-issues_{context}"] += Troubleshooting external MCP server connection issues + +[role="_abstract"] +If your external MCP server cannot connect, or if tools are not appearing, you can troubleshoot by checking custom resources (CRs) and network settings. + +.Prerequisites + +* You installed {mcpg}. +* You installed {prodname}. +* You installed the {oc-first}. +* You configured a `Gateway` object. +* You configured an `HTTPRoute` object for the gateway. +* You registered an external MCP server. +* You created a `Secret` CR for authentication. + +.Procedure + +. If you are seeing errors such as `502 Bad Gateway` or `403 Forbidden`, check the `ServiceEntry` CR by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc get serviceentry -n __ +---- ++ +Replace `__` with the namespace where the `ServiceEntry` CR is applied. + +.. If the command returns an empty list, create a `ServiceEntry` CR to allow egress. + +. For detailed troubleshooting, such as checking the ports and the TLS setting, examine the `ServiceEntry` CR by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc describe serviceentry __ -n __ +---- ++ +* Replace `__` with the name of the `ServiceEntry` CR. +* Replace `__` with the namespace where the `ServiceEntry` CR is applied. + +. Verify that the `ServiceEntry` CR `hosts` values exactly match the external `hostname` values. + +. 
If you are seeing `TLS` errors, verify that a `DestinationRule` CR exists by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc get destinationrule -n __ +---- ++ +Replace `__` with the namespace where the `DestinationRule` CR is applied. + +.. If the previous command returns `No resources found in namespace`, then create a `DestinationRule` CR. + +.. Ensure that the `DestinationRule` CR `host` values exactly match the `ServiceEntry` CR `hosts` values. + +.. Check the `TLS` configuration in the `DestinationRule` CR. + +. For detailed troubleshooting of actual connection details, such as encryption, load balancing, and connection pooling, examine the `DestinationRule` CR by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc describe destinationrule __ -n __ +---- ++ +* Replace `__` with the name of the `DestinationRule` CR. +* Replace `__` with the namespace where the `DestinationRule` CR is applied. + +.. Ensure that your network egress policies allow external traffic from the pod. + +. Test the DNS resolution by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc run -it --rm debug --image=__ --restart=Never -- \ + nslookup __ +---- ++ +* Replace `__` with the path to your application image. +* Replace `__` with your external MCP server `hostname` parameter value. + +. Test the external connectivity by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc run -it --rm debug --image=__ --restart=Never -- \ + curl -v https://__ +---- ++ +* Replace `__` with the path to your application image. +* Replace `__` with your external MCP server URL. + +. Verify that the `HTTPRoute` CR `backendRef` value is configured with the correct external hostname. + +. Ensure that the `HTTPRoute` CR has the `URLRewrite` filter applied to rewrite to the external hostname. 
diff --git a/modules/proc-mcp-gateway-ts-extension-not-ready.adoc b/modules/proc-mcp-gateway-ts-extension-not-ready.adoc new file mode 100644 index 000000000000..5bf24e3c5eaa --- /dev/null +++ b/modules/proc-mcp-gateway-ts-extension-not-ready.adoc @@ -0,0 +1,68 @@ +// Module included in the following assemblies: +// +// *mcp_gateway_config/mcp-gateway-troubleshooting.adoc + +:_mod-docs-content-type: PROCEDURE +[id="proc-mcp-gateway-ts-extension-not-ready_{context}"] += Troubleshooting an MCPGatewayExtension status of not ready + +[role="_abstract"] +You can troubleshoot when your `MCPGatewayExtension` custom resource (CR) shows a `Ready: False` state by running a few commands. + +Common causes include the following errors and indicate an associated action: + +* `InvalidMCPGatewayExtension`: This often means that the `targetRef` points to a Gateway object that does not exist, or you have a typing error in the `kind` or `group`. +* `ReferenceGrantRequired`: This occurs if your extension is in one namespace but is trying to target a `Gateway` object in another. To fix this, you must apply a `ReferenceGrant` in the `Gateway` object namespace. +* `Conflict`: Only one `MCPGatewayExtension` can target a specific `Gateway` object. If another extension is already pointing to the `Gateway` object you configured with a new extension, the new one fails. + +.Prerequisites + +* You installed {mcpg}. +* You installed {prodname}. +* You configured a `Gateway` object. +* You configured an `HTTPRoute` object for the gateway. +* You installed the {oc-first}. +* You registered an MCP server. + +.Procedure + +. Check the status of the specific `MCPGatewayExtension` CR by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc get mcpgatewayextension -n __ +---- ++ +Replace `__` with the namespace where your `MCPGatewayExtension` CR is applied. + +. 
Check for conflicting `MCPGatewayExtension` CRs by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc get mcpgatewayextension -A +---- ++ +There must be only one extension per namespace. + +. Verify that the target `Gateway` object exists by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc get gateway -n __ +---- ++ +Replace `__` with the namespace where your `Gateway` object is applied. + +. Check the `MCPServerRegistration` CR namespace and route status by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc describe mcpsr __ -n __ +---- ++ +* Replace `__` with the name of your `MCPServerRegistration` CR. +* Replace `__` with the namespace where your `MCPServerRegistration` CR is applied. + +. Verify that an `MCPGatewayExtension` CR exists in the same namespace as the `MCPServerRegistration` CR. If the two do not share the same namespace, you must create a `ReferenceGrant` CR. + +. Ensure that the `MCPGatewayExtension` CR targets the `Gateway` object that the `HTTPRoute` CR is attached to. diff --git a/modules/proc-mcp-gateway-ts-on-prem-mcp-server.adoc b/modules/proc-mcp-gateway-ts-on-prem-mcp-server.adoc new file mode 100644 index 000000000000..8131f4e5fcae --- /dev/null +++ b/modules/proc-mcp-gateway-ts-on-prem-mcp-server.adoc @@ -0,0 +1,201 @@ +// Module included in the following assemblies: +// +// *mcp_gateway_config/mcp-gateway-troubleshooting.adoc + +:_mod-docs-content-type: PROCEDURE +[id="proc-mcp-gateway-ts-on-prem-mcp-server_{context}"] += Troubleshooting on-premise MCP server registration issues + +[role="_abstract"] +When your on-premise MCP server is not discovered by your {mcpg} after you registered it, or if you are having trouble with your tools, you can troubleshoot by checking for common problems. 
Basic steps include making sure that your backend server is available, that routing is applied correctly, and that tool prefix headers are labeled correctly. + +.Prerequisites + +* You installed {mcpg}. +* You installed {prodname}. +* You configured a `Gateway` object. +* You configured an `HTTPRoute` object for the gateway. +* You installed the {oc-first}. +* You registered an MCP server. + +.Procedure + +. If tools from your on-premise MCP server do not appear in `tools/list`, check that the MCP server is properly discovered by {mcpg} components by running the following command: + +** List the `MCPServerRegistration` CRs by running the following command: ++ +[source,terminal] +---- +$ oc get mcpsr -A +---- + +** Get the detailed status and configuration of your `MCPServerRegistration` CR by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc describe mcpserverregistration __ -n __ +---- ++ +* Replace `__` with the `name` of your MCP server. +* Replace `__` with the namespace where your `MCPServerRegistration` CR is applied. + +. Verify that the `MCPServerRegistration` CR `targetRef` points to correct `HTTPRoute` name and namespace by running the following commands: + +.. Get the `name` and `namespace` of the `HTTPRoute` CR that the `MCPServerRegistration` CR is attempting to attach to by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc get mcpserverregistration __ -n __ -o jsonpath='{.spec.targetRef.name}{"\n"}{.spec.targetRef.namespace}{"\n"}' +---- ++ +* Replace `__` with the `name` of your MCP server. +* Replace `__` with the namespace where your `MCPServerRegistration` CR is applied. + +.. 
Cross-reference the `HTTPRoute` details from the previous command with your `HTTPRoute` CRs by verifying that a route exists with those exact details. Run the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc get httproute __ -n __ +---- ++ +* Replace `__` with the `name` value from the previous step's output. +* Replace `__` with the `namespace` value from the previous step's output. + +.. If you get a `NotFound` server `Error` from the previous step, correct your `MCPServerRegistration` CR. + +. Verify that the MCP gateway controller component successfully bound the extension and route together by checking the status with the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc get mcpserverregistration __ -n __ -o jsonpath='{.status.conditions[?(@.type=="Accepted")]}' +---- +* Replace `__` with the `name` of your MCP server. +* Replace `__` with the namespace where your `MCPServerRegistration` CR is applied. + +.. If you get a `status: False` output, the MCP gateway controller component found the route but could not use it. Check that you created a `ReferenceGrant` CR if the `MCPGatewayExtension` and `HTTPRoute` CRs are in different namespaces. + +. Check that the backend MCP server is running by entering the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc get pods -n __ +---- + +. If your MCP server pods are crashing, check for the following conditions: + +** `CrashLoopBackOff`: This generally means that your MCP server is starting but then crashing. This problem can be caused by a missing environment variable or by permission issues. +** `Pending`: This might mean that the pod has not started because of a resource issue in the cluster. +** `ErrImagePull`: Either the cluster cannot reach the registry, or you do not have permission to pull the image. Check your {ocp} settings and network configuration. 
+** If your pod has a `Running` status with a high restart count, your MCP server might be unstable because of memory constraints or connectivity issues. Check your {ocp} settings and network configuration. + +. Verify that the targeted application of your `MCPServerRegistration` CR actually exists and has a valid entry point by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc get svc -n __ __ +---- ++ +* Replace `__` with the namespace where your application runs. +* Replace `__` with the name of your application. + +. Check that your application has a standard `ClusterIP` assigned to handle load balancing across pods. + +. Compare the ports to your `MCPServerRegistration` CR to make sure that they match and that either the `TCP` or `SCTP` protocol is used. + +. Check that the selectors on the `Service` CR for your application match the labels on your pods by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc get endpoints __ -n __ +---- ++ +* Replace `__` with the name of your application. +* Replace `__` with the namespace where your application runs. + +. Check that the attached `HTTPRoute` CR has valid backend reference by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc describe httproute __ +---- + +. If your `MCPServerRegistration` CR is active, but tools are missing, test the backend server directly by running the following commands: ++ +.. Start a debug session based on your existing MCP server deployment by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc debug deployment/__ -it +---- + +.. Test whether your backend MCP server is functional by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +sh-4.4# curl -X POST http://localhost:/mcp \ + -H "Content-Type: application/json" \ + -d '{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}' +---- ++ +If the command returns a list of tools, your backend is healthy. 
+ +. If your backend is healthy but tools are missing in the `Gateway` CR, your `MCPServerRegistration` CR is not correctly mapping the service. Check the logs to see why the backend rejected the request by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc logs -l app=__ +---- ++ +Replace `__` with the application defined in the `metadata.labels.app:` section of your MCP server's `Pod` or `Deployment` CR. + +. If you created an `MCPServerRegistration` CR, but your tools are not appearing, check the MCP broker component's router logs for errors by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc logs -n mcp-system -l app.kubernetes.io/name=__ +---- ++ +Replace `__` with the name of your MCP gateway deployment. + +.. If you see errors, check for typos in the `targetRef` parameter value and ensure that you are pointing to an existing `Service` CR. + +. Verify that the backend MCP server is implementing the `tools/list` method correctly. Using the MCP Inspector is the easiest way to check. + +. Check your backend MCP server logs for errors. + +. Ensure that your backend MCP server is returning valid MCP protocol responses. Using the MCP Conformance Test Framework is the easiest way to catch protocol errors. + +. Verify that the `toolPrefix` entry in the `MCPServerRegistration` CR is valid, meaning that there are no spaces or special characters. + +. If tools appear without the configured prefix, check the `MCPServerRegistration` CR by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc get mcpsr __ -n __ -o yaml | grep toolPrefix +---- ++ +* Replace `__` with the name of your MCP server. +* Replace `__` with the namespace where your `MCPServerRegistration` CR is applied. + +.. Ensure that a `toolPrefix` is set correctly in `MCPServerRegistration` CR. + +. 
Check the MCP gateway controller component logs for problems with your `MCPServerRegistration` CR by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc logs -n __ deployment/mcp-gateway-controller | grep prefix +---- ++ +Replace `__` with the name of your MCP gateway deployment. + +.. If you see `failed to generate prefix`, then there is an error in your `MCPServerRegistration` CR `metadata` or there is a conflict. + +.. If there is no output at all, then either the controller is not detecting your `MCPServerRegistration` CR, or the request is failing before it gets to the routing stage. + +. Ensure that you restart the MCP gateway broker component after `MCPServerRegistration` CR changes by running the following command: ++ +[source,terminal,subs="+quotes"] +---- +$ oc rollout restart deployment/mcp-gateway -n mcp-system +---- + diff --git a/modules/proc-register-ext-mcp-server-mcpserverregistration.adoc b/modules/proc-register-ext-mcp-server-mcpserverregistration.adoc index 3009228525f0..1d8269293e4b 100644 --- a/modules/proc-register-ext-mcp-server-mcpserverregistration.adoc +++ b/modules/proc-register-ext-mcp-server-mcpserverregistration.adoc @@ -48,6 +48,11 @@ spec: * Replace the `spec.targetRef.name:` field value with the name of the `HTTPRoute` CR you applied. * Replace the value of `spec.targetRef.namespace:` with the namespace where your `HTTPRoute` CR is applied. * The `spec.credentialRef:` field points to the `Secret` CR that has credentials for the external MCP server. ++ +[IMPORTANT] +==== +A `toolPrefix` value cannot include spaces or special characters. +==== . 
Apply the CR by running the following command: + diff --git a/observe_troubleshoot/mcp-gateway-troubleshooting.adoc b/observe_troubleshoot/mcp-gateway-troubleshooting.adoc index 896a481b134c..d3d4ea991228 100644 --- a/observe_troubleshoot/mcp-gateway-troubleshooting.adoc +++ b/observe_troubleshoot/mcp-gateway-troubleshooting.adoc @@ -5,4 +5,17 @@ include::_attributes/attributes.adoc[] :context: mcp-gateway-troubleshooting [role="_abstract"] -FPO assembly +You can troubleshoot common issues that occur when you install, configure, and operate the MCP gateway. + + +include::modules/proc-mcp-gateway-ts-extension-not-ready.adoc[leveloffset=+1] + +include::modules/proc-mcp-gateway-ts-on-prem-mcp-server.adoc[leveloffset=+1] + +include::modules/proc-mcp-gateway-ts-ext-mcp-server-connect-issues.adoc[leveloffset=+1] + +include::modules/proc-mcp-gateway-ts-ext-mcp-server-auth-issues.adoc[leveloffset=+1] + +include::modules/proc-mcp-gateway-ts-authn-issues.adoc[leveloffset=+1] + +include::modules/proc-mcp-gateway-ts-authz-issues.adoc[leveloffset=+1]