fixed minor nits, added troubleshooting section and additional links

gaurav-nelson · dminnear-rh · commit a57bdb5a9460 · 2026-06-04T09:58:36.000-04:00
diff --git a/content/patterns/maas-quickstart/_index.adoc b/content/patterns/maas-quickstart/_index.adoc
@@ -33,4 +33,7 @@ include::modules/maas-quickstart-architecture.adoc[leveloffset=+1]
 [id="next-steps-maas-quickstart"]
 == Next steps
 
-* link:getting-started[Install this pattern.]
+* link:getting-started[Install this pattern]
+* link:cluster-sizing[Cluster sizing]
+* link:customizing-this-pattern[Customizing this pattern]
+* link:troubleshooting[Troubleshooting]
diff --git a/content/patterns/maas-quickstart/cluster-sizing.adoc b/content/patterns/maas-quickstart/cluster-sizing.adoc
@@ -20,7 +20,7 @@ In addition to the worker nodes listed above, this pattern requires at least 2 G
 .GPU node minimum requirements
 [cols="<,^,<,<"]
 |===
-| Cloud Provider | Node Type | Number of nodes | Instance Type
+| Cloud provider | Node type | Number of nodes | Instance type
 
 | Amazon Web Services
 | GPU Worker
diff --git a/content/patterns/maas-quickstart/customizing-this-pattern.adoc b/content/patterns/maas-quickstart/customizing-this-pattern.adoc
@@ -22,7 +22,7 @@ The pattern serves two models by default:
 * `nemotron-3-nano-30b-a3b-fp8` -- Available to premium and enterprise tier users.
 * `gpt-oss-20b` -- Available to all user tiers.
 
-To change or add models, edit the `models` list in `overrides/maas-quickstart.yaml`. Models are pulled from OCI registries and do not require a HuggingFace API token.
+To change or add models, edit the `models` list in `overrides/maas-quickstart.yaml`. The pattern pulls models from OCI registries and does not require a HuggingFace API token.
 
 The model definitions specify the model URI, resource requirements, GPU tolerations, and vLLM arguments. For example:
 
@@ -61,7 +61,7 @@ The pattern uses Kuadrant (Red Hat Connectivity Link) to enforce per-tier rate l
 
 [cols="1,1,2",options="header"]
 |===
-| Tier | Rate Limit | Description
+| Tier | Rate limit | Description
 
 | Free
 | 5 requests per 2 minutes
@@ -76,7 +76,7 @@ The pattern uses Kuadrant (Red Hat Connectivity Link) to enforce per-tier rate l
 | High-throughput workloads
 |===
 
-To adjust rate limits, modify the `tiers` section in `overrides/maas-quickstart.yaml`. For example, to increase the premium tier request limit to 40 and the token limit to 20000:
+To adjust rate limits, modify the `tiers` section in `overrides/maas-quickstart.yaml`. The following example increases the premium tier request limit to 40 and the token limit to 20000:
 
 [source,yaml]
 ----
@@ -97,14 +97,14 @@ Push your changes to your forked repository so the GitOps framework applies the
 [id="managing-users-maas"]
 === Managing users
 
-User authentication is handled by htpasswd with OpenShift OAuth. The default users are:
+The pattern uses htpasswd with OpenShift OAuth for user authentication. The default users are:
 
 * `admin` -- Full administrative access (enterprise tier)
 * `free-user` -- Free tier access
 * `premium-user` -- Premium tier access
 * `enterprise-user` -- Enterprise tier access
 
-User passwords are stored in the `values-secret.yaml` file and managed through HashiCorp Vault and the External Secrets Operator (ESO). To change a user password after initial deployment, update the secret value in your `values-secret.yaml` file and redeploy the pattern.
+The `values-secret.yaml` file stores user passwords, and {hashicorp-vault} and the {eso-op} manage them. To change a user password after initial deployment, update the secret value in your `values-secret.yaml` file and redeploy the pattern.
 
 To assign users to different tiers, modify the `tiers` section in `overrides/maas-quickstart.yaml`:
 
@@ -136,8 +136,8 @@ To customize the DevSpaces configuration, you can adjust:
 * The inference endpoint URL used by the Continue extension
 
 [id="gpu-node-provisioning-maas"]
-=== GPU node provisioning
+=== Provisioning GPU nodes
 
 This pattern requires at least 2 NVIDIA GPU nodes with 48 GB or more of VRAM each. On AWS, the pattern automatically provisions `g6e.2xlarge` GPU machine sets with NVIDIA L40S GPUs.
 
-If your cluster does not have GPU nodes, you must add them before deploying the pattern. The pattern installs all required operators, including the NVIDIA GPU Operator, automatically during deployment.
+If your cluster does not have GPU nodes, you must add them before you deploy the pattern. The pattern installs all required operators, including the NVIDIA GPU Operator, automatically during deployment.
diff --git a/content/patterns/maas-quickstart/getting-started.adoc b/content/patterns/maas-quickstart/getting-started.adoc
@@ -15,8 +15,8 @@ include::modules/comm-attributes.adoc[]
 .Prerequisites
 
 * An OpenShift cluster (version 4.20 or later). This pattern requires at least 2 NVIDIA GPU nodes with 48 GB or more of VRAM each.
- ** *AWS*: The pattern automatically provisions 2 `g6e.2xlarge` GPU worker nodes (NVIDIA L40S) during installation. No GPU nodes need to be present before deploying.
- ** *Other providers and bare metal*: GPU nodes must already be part of the OpenShift cluster before deploying this pattern. The pattern installs all required operators automatically.
+ ** *AWS*: The pattern automatically provisions 2 `g6e.2xlarge` GPU worker nodes (NVIDIA L40S) during installation. No GPU nodes need to be present before you deploy.
+ ** *Other providers and bare metal*: GPU nodes must already be part of the OpenShift cluster before you deploy this pattern. The pattern installs all required operators automatically.
  ** To create an OpenShift cluster, go to the https://console.redhat.com/[Red Hat Hybrid Cloud console].
  ** Select *OpenShift \-> Red Hat OpenShift Container Platform \-> Create cluster*.
 * The Helm binary. For instructions, see link:https://helm.sh/docs/intro/install/[Installing Helm].
@@ -71,7 +71,7 @@ upstream	git@github.com:validatedpatterns-sandbox/ai-quickstart-maas-code-assist
 +
 [WARNING]
 ====
-Do not add, commit, or push this file to your repository. Doing so may expose personal credentials to GitHub.
+Do not add, commit, or push this file to your repository. Doing so might expose personal credentials to GitHub.
 ====
 +
 Run the following command:
@@ -184,3 +184,10 @@ $ oc get inferenceservice -A
 ----
 
 . Access the OpenShift DevSpaces dashboard to confirm the IDE environment is available. Navigate to *Networking -> Routes* in the DevSpaces namespace and open the route URL.
+
+[id="next-steps-getting-started-maas"]
+== Next steps
+
+* link:customizing-this-pattern[Customizing this pattern]
+* link:cluster-sizing[Cluster sizing]
+* link:troubleshooting[Troubleshooting]
diff --git a/content/patterns/maas-quickstart/troubleshooting.adoc b/content/patterns/maas-quickstart/troubleshooting.adoc
@@ -0,0 +1,264 @@
+---
+title: Troubleshooting
+weight: 40
+aliases: /maas-quickstart/troubleshooting/
+---
+
+:toc:
+:imagesdir: /images
+:_content-type: ASSEMBLY
+include::modules/comm-attributes.adoc[]
+
+[id="troubleshooting-maas-quickstart"]
+== Troubleshooting the MaaS Code Assistant AI Quickstart pattern
+
+Use this page to diagnose and resolve common issues when deploying or operating this pattern.
+
+[id="troubleshooting-prereqs-maas"]
+== Prerequisite and tooling issues
+
+[id="troubleshooting-podman-version"]
+=== Podman version not supported
+
+The `pattern.sh` script requires Podman 4.3.0 or later. Earlier versions do not support the `--userns=keep-id` flag required for correct UID/GID mapping inside the container.
+
+.Symptom
+
+The script exits with an error referencing the Podman version or `keep-id`.
+
+.Resolution
+
+. Check your Podman version:
++
+[source,terminal]
+----
+$ podman --version
+----
+
+. If the version is earlier than 4.3.0, upgrade Podman. For instructions, see the link:https://podman.io/docs/installation[Podman installation documentation].
+
+[id="troubleshooting-kubeconfig"]
+=== KUBECONFIG path is outside the HOME directory
+
+The `pattern.sh` script runs inside a container and mounts your `$HOME` directory. If your `KUBECONFIG` file is located outside `$HOME`, the container cannot access it.
+
+.Symptom
+
+The script fails to connect to the cluster or reports that the kubeconfig file cannot be found.
+
+.Resolution
+
+Copy your kubeconfig file to a path inside your home directory and export the updated path:
+
+[source,terminal]
+----
+$ cp <current-kubeconfig-path> ~/kubeconfig
+$ export KUBECONFIG=~/kubeconfig
+----
+
+[id="troubleshooting-deployment-maas"]
+== Deployment issues
+
+[id="troubleshooting-argocd-sync"]
+=== ArgoCD applications are not syncing or are unhealthy
+
+After running `./pattern.sh make install`, ArgoCD applications can take 15–30 minutes to reach a healthy state. Model downloads and GPU operator initialization take additional time.
+
+.Symptom
+
+Running `./pattern.sh make argo-healthcheck` reports applications in `Progressing` or `Degraded` state.
+
+.Resolution
+
+. Check which applications are not healthy:
++
+[source,terminal]
+----
+$ oc get applications -n openshift-gitops
+----
+
+. Inspect the failing application for error details:
++
+[source,terminal]
+----
+$ oc describe application <application-name> -n openshift-gitops
+----
+
+. Check the logs of the ArgoCD application controller:
++
+[source,terminal]
+----
+$ oc logs -n openshift-gitops deployment/openshift-gitops-application-controller
+----
+
+. If applications are stuck in `Progressing`, wait an additional 10 minutes and re-run the health check. Model downloads from OCI registries can take significant time depending on network conditions.
+
+[id="troubleshooting-schema-validation"]
+=== Values file schema validation fails
+
+The pattern validates `values-*.yaml` files against a schema before deployment.
+
+.Symptom
+
+Running `./pattern.sh make install` fails with a schema validation error.
+
+.Resolution
+
+. Run the validation step independently to see the full error output:
++
+[source,terminal]
+----
+$ ./pattern.sh make validate-schema
+----
+
+. Review the error message to identify the malformed field and correct the value in your `values-secret.yaml` or `overrides/maas-quickstart.yaml` file.
+
+[id="troubleshooting-gpu-maas"]
+== GPU and inference issues
+
+[id="troubleshooting-gpu-nodes"]
+=== GPU nodes are not ready
+
+The NVIDIA GPU Operator must successfully initialize on each GPU node before model serving can start.
+
+.Symptom
+
+Inference service pods remain in `Pending` state, or `oc get inferenceservice -A` shows services that are not ready.
+
+.Resolution
+
+. Check the status of GPU nodes:
++
+[source,terminal]
+----
+$ oc get nodes -l nvidia.com/gpu.present=true
+----
+
+. Check the NVIDIA GPU Operator pods:
++
+[source,terminal]
+----
+$ oc get pods -n nvidia-gpu-operator
+----
+
+. Check for driver initialization errors:
++
+[source,terminal]
+----
+$ oc logs -n nvidia-gpu-operator -l app=nvidia-driver-daemonset
+----
+
+. If you are using a provider other than AWS, confirm that GPU nodes were present in the cluster before you deployed the pattern. The pattern does not provision GPU nodes on providers other than AWS.
+
+[id="troubleshooting-inference-endpoints"]
+=== Inference endpoints are not serving
+
+.Symptom
+
+`oc get inferenceservice -A` shows inference services in a non-ready state, or the Continue AI extension in DevSpaces returns connection errors.
+
+.Resolution
+
+. Check the status of inference services:
++
+[source,terminal]
+----
+$ oc get inferenceservice -A
+----
+
+. Check the vLLM model server pod logs for a specific model:
++
+[source,terminal]
+----
+$ oc logs -n redhat-ods-applications -l serving.kserve.io/inferenceservice=<model-name>
+----
+
+. Confirm that the GPU nodes have sufficient available VRAM. Each model requires a GPU with at least 48 GB of VRAM. To run both models on the same node, that node requires at least 96 GB of VRAM. Otherwise, use two separate GPU nodes.
+
+[id="troubleshooting-rate-limiting-maas"]
+== Rate limiting and authentication issues
+
+[id="troubleshooting-rate-limits"]
+=== Rate limiting is not enforced
+
+.Symptom
+
+Requests from all users succeed regardless of the configured rate limits, or requests are blocked for all users.
+
+.Resolution
+
+. Check the status of the Kuadrant operator and Limitador pod:
++
+[source,terminal]
+----
+$ oc get pods -n kuadrant-system
+----
+
+. Check the Limitador logs for policy errors:
++
+[source,terminal]
+----
+$ oc logs -n kuadrant-system deployment/limitador
+----
+
+. Confirm that rate limit policies are applied correctly:
++
+[source,terminal]
+----
+$ oc get ratelimitpolicy -A
+----
+
+[id="troubleshooting-auth-maas"]
+=== Users cannot authenticate
+
+.Symptom
+
+Users receive authentication errors when accessing the inference API or DevSpaces.
+
+.Resolution
+
+. Confirm that the External Secrets Operator correctly provisioned the htpasswd secret:
++
+[source,terminal]
+----
+$ oc get externalsecret -A
+$ oc get secret htpasswd-secret -n openshift-config
+----
+
+. If the secret is missing or incorrect, verify that your `values-secret.yaml` file contains the correct passwords for all four users (`admin`, `free-user`, `premium-user`, `enterprise-user`) and redeploy the pattern.
+
+[id="troubleshooting-devspaces-maas"]
+== OpenShift DevSpaces issues
+
+[id="troubleshooting-devspaces-connection"]
+=== Continue AI extension cannot connect to inference endpoints
+
+.Symptom
+
+Code suggestions are not returned in DevSpaces, or the Continue extension reports a connection error.
+
+.Resolution
+
+. Confirm that the inference services are healthy:
++
+[source,terminal]
+----
+$ oc get inferenceservice -A
+----
+
+. Navigate to *Networking -> Routes* in the namespace where the inference services are running and confirm the routes are accessible.
+
+. In DevSpaces, open the Continue extension settings and verify that the endpoint URL matches the route URL for the vLLM service.
+
+[id="troubleshooting-get-help-maas"]
+== Getting help
+
+If you cannot resolve an issue using this guide:
+
+* Check the link:https://github.com/validatedpatterns-sandbox/ai-quickstart-maas-code-assistant/issues[GitHub issues] for known problems and workarounds.
+* Open a new issue with the output of the following command to help diagnose the problem:
++
+[source,terminal]
+----
+$ oc get pods -A | grep -v Running | grep -v Completed
+----
diff --git a/modules/maas-quickstart-about.adoc b/modules/maas-quickstart-about.adoc
@@ -12,17 +12,12 @@ Use case::
 * Deploy an AI-powered code assistant that provides intelligent code suggestions through an integrated development environment.
 * Implement Model-as-a-Service (MaaS) governance with tiered user access, rate limiting, and chargeback capabilities.
 * Use a GitOps approach to provision AI inference infrastructure including GPU-accelerated model serving, identity management, and API rate limiting.
-+
-[NOTE]
-====
-Based on the requirements of a specific implementation, certain details might differ. However, all Validated Patterns that are based on a portfolio architecture, generalize one or more successful deployments of a use case.
-====
 
 Background::
 
-This pattern is scaffolding around the link:https://github.com/rh-ai-quickstart/maas-code-assistant[MaaS Code Assistant AI Quickstart]. It provisions the OpenShift cluster with link:https://www.redhat.com/en/products/ai/openshift-ai[{rhoai}] configured for GPU-accelerated inference using vLLM and llm-d. It deploys the NVIDIA GPU Operator for model serving on GPU nodes and manages secrets through the {solution-name-upstream} framework using HashiCorp Vault and the External Secrets Operator.
+This pattern builds on the link:https://github.com/rh-ai-quickstart/maas-code-assistant[MaaS Code Assistant AI Quickstart]. It provisions the OpenShift cluster with link:https://www.redhat.com/en/products/ai/openshift-ai[{rhoai}] configured for GPU-accelerated inference using vLLM and llm-d. It deploys the NVIDIA GPU Operator for model serving on GPU nodes and manages secrets through the {solution-name-upstream} framework using HashiCorp Vault and the External Secrets Operator. This pattern generalizes one or more successful deployments of this use case. Implementation details might vary depending on your specific environment and requirements.
 
-The MaaS Code Assistant enables organizations to offer AI code assistance as an internal service with differentiated access tiers. It demonstrates a production-ready approach to:
+Organizations can use the MaaS Code Assistant to offer AI code assistance as an internal service with differentiated access tiers. It demonstrates a production-ready approach to:
 
 - Serving multiple NVIDIA Nemotron language models optimized for code completion and generation
 - Enforcing per-user rate limits through Kuadrant (Red Hat Connectivity Link) to manage capacity and enable chargeback
@@ -40,7 +35,7 @@ The solution uses vLLM with llm-d for high-performance inference of NVIDIA Nemot
 [id="about-maas-quickstart-technology"]
 == About the technology
 
-The following technologies are used in this solution:
+This solution uses the following technologies:
 
 https://www.redhat.com/en/technologies/cloud-computing/openshift/try-it[{rh-ocp}]::
 An enterprise-ready Kubernetes container platform built for an open hybrid cloud strategy. It provides a consistent application platform to manage hybrid cloud, public cloud, and edge deployments.
diff --git a/modules/maas-quickstart-architecture.adoc b/modules/maas-quickstart-architecture.adoc