Commit 3dabcf9

committed
add docs for maas code assistant quickstart pattern
1 parent f4dc7b4 commit 3dabcf9

9 files changed

Lines changed: 634 additions & 0 deletions

Lines changed: 36 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,36 @@
1+
---
2+
title: MaaS Code Assistant AI Quickstart
3+
date: 2026-06-03
4+
tier: sandbox
5+
summary: This pattern deploys a multi-tenant AI code assistant with NVIDIA Nemotron models, tiered rate limiting, and IDE integration on OpenShift.
6+
rh_products:
7+
- Red Hat OpenShift Container Platform
8+
- Red Hat OpenShift AI
9+
- Red Hat OpenShift DevSpaces
10+
- Red Hat Connectivity Link
11+
industries:
12+
- General
13+
focus_areas:
14+
- AI
15+
- Code
16+
- AI Quickstart
17+
aliases: /maas-quickstart/
18+
links:
19+
github: https://github.com/validatedpatterns-sandbox/ai-quickstart-maas-code-assistant
20+
install: getting-started
21+
bugs: https://github.com/validatedpatterns-sandbox/ai-quickstart-maas-code-assistant/issues
22+
feedback: https://docs.google.com/forms/d/e/1FAIpQLScI76b6tD1WyPu2-d_9CCVDr3Fu5jYERthqLKJDUGwqBg7Vcg/viewform
23+
---
24+
:toc:
25+
:imagesdir: /images
26+
:_content-type: ASSEMBLY
27+
include::modules/comm-attributes.adoc[]
28+
29+
include::modules/maas-quickstart-about.adoc[leveloffset=+1]
30+
31+
include::modules/maas-quickstart-architecture.adoc[leveloffset=+1]
32+
33+
[id="next-steps-maas-quickstart"]
34+
== Next steps
35+
36+
* link:getting-started[Install this pattern.]
Lines changed: 29 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,29 @@
1+
---
2+
title: Cluster sizing
3+
weight: 30
4+
aliases: /maas-quickstart/cluster-sizing/
5+
---
6+
7+
:toc:
8+
:imagesdir: /images
9+
:_content-type: ASSEMBLY
10+
include::modules/comm-attributes.adoc[]
11+
include::modules/ai-quickstart-maas-code-assistant/metadata-ai-quickstart-maas-code-assistant.adoc[]
12+
13+
include::modules/cluster-sizing-template.adoc[]
14+
15+
[id="maas-quickstart-gpu-node-requirements"]
16+
== GPU node requirements
17+
18+
In addition to the worker nodes listed above, this pattern requires at least 2 GPU-equipped nodes for model inference. On AWS, the pattern automatically provisions `g6e.2xlarge` instances with NVIDIA L40S GPUs. On other providers and bare metal, GPU nodes must already be part of the cluster before deploying the pattern.
19+
20+
.GPU node minimum requirements
21+
[cols="<,^,<,<"]
22+
|===
23+
| Cloud Provider | Node Type | Number of nodes | Instance Type
24+
25+
| Amazon Web Services
26+
| GPU Worker
27+
| 2
28+
| g6e.2xlarge
29+
|===
Lines changed: 143 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,143 @@
1+
---
2+
title: Customizing this pattern
3+
weight: 20
4+
aliases: /maas-quickstart/customizing/
5+
---
6+
7+
:toc:
8+
:imagesdir: /images
9+
:_content-type: ASSEMBLY
10+
include::modules/comm-attributes.adoc[]
11+
12+
[id="customizing-maas-quickstart"]
13+
== Customizing the MaaS Code Assistant AI Quickstart pattern
14+
15+
This pattern deploys an AI code assistant with tiered user access, rate limiting, and NVIDIA Nemotron model serving. You can customize the models, rate limit policies, user tiers, and IDE configuration.
16+
17+
[id="changing-models-maas"]
18+
=== Changing models
19+
20+
The pattern serves two models by default:
21+
22+
* `nemotron-3-nano-30b-a3b-fp8` -- Available to premium and enterprise tier users.
23+
* `gpt-oss-20b` -- Available to all user tiers.
24+
25+
To change or add models, edit the `models` list in `overrides/maas-quickstart.yaml`. Models are pulled from OCI registries and do not require a Hugging Face API token.
26+
27+
The model definitions specify the model URI, resource requirements, GPU tolerations, and vLLM arguments. For example:
28+
29+
[source,yaml]
30+
----
31+
models:
32+
- name: gpt-oss-20b
33+
displayName: OpenAI gpt-oss-20b
34+
uri: oci://registry.redhat.io/rhelai1/modelcar-gpt-oss-20b:1.5
35+
resources:
36+
limits:
37+
cpu: "4"
38+
memory: 24Gi
39+
nvidia.com/gpu: "1"
40+
requests:
41+
cpu: "2"
42+
memory: 16Gi
43+
nvidia.com/gpu: "1"
44+
extraArgs:
45+
- --enable-force-include-usage
46+
tolerations:
47+
- effect: NoSchedule
48+
key: nvidia.com/gpu
49+
operator: Exists
50+
----
51+
52+
[NOTE]
53+
====
54+
Each model requires a GPU with at least 48 GB of VRAM. Adding models beyond the default two requires additional GPU nodes.
55+
====
56+
57+
[id="adjusting-rate-limits-maas"]
58+
=== Adjusting rate limits and user tiers
59+
60+
The pattern uses Kuadrant (Red Hat Connectivity Link) to enforce per-tier rate limits on inference requests. The default tiers and limits are:
61+
62+
[cols="1,1,2",options="header"]
63+
|===
64+
| Tier | Rate Limit | Description
65+
66+
| Free
67+
| 5 requests per 2 minutes
68+
| Basic access for evaluation
69+
70+
| Premium
71+
| 20 requests per 2 minutes
72+
| Standard production usage
73+
74+
| Enterprise
75+
| 50 requests per 2 minutes
76+
| High-throughput workloads
77+
|===
78+
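As a rough aid for choosing limits, the following sketch converts a tier's request limit into the average spacing between requests that a client can sustain. The `avg_request_interval` helper is hypothetical and not part of the pattern. It only divides the window by the limit and assumes nothing about how Kuadrant distributes requests within a window.

```shell
# Average spacing (in seconds) between requests that uses up a tier's limit
# exactly once per window. Arguments: <limit> <window-in-seconds>.
# avg_request_interval is a hypothetical helper, not part of the pattern.
avg_request_interval() {
  awk -v limit="$1" -v window="$2" 'BEGIN { printf "%.1f\n", window / limit }'
}

# Default tiers from the table above, each with a 2-minute (120 s) window.
while read -r tier limit; do
  printf '%-10s %s s\n' "$tier" "$(avg_request_interval "$limit" 120)"
done <<'EOF'
free 5
premium 20
enterprise 50
EOF
```

With the default limits, this prints 24.0, 6.0, and 2.4 seconds for the free, premium, and enterprise tiers.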
79+
To adjust rate limits, modify the `tiers` section in `overrides/maas-quickstart.yaml`. For example, to raise the premium tier request limit to 40 requests per 2 minutes and set the token limit to 20000 tokens per minute:
80+
81+
[source,yaml]
82+
----
83+
tiers:
84+
premium:
85+
users:
86+
- premium-user
87+
requestRates:
88+
- limit: 40
89+
window: 2m
90+
tokenRates:
91+
- limit: 20000
92+
window: 1m
93+
----
94+
95+
Push your changes to your forked repository so the GitOps framework applies the updated configuration.
96+
97+
[id="managing-users-maas"]
98+
=== Managing users
99+
100+
User authentication is handled by htpasswd with OpenShift OAuth. The default users are:
101+
102+
* `admin` -- Full administrative access (enterprise tier)
103+
* `free-user` -- Free tier access
104+
* `premium-user` -- Premium tier access
105+
* `enterprise-user` -- Enterprise tier access
106+
107+
User passwords are stored in your local secrets file (for example, `~/values-secret-ai-quickstart-maas-code-assistant.yaml`, copied from `values-secret.yaml.template`) and managed through HashiCorp Vault and the External Secrets Operator (ESO). To change a user password after initial deployment, update the secret value in that file and redeploy the pattern.
108+
109+
To assign users to different tiers, modify the `tiers` section in `overrides/maas-quickstart.yaml`:
110+
111+
[source,yaml]
112+
----
113+
tiers:
114+
free:
115+
users:
116+
- free-user
117+
premium:
118+
users:
119+
- premium-user
120+
- user1
121+
enterprise:
122+
users:
123+
- admin
124+
- enterprise-user
125+
----
126+
127+
[id="configuring-devspaces-maas"]
128+
=== Configuring OpenShift DevSpaces
129+
130+
The pattern integrates the Continue AI extension in OpenShift DevSpaces to provide code assistance directly in the IDE. DevSpaces is preconfigured to clone the AI Quickstart repository and connect to the vLLM inference endpoints.
131+
132+
To customize the DevSpaces configuration, you can adjust:
133+
134+
* Default IDE settings and extensions
135+
* Resource limits for developer workspaces
136+
* The inference endpoint URL used by the Continue extension
137+
138+
[id="gpu-node-provisioning-maas"]
139+
=== GPU node provisioning
140+
141+
This pattern requires at least 2 NVIDIA GPU nodes with 48 GB or more of VRAM each. On AWS, the pattern automatically provisions `g6e.2xlarge` GPU machine sets with NVIDIA L40S GPUs.
142+
143+
On other providers and bare metal, GPU nodes must already be part of the cluster before you deploy the pattern. The pattern installs all required Operators, including the NVIDIA GPU Operator, automatically during deployment.
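On cloud providers, one way to confirm that enough GPU nodes are present is to count nodes by instance type. The sketch below assumes that nodes carry the well-known `node.kubernetes.io/instance-type` label, which cloud providers set but bare-metal clusters typically do not. The `count_nodes_by_instance_type` helper is hypothetical and not part of the pattern.

```shell
# Count nodes whose instance-type label matches the given value.
# Argument: <instance-type>, for example g6e.2xlarge on AWS.
# count_nodes_by_instance_type is a hypothetical helper; it assumes you are
# already logged in to the cluster with oc.
count_nodes_by_instance_type() {
  oc get nodes -l "node.kubernetes.io/instance-type=$1" --no-headers 2>/dev/null \
    | wc -l | tr -d ' '
}
```

For example, running `count_nodes_by_instance_type g6e.2xlarge` on AWS should report at least 2 once the GPU machine sets have scaled up.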
Lines changed: 186 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,186 @@
1+
---
2+
title: Getting started
3+
weight: 10
4+
aliases: /maas-quickstart/getting-started/
5+
---
6+
7+
:toc:
8+
:imagesdir: /images
9+
:_content-type: ASSEMBLY
10+
include::modules/comm-attributes.adoc[]
11+
12+
[id="deploying-maas-quickstart-pattern"]
13+
== Deploying the MaaS Code Assistant AI Quickstart pattern
14+
15+
.Prerequisites
16+
17+
* An OpenShift cluster (version 4.20 or later). This pattern requires at least 2 NVIDIA GPU nodes with 48 GB or more of VRAM each.
18+
** *AWS*: The pattern automatically provisions 2 `g6e.2xlarge` GPU worker nodes (NVIDIA L40S) during installation. No GPU nodes need to be present before deploying.
19+
** *Other providers and bare metal*: GPU nodes must already be part of the OpenShift cluster before deploying this pattern. The pattern installs all required operators automatically.
20+
** To create an OpenShift cluster, go to the https://console.redhat.com/[Red Hat Hybrid Cloud console].
21+
** Select *OpenShift \-> Red Hat OpenShift Container Platform \-> Create cluster*.
22+
* The Helm binary. For instructions, see link:https://helm.sh/docs/intro/install/[Installing Helm].
23+
* The `oc` CLI tool. For instructions, see link:https://docs.openshift.com/container-platform/latest/cli_reference/openshift_cli/getting-started-cli.html[Getting started with the OpenShift CLI].
24+
* Additional installation tool dependencies. For details, see link:https://validatedpatterns.io/learn/quickstart/[Patterns quick start].
25+
26+
[id="preparing-for-deployment-maas"]
27+
== Preparing for deployment
28+
.Procedure
29+
30+
. Fork the link:https://github.com/validatedpatterns-sandbox/ai-quickstart-maas-code-assistant[ai-quickstart-maas-code-assistant] repository on GitHub. You must fork the repository to customize this pattern.
31+
32+
. Clone the forked copy of this repository:
33+
+
34+
[source,terminal]
35+
----
36+
$ git clone git@github.com:your-username/ai-quickstart-maas-code-assistant.git
37+
----
38+
39+
. Go to the root directory of your Git repository:
40+
+
41+
[source,terminal]
42+
----
43+
$ cd ai-quickstart-maas-code-assistant
44+
----
45+
46+
. Run the following command to set the upstream repository:
47+
+
48+
[source,terminal]
49+
----
50+
$ git remote add -f upstream git@github.com:validatedpatterns-sandbox/ai-quickstart-maas-code-assistant.git
51+
----
52+
53+
. Verify the setup of your remote repositories by running the following command:
54+
+
55+
[source,terminal]
56+
----
57+
$ git remote -v
58+
----
59+
+
60+
.Example output
61+
+
62+
[source,terminal]
63+
----
64+
origin git@github.com:your-username/ai-quickstart-maas-code-assistant.git (fetch)
65+
origin git@github.com:your-username/ai-quickstart-maas-code-assistant.git (push)
66+
upstream git@github.com:validatedpatterns-sandbox/ai-quickstart-maas-code-assistant.git (fetch)
67+
upstream git@github.com:validatedpatterns-sandbox/ai-quickstart-maas-code-assistant.git (push)
68+
----
69+
70+
. Make a local copy of the secrets template outside of your repository to hold credentials for the pattern.
71+
+
72+
[WARNING]
73+
====
74+
Do not add, commit, or push this file to your repository. Doing so might expose your credentials on GitHub.
75+
====
76+
+
77+
Run the following command:
78+
+
79+
[source,terminal]
80+
----
81+
$ cp values-secret.yaml.template ~/values-secret-ai-quickstart-maas-code-assistant.yaml
82+
----
83+
84+
. Populate this file with the user passwords needed for the pattern:
85+
+
86+
[source,terminal]
87+
----
88+
$ vim ~/values-secret-ai-quickstart-maas-code-assistant.yaml
89+
----
90+
91+
.. Edit the `htpasswd` section to set passwords for each user tier:
92+
+
93+
[source,yaml]
94+
----
95+
- name: htpasswd
96+
fields:
97+
- name: admin
98+
value: <admin-password>
99+
- name: free-user
100+
value: <free-user-password>
101+
- name: premium-user
102+
value: <premium-user-password>
103+
- name: enterprise-user
104+
value: <enterprise-user-password>
105+
----
106+
107+
. Optional: To customize the deployment, create and switch to a new branch by running the following command:
108+
+
109+
[source,terminal]
110+
----
111+
$ git checkout -b my-branch
112+
----
113+
+
114+
Make your changes, then stage and commit them:
115+
+
116+
[source,terminal]
117+
----
118+
$ git add <changed-files>
119+
$ git commit -m "Customize deployment"
120+
----
121+
+
122+
Push the changes to your forked repository:
123+
+
124+
[source,terminal]
125+
----
126+
$ git push origin my-branch
127+
----
128+
129+
[id="deploying-cluster-using-patternsh-file-maas"]
130+
== Deploying the pattern by using the pattern.sh file
131+
132+
To deploy the pattern by using the `pattern.sh` file, complete the following steps:
133+
134+
. Log in to your cluster by following this procedure:
135+
136+
.. Obtain an API token by visiting link:https://oauth-openshift.apps.<your_cluster>.<domain>/oauth/token/request[https://oauth-openshift.apps.<your_cluster>.<domain>/oauth/token/request].
137+
138+
.. Log in to the cluster by running the following command:
139+
+
140+
[source,terminal]
141+
----
142+
$ oc login --token=<retrieved-token> --server=https://api.<your_cluster>.<domain>:6443
143+
----
144+
+
145+
Or log in by running the following command:
146+
+
147+
[source,terminal]
148+
----
149+
$ export KUBECONFIG=~/<path_to_kubeconfig>
150+
----
151+
152+
. Deploy the pattern to your cluster. Run the following command:
153+
+
154+
[source,terminal]
155+
----
156+
$ ./pattern.sh make install
157+
----
158+
159+
.Verification
160+
161+
To verify a successful installation, check the health of the Argo CD applications:
162+
163+
. Run the following command:
164+
+
165+
[source,terminal]
166+
----
167+
$ ./pattern.sh make argo-healthcheck
168+
----
169+
+
170+
It might take several minutes for all applications to synchronize and reach a healthy state. This includes pulling the model images and configuring the inference endpoints.
171+
172+
. Verify that the Operators are installed by navigating to *Operators -> Installed Operators* in the {ocp} web console. Confirm the following Operators are present:
173+
+
174+
* NVIDIA GPU Operator
175+
* {rhoai}
176+
* Red Hat OpenShift DevSpaces
177+
* Red Hat Connectivity Link
178+
179+
. After all applications are healthy, verify the inference endpoints are serving by running:
180+
+
181+
[source,terminal]
182+
----
183+
$ oc get inferenceservice -A
184+
----
185+
186+
. Access the OpenShift DevSpaces dashboard to confirm the IDE environment is available. Navigate to *Networking -> Routes* in the DevSpaces namespace and open the route URL.
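To go one step beyond listing the `InferenceService` resources, you can query an endpoint directly. vLLM exposes an OpenAI-compatible API, so a request to `/v1/models` is a lightweight check. The sketch below is an assumption-laden example: `list_served_models` is a hypothetical helper, the base URL and bearer token are placeholders that you must supply, and the gateway in front of the models might require a different path or token type.

```shell
# List the models exposed by an OpenAI-compatible endpoint.
# Arguments: <base-url> <bearer-token>.
# Both values are placeholders; list_served_models is a hypothetical helper
# and is not part of the pattern.
list_served_models() {
  curl -sS --fail -H "Authorization: Bearer $2" "$1/v1/models"
}
```

For example, `list_served_models "https://<inference-endpoint>" "<token>"` should return a JSON list of model IDs if the endpoint is reachable and the token is accepted.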
