README.md
Lines changed: 45 additions & 29 deletions
Original file line number
Diff line number
Diff line change
@@ -2,28 +2,34 @@
2
2
3
3
## Introduction
4
4
5
-
This deployment is based on the `validated pattern framework`, using GitOps for
6
-
seamless provisioning of all operators and applications. It deploys a Chatbot
7
-
application that harnesses the power of Large Language Models (LLMs) combined
5
+
This deployment uses the [**Validated Patterns**](https://validatedpatterns.io/) framework,
6
+
taking advantage of GitOps for seamless provisioning of all operators and applications.
7
+
It deploys a Chatbot application that harnesses the power of Large Language Models (LLMs) combined
8
8
with the Retrieval-Augmented Generation (RAG) framework.
9
9
10
-
The pattern uses the [Red Hat OpenShift AI](https://www.redhat.com/en/technologies/cloud-computing/openshift/openshift-ai) to deploy and serve LLM models at scale.
10
+
The pattern uses [**Red Hat OpenShift AI**](https://www.redhat.com/en/technologies/cloud-computing/openshift/openshift-ai) to deploy and serve LLM models at scale.
11
11
12
-
The application uses either the [EDB Postgres for Kubernetes operator](https://catalog.redhat.com/software/container-stacks/detail/5fb41c88abd2a6f7dbe1b37b)
13
-
(default) or Redis to store embeddings of Red Hat products, running on Red Hat
14
-
OpenShift to generate project proposals for specific Red Hat products.
12
+
By default, this pattern uses [**pgvector**](https://github.com/pgvector/pgvector) as the RAG DB backend.
(either a local deployment as part of the pattern or an existing SQL Server DB on Azure) are also options
18
+
for RAG DB backends.
19
+
20
+
This pattern populates your chosen RAG DB with documents relating to Red Hat OpenShift AI for the purpose
21
+
of generating project proposals.
15
22
16
23
## Pre-requisites
17
24
18
25
- Podman
19
26
- Red Hat OpenShift cluster running in AWS. Supported regions are: us-east-1, us-east-2, us-west-1, us-west-2, ca-central-1, sa-east-1, eu-west-1, eu-west-2, eu-west-3, eu-central-1, eu-north-1, ap-northeast-1, ap-northeast-2, ap-northeast-3, ap-southeast-1, ap-southeast-2, ap-south-1.
20
-
- GPU Node to run Hugging Face Text Generation Inference server on Red Hat OpenShift cluster.
21
27
- Create a fork of the [rag-llm-gitops](https://github.com/validatedpatterns/rag-llm-gitops.git) Git repository.
22
28
- **EDB Postgres Operator Credentials** (Required only if you select EDB): The EDB Postgres for Kubernetes operator from the certified-operators catalog requires authentication to pull images from `docker.enterprisedb.com`. You will need to:
23
29
1. Register for a free trial account at [EDB Registration](https://www.enterprisedb.com/accounts/register)
24
30
2. Obtain your subscription token from [EDB Repos Downloads](https://www.enterprisedb.com/repos-downloads)
25
31
3. Add the token to your `values-secret.yaml` file during configuration (see below)
26
-
32
+
27
33
For more details, see the [EDB Installation Documentation](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/installation_upgrade/).
28
34
29
35
## Demo Description & Architecture
@@ -111,7 +117,7 @@ cd rag-llm-gitops
111
117
112
118
### Configuring model
113
119
114
-
This pattern deploys [IBM Granite 3.1-8B-Instruct](https://huggingface.co/ibm-granite/granite-3.1-8b-instruct) out of box. Run the following command to configure vault with the model ID.
120
+
This pattern deploys [IBM Granite 3.3-8B-Instruct](https://huggingface.co/ibm-granite/granite-3.3-8b-instruct) out of the box. Run the following command to configure Vault with the model ID.
115
121
116
122
```sh
117
123
# Copy values-secret.yaml.template to ~/values-secret-rag-llm-gitops.yaml.
@@ -120,53 +126,49 @@ This pattern deploys [IBM Granite 3.1-8B-Instruct](https://huggingface.co/ibm-gr
To deploy a model that can requires an Hugging Face token, grab the [Hugging Face token](https://huggingface.co/settings/tokens) and accept the terms and conditions on the model page. Edit ~/values-secret-rag-llm-gitops.yaml to replace the `model Id` and the `Hugging Face` token.
129
+
To deploy a model that requires a Hugging Face token, grab the [Hugging Face token](https://huggingface.co/settings/tokens) and accept the terms and conditions on the model page. Update the `hftoken` secret in
130
+
`~/values-secret-rag-llm-gitops.yaml` and edit the value of `.global.model.vllm` in
131
+
[`values-global.yaml`](./values-global.yaml) to your desired model.
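As a rough illustration of the `.global.model.vllm` edit described above, the fragment below shows how the key might look in `values-global.yaml`. The nesting is inferred from the dotted path, the model ID is the default named in this README, and all other keys in the file are omitted, so treat it as a sketch rather than the file's actual contents.

```yaml
# Sketch only: nesting inferred from the path `.global.model.vllm`;
# every other key in values-global.yaml is omitted here.
global:
  model:
    # Hugging Face model ID to serve (the default named in this README)
    vllm: ibm-granite/granite-3.3-8b-instruct
```

Replacing the value with another Hugging Face model ID would, under the same assumption, select that model instead; gated models additionally need the `hftoken` secret set.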
124
132
125
-
**IMPORTANT**: If you are using EDB Postgres for Kubernetes, you must add your EDB subscription token to the `values-secret.yaml` file:
133
+
**IMPORTANT**: If you are using EDB Postgres for Kubernetes, you must add your EDB subscription token to
134
+
`~/values-secret-rag-llm-gitops.yaml`:
126
135
127
136
```yaml
128
137
secrets:
129
138
- name: hfmodel
130
139
fields:
131
140
- name: hftoken
132
141
value: null
133
-
- name: modelId
134
-
value: "ibm-granite/granite-3.1-8b-instruct"
135
142
- name: edb
136
143
fields:
137
144
- name: token
138
145
value: "YOUR_EDB_TOKEN_HERE" # Replace with your EDB subscription token
139
146
description: EDB subscription token for pulling certified operator images
140
-
- name: minio
141
-
fields:
142
-
- name: MINIO_ROOT_USER
143
-
value: minio
144
-
- name: MINIO_ROOT_PASSWORD
145
-
value: null
146
-
onMissingValue: generate
147
147
```
148
148
149
149
The EDB token is synced into Vault and then used by External Secrets to create the required pull secret (`postgresql-operator-pull-secret`) in `openshift-operators`. Without this token, the EDB operator will fail to pull its container image and the database will not be created.
150
150
151
-
### Provision GPU MachineSet
151
+
If you are using PGVector or SQL Server, you can update the password in this file. Otherwise, an autogenerated
152
+
password is used.
152
153
153
-
As a pre-requisite to deploy the application using the validated pattern, GPU nodes should be provisioned along with Node Feature Discovery Operator and NVIDIA GPU operator. To provision GPU Nodes
154
+
### Provision GPU MachineSet
154
155
155
-
Following command will take about 5-10 minutes.
156
+
As a pre-requisite to deploy the application using this Validated Pattern, a GPU node needs to be provisioned.
157
+
To provision the GPU node on AWS:
156
158
157
159
```sh
158
160
./pattern.sh make create-gpu-machineset
159
161
```
160
162
161
-
Wait till the nodes are provisioned and running.
163
+
Wait until the node is provisioned and running.
162
164
163
165

164
166
165
-
Alternatiely, follow the [instructions](./GPU_provisioning.md) to manually install GPU nodes, Node Feature Discovery Operator and NVIDIA GPU operator.
167
+
Alternatively, follow the [instructions](./GPU_provisioning.md) to manually install the GPU node.
166
168
167
169
### Deploy application
168
170
169
-
***Note:**: This pattern supports four types of vector databases: PGVECTOR (local chart), EDB Postgres forKubernetes, Elasticsearch, and Redis. By default the pattern will deploy PGVECTOR as a vector DB. To deploy EDB, set `global.db.type` to `EDB`in [values-global.yaml](./values-global.yaml).
171
+
**Note:** This pattern supports five types of vector databases: pgvector, EDB Postgres for Kubernetes, Elasticsearch, Redis, and SQL Server. By default, the pattern deploys pgvector as the RAG DB. To deploy EDB, set `global.db.type` to `EDB` in [values-global.yaml](./values-global.yaml).
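A minimal sketch of that override, assuming `db` nests under `global` exactly as the dotted path `global.db.type` suggests. Other keys in `values-global.yaml` are omitted.

```yaml
# Sketch only: nesting inferred from the path `global.db.type`.
global:
  db:
    # EDB selects EDB Postgres for Kubernetes as the RAG DB backend
    type: EDB
```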
170
172
171
173
```yaml
172
174
---
@@ -176,14 +178,28 @@ global:
176
178
useCSV: false
177
179
syncPolicy: Automatic
178
180
installPlanApproval: Automatic
179
-
# Possible value for db.type = [REDIS, EDB, ELASTIC, PGVECTOR]
181
+
# Possible values for RAG vector DB db.type:
182
+
# REDIS -> Redis (Local chart deploy)
183
+
# EDB -> PGVector via EDB operator (Local chart deploy)