Commit f1ae720

committed
use external secrets for pgvector secret
1 parent 49c29dd commit f1ae720

9 files changed

Lines changed: 131 additions & 94 deletions

README.md

Lines changed: 45 additions & 29 deletions
Original file line numberDiff line numberDiff line change
@@ -2,28 +2,34 @@
22

33
## Introduction
44

5-
This deployment is based on the `validated pattern framework`, using GitOps for
6-
seamless provisioning of all operators and applications. It deploys a Chatbot
7-
application that harnesses the power of Large Language Models (LLMs) combined
5+
This deployment uses the [**Validated Patterns**](https://validatedpatterns.io/) framework,
6+
taking advantage of GitOps for seamless provisioning of all operators and applications.
7+
It deploys a Chatbot application that harnesses the power of Large Language Models (LLMs) combined
88
with the Retrieval-Augmented Generation (RAG) framework.
99

10-
The pattern uses the [Red Hat OpenShift AI](https://www.redhat.com/en/technologies/cloud-computing/openshift/openshift-ai) to deploy and serve LLM models at scale.
10+
The pattern uses [**Red Hat OpenShift AI**](https://www.redhat.com/en/technologies/cloud-computing/openshift/openshift-ai) to deploy and serve LLM models at scale.
1111

12-
The application uses either the [EDB Postgres for Kubernetes operator](https://catalog.redhat.com/software/container-stacks/detail/5fb41c88abd2a6f7dbe1b37b)
13-
(default) or Redis to store embeddings of Red Hat products, running on Red Hat
14-
OpenShift to generate project proposals for specific Red Hat products.
12+
By default, this pattern uses [**pgvector**](https://github.com/pgvector/pgvector) as the RAG DB backend.
13+
[**EDB Postgres**](https://www.enterprisedb.com/docs/edb-postgres-ai/latest/ai-factory/vector-engine/),
14+
[**Redis**](https://redis.io/docs/latest/develop/get-started/vector-database/),
15+
[**Elasticsearch**](https://www.elastic.co/elasticsearch/vector-database), and
16+
[**Microsoft SQL Server**](https://learn.microsoft.com/en-us/sql/sql-server/ai/vectors?view=sql-server-ver17)
17+
(either a local deployment as part of the pattern or an existing SQL Server DB on Azure) are also options
18+
for RAG DB backends.
19+
20+
This pattern populates your chosen RAG DB with documents relating to Red Hat OpenShift AI for the purpose
21+
of generating project proposals.
1522

1623
## Pre-requisites
1724

1825
- Podman
1926
- Red Hat Openshift cluster running in AWS. Supported regions are : us-east-1 us-east-2 us-west-1 us-west-2 ca-central-1 sa-east-1 eu-west-1 eu-west-2 eu-west-3 eu-central-1 eu-north-1 ap-northeast-1 ap-northeast-2 ap-northeast-3 ap-southeast-1 ap-southeast-2 ap-south-1.
20-
- GPU Node to run Hugging Face Text Generation Inference server on Red Hat OpenShift cluster.
2127
- Create a fork of the [rag-llm-gitops](https://github.com/validatedpatterns/rag-llm-gitops.git) Git repository.
2228
- **EDB Postgres Operator Credentials** (Required only if you select EDB): The EDB Postgres for Kubernetes operator from the certified-operators catalog requires authentication to pull images from `docker.enterprisedb.com`. You will need to:
2329
1. Register for a free trial account at [EDB Registration](https://www.enterprisedb.com/accounts/register)
2430
2. Obtain your subscription token from [EDB Repos Downloads](https://www.enterprisedb.com/repos-downloads)
2531
3. Add the token to your `values-secret.yaml` file during configuration (see below)
26-
32+
2733
For more details, see the [EDB Installation Documentation](https://www.enterprisedb.com/docs/postgres_for_kubernetes/latest/installation_upgrade/).
2834

2935
## Demo Description & Architecture
@@ -111,7 +117,7 @@ cd rag-llm-gitops
111117
112118
### Configuring model
113119
114-
This pattern deploys [IBM Granite 3.1-8B-Instruct](https://huggingface.co/ibm-granite/granite-3.1-8b-instruct) out of box. Run the following command to configure vault with the model ID.
120+
This pattern deploys [IBM Granite 3.3-8B-Instruct](https://huggingface.co/ibm-granite/granite-3.3-8b-instruct) out of box. Run the following command to configure vault with the model ID.
115121
116122
```sh
117123
# Copy values-secret.yaml.template to ~/values-secret-rag-llm-gitops.yaml.
@@ -120,53 +126,49 @@ This pattern deploys [IBM Granite 3.1-8B-Instruct](https://huggingface.co/ibm-gr
120126
cp values-secret.yaml.template ~/values-secret-rag-llm-gitops.yaml
121127
```
122128
123-
To deploy a model that can requires an Hugging Face token, grab the [Hugging Face token](https://huggingface.co/settings/tokens) and accept the terms and conditions on the model page. Edit ~/values-secret-rag-llm-gitops.yaml to replace the `model Id` and the `Hugging Face` token.
129+
To deploy a model that requires a Hugging Face token, grab the [Hugging Face token](https://huggingface.co/settings/tokens) and accept the terms and conditions on the model page. Update the `hftoken` secret in
130+
`~/values-secret-rag-llm-gitops.yaml` and edit the value of `.global.model.vllm` in
131+
[`values-global.yaml`](./values-global.yaml) to your desired model.
124132
125-
**IMPORTANT**: If you are using EDB Postgres for Kubernetes, you must add your EDB subscription token to the `values-secret.yaml` file:
133+
**IMPORTANT**: If you are using EDB Postgres for Kubernetes, you must add your EDB subscription token to
134+
`~/values-secret-rag-llm-gitops.yaml`:
126135
127136
```sh
128137
secrets:
129138
- name: hfmodel
130139
fields:
131140
- name: hftoken
132141
value: null
133-
- name: modelId
134-
value: "ibm-granite/granite-3.1-8b-instruct"
135142
- name: edb
136143
fields:
137144
- name: token
138145
value: "YOUR_EDB_TOKEN_HERE" # Replace with your EDB subscription token
139146
description: EDB subscription token for pulling certified operator images
140-
- name: minio
141-
fields:
142-
- name: MINIO_ROOT_USER
143-
value: minio
144-
- name: MINIO_ROOT_PASSWORD
145-
value: null
146-
onMissingValue: generate
147147
```
148148
149149
The EDB token is synced into Vault and then used by External Secrets to create the required pull secret (`postgresql-operator-pull-secret`) in `openshift-operators`. Without this token, the EDB operator will fail to pull its container image and the database will not be created.
150150
151-
### Provision GPU MachineSet
151+
If you are using PGVector or SQL Server, you can update the password in this file. Otherwise, an autogenerated
152+
password is used.
152153
153-
As a pre-requisite to deploy the application using the validated pattern, GPU nodes should be provisioned along with Node Feature Discovery Operator and NVIDIA GPU operator. To provision GPU Nodes
154+
### Provision GPU MachineSet
154155
155-
Following command will take about 5-10 minutes.
156+
As a pre-requisite to deploy the application using this Validated Pattern, a GPU node needs to be provisioned.
157+
To provision the GPU node on AWS:
156158
157159
```sh
158160
./pattern.sh make create-gpu-machineset
159161
```
160162
161-
Wait till the nodes are provisioned and running.
163+
Wait till the node is provisioned and running.
162164
163165
![Diagram](images/nodes.png)
164166
165-
Alternatiely, follow the [instructions](./GPU_provisioning.md) to manually install GPU nodes, Node Feature Discovery Operator and NVIDIA GPU operator.
167+
Alternatiely, follow the [instructions](./GPU_provisioning.md) to manually install the GPU node.
166168
167169
### Deploy application
168170
169-
***Note:**: This pattern supports four types of vector databases: PGVECTOR (local chart), EDB Postgres for Kubernetes, Elasticsearch, and Redis. By default the pattern will deploy PGVECTOR as a vector DB. To deploy EDB, set `global.db.type` to `EDB` in [values-global.yaml](./values-global.yaml).
171+
***Note:**: This pattern supports five types of vector databases: pgvector, EDB Postgres for Kubernetes, Elasticsearch, Redis, and SQL Server. By default the pattern will deploy pgvector as the RAG DB. To deploy EDB, set `global.db.type` to `EDB` in [values-global.yaml](./values-global.yaml).
170172
171173
```yaml
172174
---
@@ -176,14 +178,28 @@ global:
176178
useCSV: false
177179
syncPolicy: Automatic
178180
installPlanApproval: Automatic
179-
# Possible value for db.type = [REDIS, EDB, ELASTIC, PGVECTOR]
181+
# Possible values for RAG vector DB db.type:
182+
# REDIS -> Redis (Local chart deploy)
183+
# EDB -> PGVector via EDB operator (Local chart deploy)
184+
# PGVECTOR -> PGVector (Local Postgres chart deploy)
185+
# ELASTIC -> Elasticsearch (Local chart deploy)
186+
# MSSQL -> MS SQL Server (Local chart deploy)
187+
# AZURESQL -> Azure SQL (Pre-existing in Azure)
180188
db:
181189
index: docs
182-
type: PGVECTOR # <--- Default is PGVECTOR. Use EDB, REDIS, or ELASTIC as needed.
190+
type: PGVECTOR
191+
# Models used by the inference service (should be a HuggingFace model ID)
192+
model:
193+
vllm: ibm-granite/granite-3.3-8b-instruct
194+
embedding: sentence-transformers/all-mpnet-base-v2
195+
196+
storageClass: gp3-csi
197+
183198
main:
184199
clusterGroupName: hub
185200
multiSourceConfig:
186201
enabled: true
202+
clusterGroupChartVersion: 0.9.*
187203
```
188204
189205
Following commands will take about 15-20 minutes
Lines changed: 33 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -1,15 +1,35 @@
1-
{{- if and (eq .Values.global.db.type "PGVECTOR") .Values.secret.create }}
2-
kind: Secret
3-
apiVersion: v1
1+
{{- if eq .Values.global.db.type "PGVECTOR" }}
2+
apiVersion: "external-secrets.io/v1beta1"
3+
kind: ExternalSecret
44
metadata:
5-
name: vectordb-app
6-
labels:
7-
{{- include "pgvector.labels" . | nindent 4 }}
8-
data:
9-
username: {{ .Values.secret.user | b64enc | quote }}
10-
password: {{ .Values.secret.password | b64enc | quote }}
11-
host: {{ default (include "pgvector.fullname" .) .Values.secret.host | b64enc | quote }}
12-
port: {{ .Values.secret.port | b64enc | quote }}
13-
dbname: {{ .Values.secret.dbname | b64enc | quote }}
14-
type: Opaque
5+
name: pgvector-external-secret
6+
spec:
7+
refreshInterval: 15s
8+
secretStoreRef:
9+
name: {{ .Values.secretStore.name }}
10+
kind: {{ .Values.secretStore.kind }}
11+
target:
12+
name: vectordb-app
13+
template:
14+
type: Opaque
15+
engineVersion: v2
16+
data:
17+
username: "{{ `{{ .username }}` }}"
18+
password: "{{ `{{ .password }}` }}"
19+
dbname: "{{ `{{ .dbname }}` }}"
20+
host: {{ include "pgvector.fullname" . }}
21+
port: "{{ .Values.service.port }}"
22+
data:
23+
- secretKey: username
24+
remoteRef:
25+
key: {{ .Values.secretStore.key }}
26+
property: username
27+
- secretKey: password
28+
remoteRef:
29+
key: {{ .Values.secretStore.key }}
30+
property: password
31+
- secretKey: dbname
32+
remoteRef:
33+
key: {{ .Values.secretStore.key }}
34+
property: dbname
1535
{{- end }}
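
For review purposes, the manifest below is a rough sketch of what this template might render to with the `secretStore` defaults added in `values.yaml`. Two values are assumptions not confirmed by this diff: that `pgvector.fullname` resolves to `pgvector` and that `.Values.service.port` is `5432` (both borrowed from the removed `secret.host` and `secret.port` defaults). The nesting follows the usual ExternalSecret layout, since the scraped diff has lost its indentation.

```yaml
# Sketch of the rendered ExternalSecret; host and port values are assumed.
apiVersion: "external-secrets.io/v1beta1"
kind: ExternalSecret
metadata:
  name: pgvector-external-secret
spec:
  refreshInterval: 15s
  secretStoreRef:
    name: vault-backend
    kind: ClusterSecretStore
  target:
    # Same Secret name the removed template created, so consumers are unchanged.
    name: vectordb-app
    template:
      type: Opaque
      engineVersion: v2
      data:
        # Helm's backtick escaping leaves these for External Secrets to fill in.
        username: "{{ .username }}"
        password: "{{ .password }}"
        dbname: "{{ .dbname }}"
        host: pgvector   # assumed value of pgvector.fullname
        port: "5432"     # assumed value of .Values.service.port
  data:
  - secretKey: username
    remoteRef:
      key: secret/data/hub/pgvector
      property: username
  - secretKey: password
    remoteRef:
      key: secret/data/hub/pgvector
      property: password
  - secretKey: dbname
    remoteRef:
      key: secret/data/hub/pgvector
      property: dbname
```

The target Secret keeps the name `vectordb-app` and the same five keys, so workloads that read the old chart-generated Secret should not need changes; only the source of the credentials moves to Vault.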

charts/all/rag-llm/charts/pgvector/values.yaml

Lines changed: 4 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -73,18 +73,10 @@ volumeMounts:
7373
nodeSelector: {}
7474
affinity: {}
7575

76-
# Secret configuration for pgvector
77-
# These values can be overridden from a parent chart or via --set flags
78-
# Example: helm install ...
79-
# --set pgvector.secret.user=myuser -
80-
# --set pgvector.secret.password=mypass
81-
secret:
82-
create: true
83-
user: postgres
84-
password: rag_password
85-
dbname: rag_blueprint
86-
host: pgvector
87-
port: "5432"
76+
secretStore:
77+
name: vault-backend
78+
kind: ClusterSecretStore
79+
key: secret/data/hub/pgvector
8880

8981
extraDatabases: []
9082
#- name: test_db

charts/all/rhods/templates/dsc.yaml

Lines changed: 0 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -14,18 +14,8 @@ spec:
1414
managementState: Managed
1515
kueue:
1616
managementState: Removed
17-
codeflare:
18-
managementState: Removed
1917
ray:
2018
managementState: Removed
21-
modelmeshserving:
22-
managementState: Managed
2319
kserve:
2420
managementState: Managed
25-
serving:
26-
ingressGateway:
27-
certificate:
28-
type: SelfSigned
29-
managementState: Managed
30-
name: knative-serving
3121
rawDeploymentServiceConfig: Headed

charts/all/vllm-inference-service/templates/accelerator-profile.yaml

Lines changed: 0 additions & 13 deletions
This file was deleted.
Lines changed: 20 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,20 @@
1+
{{- if .Values.hardwareProfile.enabled }}
2+
apiVersion: infrastructure.opendatahub.io/v1
3+
kind: HardwareProfile
4+
metadata:
5+
name: nvidia-gpu
6+
namespace: redhat-ods-applications
7+
spec:
8+
identifiers:
9+
- displayName: NVIDIA GPU
10+
defaultCount: 1
11+
minCount: 1
12+
maxCount: 1
13+
identifier: nvidia.com/gpu
14+
resourceType: Accelerator
15+
scheduling:
16+
type: Node
17+
node:
18+
tolerations:
19+
{{- toYaml .Values.vllmInferenceService.tolerations | nindent 8 }}
20+
{{- end }}

charts/all/vllm-inference-service/values.yaml

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -5,7 +5,7 @@ global:
55
vllmInferenceService:
66
annotations:
77
openshift.io/display-name: vllm-inference
8-
serving.kserve.io/deploymentMode: RawDeployment
8+
serving.kserve.io/deploymentMode: Standard
99
argocd.argoproj.io/sync-wave: "20"
1010

1111
predictor:
@@ -57,5 +57,5 @@ vllmServingRuntime:
5757

5858
port: 8080
5959

60-
acceleratorProfile:
60+
hardwareProfile:
6161
enabled: true

values-hub.yaml

Lines changed: 0 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -38,12 +38,6 @@ clusterGroup:
3838
name: elasticsearch-eck-operator-certified
3939
namespace: rag-llm
4040
source: certified-operators
41-
serverless:
42-
name: serverless-operator
43-
namespace: openshift-serverless
44-
servicemesh:
45-
name: servicemeshoperator
46-
namespace: openshift-operators
4741
rhoai:
4842
name: rhods-operator
4943
namespace: redhat-ods-operator

values-secret.yaml.template

Lines changed: 27 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -2,15 +2,8 @@
22
# https://github.com/validatedpatterns/common/tree/main/ansible/roles/vault_utils#values-secret-file-format
33

44
version: "2.0"
5-
# Ideally you NEVER COMMIT THESE VALUES TO GIT (although if all passwords are
6-
# automatically generated inside the vault this should not really matter)
7-
8-
# In order to use huggingface models, you will need to
9-
# provide your token as a value for hftoken below.
10-
11-
# EDB Postgres Operator requires authentication to pull images from docker.enterprisedb.com
12-
# Register for a free trial at: https://www.enterprisedb.com/accounts/register
13-
# Get your token from: https://www.enterprisedb.com/repos-downloads
5+
# Do not update sensitive secrets (db credentials) in this file and commit to git.
6+
# Copy this template file to ~/values-secret-rag-llm-gitops and update secrets in your home directory
147

158
backingStore: vault
169

@@ -22,22 +15,47 @@ vaultPolicies:
2215
rule "charset" { charset = "0123456789" min-chars = 1 }
2316

2417
secrets:
18+
# This must be set to use models requiring huggingface authentication
19+
# The default model (ibm-granite/granite-3.3-8b-instruct) does not require authentication
2520
- name: hfmodel
2621
fields:
2722
- name: hftoken
2823
value: null
24+
25+
# Only used when .global.db is set to PGVECTOR in values-global.yaml
26+
- name: pgvector
27+
fields:
28+
- name: username
29+
value: postgres
30+
- name: password
31+
onMissingValue: generate
32+
override: true
33+
vaultPolicy: basicPolicy
34+
- name: dbname
35+
value: rag_blueprint
36+
37+
# Only used when .global.db is set to EDB in values-global.yaml
38+
# EDB Postgres Operator requires authentication to pull images from docker.enterprisedb.com
39+
# Register for a free trial at: https://www.enterprisedb.com/accounts/register
40+
# Get your token from: https://www.enterprisedb.com/repos-downloads
2941
- name: edb
3042
fields:
3143
- name: token
3244
value: null
3345
description: EDB subscription token for pulling certified operator images
46+
47+
# Only used when .global.db is set to MSSQL in values-global.yaml
48+
# The pattern creates a local SQL Server deployment. To use an existing SQL Server DB on Azure, use secret below.
3449
- name: mssql
3550
fields:
3651
- name: sa-pass
3752
onMissingValue: generate
3853
override: true
3954
vaultPolicy: basicPolicy
4055
description: mssql password for sa user
56+
57+
# Only used when .global.db is set to AZURESQL in values-global.yaml
58+
# The Azure SQL Server database needs to be created outside of the pattern.
4159
- name: azuresql
4260
fields:
4361
- name: user
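
As an illustration of the new `pgvector` entry in this template, a local copy at `~/values-secret-rag-llm-gitops.yaml` could pin the database password instead of letting Vault generate it. This is a minimal sketch that assumes the same values-secret v2 field format used by the other entries; `CHANGE_ME` is a placeholder, not a value from the commit.

```yaml
# Hypothetical override in ~/values-secret-rag-llm-gitops.yaml (never commit this file).
secrets:
  - name: pgvector
    fields:
      - name: username
        value: postgres
      - name: password
        # Explicit value replaces "onMissingValue: generate" from the template.
        value: CHANGE_ME
      - name: dbname
        value: rag_blueprint
```

Leaving the template entry as-is keeps the autogenerated password, which avoids storing a database credential in a local file.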
