Commit 6576038

Incorporated Jordan's and Gabe's comments
1 parent 65c37eb commit 6576038

10 files changed

Lines changed: 48 additions & 33 deletions

assemblies/shared/assembly-appendix-llm-requirements.adoc

Lines changed: 2 additions & 2 deletions
@@ -13,11 +13,11 @@ include::../modules/shared/con-large-language-model-llm-requirements.adoc[levelo

 include::../modules/shared/con-openai-model-integration-for-your-deployment.adoc[leveloffset=+1]

-include::../modules/shared/con-ollama-model-integration-for-local-development-environments.adoc[leveloffset=+1]
+include::../modules/shared/con-ollama-model-integration-requirements.adoc[leveloffset=+1]

 include::../modules/shared/con-vllm-model-integration-for-high-throughput-inference.adoc[leveloffset=+1]

-include::../modules/shared/con-vertex-ai-integration-for-scalable-model-deployment.adoc[leveloffset=+1]
+include::../modules/shared/con-vertex-ai-integration-for-gemini-models.adoc[leveloffset=+1]

 ifdef::parent-context[:context: {parent-context}]
 ifndef::parent-context[:!context:]

modules/shared/con-large-language-model-llm-requirements.adoc

Lines changed: 3 additions & 1 deletion
@@ -8,7 +8,9 @@ To plan your {ls-short} deployment, you must determine which compatible large la

 {ls-short} operates on a _Bring Your Own Model (BYOM)_ architecture. Because the service does not include a native model, you must connect a compatible inference provider during installation.

-The underlying {lcs-short} service integrates with several platforms that support the OpenAI API specification or utilize the vLLM inference engine. Because there is no explicit {rhoai-brand-name} provider option in the configuration, you must route those deployments through the vLLM or OpenAI-compatible provider settings.
+The underlying {lcs-short} service integrates with platforms that support the OpenAI API specification or utilize the vLLM inference engine. Because there is no explicit {rhoai-brand-name} provider option in the configuration, you must route those deployments through the vLLM or OpenAI-compatible provider settings.
+
+The `vllm` provider type communicates with endpoints that conform to the OpenAI API schema by automatically appending `/v1` to the configured provider URL. This mechanism allows you to use the `vllm` configuration for other hosted, OpenAI-compliant inference providers.

 {ls-short} supports the following inference provider configurations:

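
The `/v1` behavior described in the hunk above can be pictured with a short Python sketch. It is illustrative only and is not the {lcs-short} implementation: the function name `openai_base_url` and the example host are hypothetical, and stripping a trailing slash before appending is an assumption.

```python
def openai_base_url(provider_url: str) -> str:
    """Return the provider URL with "/v1" appended.

    Hypothetical helper: the real service may normalize URLs differently.
    """
    # Assumption: drop any trailing slash so the result has no "//v1".
    return provider_url.rstrip("/") + "/v1"


# Example with a made-up host name.
print(openai_base_url("https://inference.example.com"))
```

Under this assumption, a configured URL that already ends in `/v1` would gain a second `/v1`, so the provider URL would be entered without that suffix.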

modules/shared/con-ollama-model-integration-for-local-development-environments.adoc

Lines changed: 0 additions & 16 deletions
This file was deleted.

modules/shared/con-ollama-model-integration-requirements.adoc

Lines changed: 22 additions & 0 deletions
@@ -0,0 +1,22 @@
+:_mod-docs-content-type: CONCEPT
+
+[id="ollama-model-integration-requirements_{context}"]
+= Ollama model integration requirements
+
+[role="_abstract"]
+To integrate the open-source Ollama framework with {ls-short}, you must ensure that your network topology allows the {ls-short} service to route traffic to the Ollama server endpoint.
+
+The Ollama server operates as a containerized layer, providing a command-line interface (CLI) to download, manage, and execute open-source models such as Llama 3 and Mistral. You can deploy Ollama on both local workstations and cluster environments.
+
+However, a cluster-deployed {ls-short} instance cannot access an Ollama server that runs exclusively on a workstation `localhost` interface. For cluster deployments, the Ollama server must reside on an externally accessible network perimeter or run directly inside the cluster.
+
+The following integration configurations are supported:
+* Both {ls-short} and Ollama deploy on a local workstation.
+* {ls-short} deploys locally and connects to an externally accessible cluster Ollama server.
+* Both {ls-short} and Ollama deploy inside the cluster infrastructure.
+
+
+.Additional resources
+* link:https://ollama.com[Ollama project website]
+* link:https://hub.docker.com/r/ollama/ollama[Ollama server container image]
+
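
The `localhost` constraint in the Ollama module could be checked before deployment with something like the Python sketch below. It is a hedged illustration and not part of {ls-short}: the helper name `is_loopback_endpoint` and the example URLs are hypothetical. It reports only whether an endpoint URL points at a loopback address, which a cluster-deployed instance could not reach.

```python
import ipaddress
from urllib.parse import urlparse


def is_loopback_endpoint(url: str) -> bool:
    """Return True if the URL's host is "localhost" or a loopback IP address."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        # Covers 127.0.0.0/8 and ::1.
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, so treat it as a routable DNS name.
        return False


# Hypothetical endpoints; 11434 is the default Ollama port.
for endpoint in ("http://localhost:11434", "http://ollama.example.com:11434"):
    print(endpoint, is_loopback_endpoint(endpoint))
```

A `True` result would suggest that the endpoint fits only the configuration in which both services run on the same workstation.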

modules/shared/con-vertex-ai-integration-for-gemini-models.adoc

Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
+:_mod-docs-content-type: CONCEPT
+
+[id="vertex-ai-integration-for-gemini-models_{context}"]
+= Vertex AI integration for Gemini models
+
+[role="_abstract"]
+To use Gemini models with {ls-short}, you can configure Google Cloud Vertex AI to act as your managed large language model (LLM) inference provider.
+
+The underlying {lcs-short} service connects to Vertex AI to access hosted Gemini models. This integration provides {ls-short} with enterprise-grade language processing and chat assistance capabilities without requiring you to maintain a local inference server.
+
+.Additional resources
+* link:https://cloud.google.com/vertex-ai/docs[Vertex AI documentation]
+

modules/shared/con-vertex-ai-integration-for-scalable-model-deployment.adoc

Lines changed: 0 additions & 12 deletions
This file was deleted.

modules/shared/proc-configure-by-using-the-operator.adoc

Lines changed: 1 addition & 1 deletion
@@ -40,7 +40,7 @@ stringData:
   VLLM_API_KEY: "<api_key>"
   ENABLE_VALIDATION: "true"
   VALIDATION_PROVIDER: "vllm"
-  VALIDATION_MODEL_NAME: "gpt-4o-mini"
+  VALIDATION_MODEL_NAME: "llama3.1"
 ----

 . Map your secret inside the `extraEnvs` section of the {backstage} CR to complete container provisioning:

modules/shared/proc-customize-chat-history-storage.adoc

Lines changed: 1 addition & 1 deletion
@@ -25,7 +25,7 @@ Storing chat history records user prompts and responses. You must assess data pr
 [source,yaml]
 ----
 conversation_cache:
-  type: postgres
+  type: "postgres"
   postgres:
     host: _<your_database_host>_
     port: _<your_database_port>_

modules/shared/proc-mirror-images-for-air-gapped-environments.adoc

Lines changed: 1 addition & 0 deletions
@@ -14,6 +14,7 @@ You must mirror the following {ls-short} images:
 .Prerequisites
 * You have a target mirror registry accessible to your disconnected cluster.
 * You authenticated to the {rhcr} and your target mirror registry.
+* You updated the cluster install secret (the `pull-secret` in the `openshift-config` namespace) to include the authentication credentials for your mirror registry. The `kubelet` requires these credentials to pull the sidecar images when starting up the {product-very-short} pod.

 .Procedure
 . Extract or identify the image digests for the {ls-short} sidecar and initialization container images.

modules/shared/snip-lightspeed-secret-keys.adoc

Lines changed: 5 additions & 0 deletions
@@ -1,5 +1,10 @@
 :_mod-docs-content-type: SNIPPET

+[IMPORTANT]
+====
+To disable an inference provider or configuration feature, you must leave the corresponding `ENABLE_*` variable completely unset. Setting an `ENABLE_*` variable to `false` does not disable the component because the underlying system checks only whether the variable is defined.
+====
+
 |===
 | Key | Description

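
The presence-only check described in the `[IMPORTANT]` note can be sketched in Python as follows. This illustrates the general pattern and is not taken from the product's source: the helper `is_enabled` is hypothetical, and `ENABLE_VALIDATION` is borrowed from the Operator secret example in this commit.

```python
import os
from typing import Mapping, Optional


def is_enabled(name: str, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when the variable is defined, whatever its value."""
    env = os.environ if env is None else env
    # Only presence matters: "false", "0", and "" all count as enabled.
    return name in env


# Setting the variable to "false" still enables the feature.
print(is_enabled("ENABLE_VALIDATION", {"ENABLE_VALIDATION": "false"}))
# Leaving the variable unset is the only way to disable it.
print(is_enabled("ENABLE_VALIDATION", {}))
```

Passing a plain mapping instead of reading `os.environ` keeps the sketch testable without touching the real environment.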
