Add explicit docs for consuming OKP

bparees · bparees · commit 539c7f045198 · 2026-03-17T10:14:49.000-04:00
diff --git a/docs/okp_guide.md b/docs/okp_guide.md
@@ -0,0 +1,196 @@
+# OKP Deployment and Configuration Guide
+
+This document explains how to deploy the Offline Knowledge Portal (OKP) as a RAG source and configure Lightspeed Stack and Llama Stack to use it. You will:
+
+* Configure Lightspeed Stack for OKP (inline or tool RAG)
+* Run the configuration enrichment step
+* Deploy and verify the OKP Solr service
+* Install dependencies and launch Lightspeed Stack
+* Confirm the end-to-end stack with a sample query
+
+For general RAG concepts, BYOK vector stores, and manual Llama Stack configuration, see the [RAG Configuration Guide](rag_guide.md).
+
+---
+
+## Table of Contents
+
+* [Introduction](#introduction)
+* [Prerequisites](#prerequisites)
+* [Step 1: Configure Lightspeed Stack for OKP](#step-1-configure-lightspeed-stack-for-okp)
+* [Step 2: Run Configuration Enrichment](#step-2-run-configuration-enrichment)
+* [Step 3: Launch OKP](#step-3-launch-okp)
+* [Step 4: Install Lightspeed Stack Dependencies](#step-4-install-lightspeed-stack-dependencies)
+* [Step 5: Set Up Llama Stack Config Environment Variables](#step-5-set-up-llama-stack-config-environment-variables)
+* [Step 6: Launch Lightspeed Stack](#step-6-launch-lightspeed-stack)
+* [Step 7: Verify the Stack](#step-7-verify-the-stack)
+
+---
+
+## Introduction
+
+OKP (Offline Knowledge Portal) provides a Solr-backed RAG source that Lightspeed Stack can use for both **Inline RAG** (context injected before the LLM request) and **Tool RAG** (context retrieved on demand via the `file_search` tool). This guide walks through deploying the OKP container, enriching your Llama Stack config from Lightspeed Stack settings, and validating that queries return referenced chunks.
+
+---
+
+## Prerequisites
+
+* Lightspeed Stack repository cloned and ready to run
+* [lightspeed-providers](https://github.com/lightspeed-core/lightspeed-providers) repository cloned
+* A base Llama Stack config file (e.g. `run.yaml`) that defines inference and other providers you need
+* [Podman](https://podman.io/) (or Docker) to run the OKP image
+* [uv](https://docs.astral.sh/uv/) for Python dependency management
+* An OpenAI API key (for inference when using OpenAI in your run config)
+
+---
+
+## Step 1: Configure Lightspeed Stack for OKP
+
+Edit your Lightspeed Stack config file (e.g. `lightspeed-stack.yaml`) and add the following top-level sections so that OKP is used for either inline or tool RAG:
+
+Inline RAG:
+```yaml
+# RAG configuration
+rag:
+  inline:
+  - okp
+okp:
+  offline: true
+  chunk_filter_query: "is_chunk:true"
+```
+
+Tool RAG:
+```yaml
+# RAG configuration
+rag:
+  tool:
+  - okp
+okp:
+  offline: true
+  chunk_filter_query: "is_chunk:true"
+```
+
+* **`rag.inline`** and **`rag.tool`**: List `okp` under `rag.inline` to enable it for inline context injection, under `rag.tool` to enable it for the RAG tool, or under both.
+* **`okp.offline`**: When `true`, source URLs use `parent_id` (offline/Mimir-style). When `false`, they use `reference_url` (online).
+* **`okp.chunk_filter_query`**: Solr filter query applied to every OKP search (e.g. `"is_chunk:true"` to restrict to chunk documents). You can extend this with Solr boolean syntax (see [RAG Guide - Query Filtering](rag_guide.md#query-filtering)).
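+
+To confirm which RAG modes actually list `okp` after editing the config, a rough text check along these lines can help. This is only a sketch: `okp_rag_modes` is a hypothetical helper name, and it assumes the block-style layout shown in the examples for this step (two-space indentation, one list item per line). It is not a YAML parser.
+
+```bash
+# Rough check: print each RAG mode ("inline" or "tool") that lists "okp".
+# Assumes the block-style YAML layout used in this guide; this is a text
+# match with awk, not a real YAML parser.
+okp_rag_modes() {
+  awk '
+    /^rag:/                        { in_rag = 1; next }
+    /^[^ #]/                       { in_rag = 0 }
+    in_rag && /^  (inline|tool):/  { mode = $1; sub(/:$/, "", mode); next }
+    in_rag && /^  - okp[ ]*$/      { print mode }
+  ' "$1"
+}
+```
+
+Calling `okp_rag_modes lightspeed-stack.yaml` prints one mode per line. Empty output means `okp` was not found under `rag` in the expected layout.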
+
+---
+
+## Step 2: Run Configuration Enrichment
+
+Enrich your base Llama Stack config with OKP (and any BYOK) settings from `lightspeed-stack.yaml` using the configuration script:
+
+```bash
+uv run src/llama_stack_configuration.py -c lightspeed-stack.yaml -i run.yaml -o run_enriched_new.yaml
+```
+
+* **`-c`**: Lightspeed Stack config (e.g. `lightspeed-stack.yaml`)
+* **`-i`**: Input Llama Stack config (e.g. `run.yaml`)
+* **`-o`**: Output enriched config (e.g. `run_enriched_new.yaml`)
+
+Use the **output file** (e.g. `run_enriched_new.yaml`) when starting Llama Stack / Lightspeed Stack so that OKP vector IO and related resources are registered.
+
+> [!TIP]
+> Re-run this command whenever you change RAG or OKP settings in `lightspeed-stack.yaml` so that the enriched run file stays in sync.
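+
+One way to keep the enriched file in sync is to compare file modification times before launching. The sketch below assumes the file names used in this step; `needs_enrichment` and `enrich_if_stale` are hypothetical helper names, not part of Lightspeed Stack.
+
+```bash
+# Succeeds (exit 0) when the enriched file is missing or older than either input.
+needs_enrichment() {
+  local config="$1" input="$2" output="$3"
+  [ ! -e "$output" ] || [ "$config" -nt "$output" ] || [ "$input" -nt "$output" ]
+}
+
+# Re-run the enrichment command from this step only when needed.
+enrich_if_stale() {
+  local config="${1:-lightspeed-stack.yaml}"
+  local input="${2:-run.yaml}"
+  local output="${3:-run_enriched_new.yaml}"
+  if needs_enrichment "$config" "$input" "$output"; then
+    uv run src/llama_stack_configuration.py -c "$config" -i "$input" -o "$output"
+  else
+    echo "$output is up to date"
+  fi
+}
+```
+
+Running `enrich_if_stale` with no arguments uses the default file names from this guide. A timestamp comparison is only a heuristic, so re-run the enrichment command directly if in doubt.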
+
+---
+
+## Step 3: Launch OKP
+
+Start the OKP RAG service with Podman:
+
+```bash
+podman run --rm -d -p 8983:8080 images.paas.redhat.com/offline-kbase/rhokp-rag:mar-9-2026
+```
+
+Note: Remove `-d` to run the container in the foreground.
+
+* The service listens on **port 8983** on the host (mapped from 8080 in the container).
+* Confirm it is running by opening in a browser or with `curl`:
+
+  ```bash
+  curl -s http://localhost:8983
+  ```
+
+  Or visit: **http://localhost:8983**
+
+Ensure Lightspeed Stack is configured to use this OKP endpoint (the enrichment step typically adds the correct Solr/OKP URL; if you need a different host/port, adjust the `solr_url` config value accordingly).
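+
+Because the container starts in the background, the service may not answer immediately. A small polling loop, sketched here with the hypothetical helper name `wait_for_okp`, repeats the `curl` check from this step until the endpoint responds or a timeout expires. The 60-second default and 2-second interval are arbitrary choices.
+
+```bash
+# Poll a URL until the server returns any HTTP response or the timeout expires.
+# Arguments: URL (default http://localhost:8983), timeout in seconds (default 60).
+wait_for_okp() {
+  local url="${1:-http://localhost:8983}" timeout="${2:-60}" waited=0
+  until curl -s -o /dev/null --max-time 5 "$url"; do
+    if [ "$waited" -ge "$timeout" ]; then
+      echo "OKP did not respond at $url within ${timeout}s" >&2
+      return 1
+    fi
+    sleep 2
+    waited=$((waited + 2))
+  done
+  echo "OKP is responding at $url"
+}
+```
+
+Call `wait_for_okp` right after `podman run`, passing a different URL as the first argument if you mapped another host or port.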
+
+---
+
+## Step 4: Install Lightspeed Stack Dependencies
+
+Install the project dependencies and, if needed, your custom providers:
+
+```bash
+uv sync --group dev --group llslibdev
+uv pip install -e ../lightspeed-providers # Path to lightspeed-providers repo
+```
+
+* **`uv sync`**: Installs the project along with the `dev` and `llslibdev` dependency groups so that the app and tooling run correctly.
+* **`uv pip install -e ../lightspeed-providers`**: Installs the Lightspeed Stack providers in editable mode. Adjust the path if your lightspeed-providers checkout is located elsewhere.
+
+Lightspeed Stack also relies on two environment variables, which you set in Step 5:
+
+* **`EXTERNAL_PROVIDERS_DIR`**: Path to external providers in the lightspeed-providers repo (e.g. `../lightspeed-providers/resources/external_providers`). Required for Lightspeed Stack to load OKP and other external providers.
+* **`OPENAI_API_KEY`**: Your OpenAI API key; required when your run config uses OpenAI for inference.
+
+Note: Running `uv sync` breaks the symlink created by the `uv pip install` command, so rerun the `uv pip install` command each time you rerun `uv sync`.
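+
+To avoid forgetting the second command, both can be wrapped in one helper. This is a minimal sketch; `sync_deps` is a hypothetical name, and the default path assumes the lightspeed-providers checkout sits next to the Lightspeed Stack repository.
+
+```bash
+# Run both install steps together so the editable providers install is
+# always restored after `uv sync`.
+# Argument: path to the lightspeed-providers checkout (default ../lightspeed-providers).
+sync_deps() {
+  local providers_repo="${1:-../lightspeed-providers}"
+  uv sync --group dev --group llslibdev &&
+    uv pip install -e "$providers_repo"
+}
+```
+
+Run `sync_deps` from the Lightspeed Stack repository root, or pass the providers path explicitly, e.g. `sync_deps /path/to/lightspeed-providers`.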
+
+---
+
+## Step 5: Set Up Llama Stack Config Environment Variables
+
+Set the required environment variables. `EXTERNAL_PROVIDERS_DIR` must point to the `external_providers` directory inside the [lightspeed-providers](https://github.com/lightspeed-core/lightspeed-providers/tree/main/lightspeed_stack_providers/) repository:
+
+```bash
+export EXTERNAL_PROVIDERS_DIR=../lightspeed-providers/resources/external_providers
+export OPENAI_API_KEY=<your-openai-api-key>
+```
+
+Adjust `EXTERNAL_PROVIDERS_DIR` if your lightspeed-providers repo is in a different location relative to your current working directory.
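+
+A quick pre-launch check can catch a missing key or a wrong relative path early. The sketch below uses the hypothetical helper name `check_llama_stack_env`. It treats `OPENAI_API_KEY` as mandatory, which only applies when your run config uses OpenAI for inference.
+
+```bash
+# Verify the environment variables from this step before launching.
+# Returns non-zero and prints a message for each problem found.
+check_llama_stack_env() {
+  local status=0
+  if [ -z "${OPENAI_API_KEY:-}" ]; then
+    echo "OPENAI_API_KEY is not set" >&2
+    status=1
+  fi
+  if [ ! -d "${EXTERNAL_PROVIDERS_DIR:-}" ]; then
+    echo "EXTERNAL_PROVIDERS_DIR is unset or not a directory: ${EXTERNAL_PROVIDERS_DIR:-<unset>}" >&2
+    status=1
+  fi
+  return "$status"
+}
+```
+
+Because `EXTERNAL_PROVIDERS_DIR` is a relative path in the example above, run the check from the same working directory you will launch from.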
+
+---
+
+## Step 6: Launch Lightspeed Stack
+
+Launch Lightspeed Stack with your **enriched** Llama Stack config (e.g. `run_enriched_new.yaml`) and your Lightspeed Stack config. Follow your usual run instructions, making sure they point to the enriched run file and the correct port. For example:
+
+```bash
+make run
+```
+
+---
+
+## Step 7: Verify the Stack
+
+Confirm that the full stack (Lightspeed Stack + Llama Stack + OKP) is working by sending a query and checking that the response includes referenced chunks from OKP:
+
+```bash
+curl -sX POST http://localhost:8080/v1/query \
+  -H "Content-Type: application/json" \
+  -d '{"query": "configure remote desktop using gnome"}' | jq .
+```
+
+* Adjust the URL and port if your Lightspeed Stack API is exposed elsewhere.
+* In the JSON response, look for `rag_chunks` that indicate OKP/Solr results were retrieved.
+
+Example response excerpt:
+```json
+"rag_chunks": [
+{
+    "content": "You can connect from a Red Hat Enterprise Linux client to a remote desktop server by using the\n**Connections**\napplication. The connection depends on the remote server configuration.\n**Prerequisites**\n- Desktop sharing or remote login is enabled on the server. For more information, see [Enabling desktop sharing on the server by using GNOME](#enabling-desktop-sharing-on-the-server-by-using-gnome) or [Configuring GNOME remote login](#configuring-gnome-remote-login) .\n- For desktop sharing, a user is logged in to the GNOME graphical session on the server.\n- The `gnome-connections` package is installed on the client.\n**Procedure**\n1. On the client, launch the **Connections** application.\n2. Click the + button in the top bar to open a new connection.\n4. Enter the IP address of the server.\n5. Choose the connection type based on the operating system you want to connect to: Remote Desktop Protocol (RDP) Use RDP for connecting to Windows and RHEL 10 servers. Virtual Network Computing (VNC) Use VNC for connecting to servers with RHEL 9 and previous versions.\n6. Click Connect .\n**Verification**\n1. On the client, check that you can see the shared server desktop.\n2. On the server, a screen sharing indicator appears on the right side of the top panel: You can control screen sharing in the **System** menu of the server.",
+    "source": "okp",
+    "score": 826.40784,
+    "attributes": {
+        "doc_url": "https://mimir.corp.redhat.com/documentation/en-us/red_hat_enterprise_linux/10/html-single/administering_rhel_by_using_the_gnome_desktop_environment/index",
+        "document_id": "/documentation/en-us/red_hat_enterprise_linux/10/html-single/administering_rhel_by_using_the_gnome_desktop_environment/index"
+    }
+}
+],
+```
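+
+Chunk content can be long, so it is often easier to list only the score and source URL of each OKP chunk. The following sketch filters the response with `jq`. It assumes `rag_chunks` sits at the top level of the JSON response with the fields shown in the excerpt; `okp_chunk_sources` and `query_okp` are hypothetical helper names.
+
+```bash
+# Read a query response on stdin and print "<score><TAB><doc_url>" for
+# every chunk whose source is "okp". Assumes a top-level "rag_chunks" array.
+okp_chunk_sources() {
+  jq -r '.rag_chunks[]? | select(.source == "okp") | [.score, .attributes.doc_url] | @tsv'
+}
+
+# Send a query to Lightspeed Stack and summarize the OKP chunks it returned.
+# Arguments: query text, endpoint URL (default http://localhost:8080/v1/query).
+query_okp() {
+  local query="$1" url="${2:-http://localhost:8080/v1/query}"
+  curl -s -X POST "$url" \
+    -H "Content-Type: application/json" \
+    -d "$(jq -n --arg q "$query" '{query: $q}')" | okp_chunk_sources
+}
+```
+
+For example, `query_okp "configure remote desktop using gnome"` prints one line per OKP chunk. Empty output means no chunk with `"source": "okp"` was returned.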
+
+Note: The first query may take longer than usual because the system must first download the embedding model needed for the vector search.
+
+If you see no RAG context, verify:
+
+1. OKP is up at http://localhost:8983
+2. You started Lightspeed Stack with the **enriched** run file from Step 2
+3. `lightspeed-stack.yaml` has `okp` under `rag.inline` and/or `rag.tool` as in Step 1
+
+---