Merge pull request #3372 from pareenaverma/content_review

pareenaverma · web-flow · commit 103e2d21c79b · 2026-06-08T11:49:01.000-04:00
Tech review llama index LP
diff --git a/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/_index.md b/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/_index.md
@@ -44,17 +44,17 @@ operatingsystems:
 
 further_reading:
     - resource:
-      title: LlamaIndex official documentation
-      link: https://docs.llamaindex.ai/en/stable/
-      type: documentation
+        title: LlamaIndex official documentation
+        link: https://docs.llamaindex.ai/en/stable/
+        type: documentation
     - resource:
-      title: LlamaIndex GitHub repository
-      link: https://github.com/run-llama/llama_index
-      type: documentation
+        title: LlamaIndex GitHub repository
+        link: https://github.com/run-llama/llama_index
+        type: documentation
     - resource:
-      title: Ollama documentation
-      link: https://ollama.com/library
-      type: documentation
+        title: Ollama documentation
+        link: https://ollama.com/library
+        type: documentation
     - resource:
         title: Introducing Google Axion Processors, our new Arm-based CPUs
         link: https://cloud.google.com/blog/products/compute/introducing-googles-new-arm-based-cpu
diff --git a/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/build-browser-rag-app.md b/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/build-browser-rag-app.md
@@ -19,17 +19,10 @@ You'll:
 - Create a FastAPI backend
 - Query documents directly from a web browser
 
-## Terminal usage
-
-You'll use:
-
-- **Terminal A** → FastAPI, file creation, and testing
-- **Terminal B** → Ollama server
-
-Leave Terminal B running throughout the rest of this Learning Path.
-
 ## Architecture
 
+The following diagram shows how the components interact. A request from the browser reaches FastAPI, which calls LlamaIndex to retrieve relevant chunks from ChromaDB and passes them to the Ollama local LLM for answer generation:
+
 ```text
 Browser UI
     ↓
@@ -42,11 +35,11 @@ ChromaDB Vector Store
 Ollama Local LLM
     ↓
 Documents
-````
+```
 
 ## Activate the Python environment
 
-Open Terminal A and activate the Python virtual environment:
+Activate the Python virtual environment:
 
 ```bash
 cd ~/llamaindex-rag
@@ -320,9 +313,13 @@ EOF
 
 ## Start the browser-based RAG application
 
-Make sure Ollama is still running in Terminal B.
+Verify that Ollama is still running before starting the application:
+
+```bash
+sudo systemctl status ollama
+```
 
-In Terminal A run:
+Activate the virtual environment and navigate to the project directory:
 
 ```bash
 cd ~/llamaindex-rag
@@ -389,13 +386,13 @@ Copy your own files into the data directory:
 cp yourfile.txt ~/llamaindex-rag/data/
 ```
 
-First stop the server and then restart FastAPI:
+Stop the running FastAPI server by pressing `Ctrl+C` in the terminal where Uvicorn is running. Then restart it:
 
 ```bash
 uvicorn api:app --host 0.0.0.0 --port 8000
 ```
 
-The application automatically indexes the new documents and makes them searchable through the browser UI.
+The `build_query_engine()` function runs each time the server starts and reads all documents from the `data/` directory. Restarting the server therefore causes LlamaIndex to ingest the new file, generate its embeddings, and store them in ChromaDB, which makes the new document searchable through the browser UI.
 
 ## What you've accomplished
 
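Note on the Ollama check in this file: `systemctl status ollama` confirms the service is active, but not that the model is available to the application. A minimal sketch of a complementary check follows, assuming Ollama listens on its default address `http://localhost:11434` and serves the model list at `/api/tags`. The helper name `ollama_has_model` is hypothetical and does not appear in the Learning Path.

```python
import json
import urllib.error
import urllib.request


def ollama_has_model(model: str, base_url: str = "http://localhost:11434") -> bool:
    """Return True if Ollama answers at base_url and lists the given model."""
    # Bypass any configured HTTP proxy, since Ollama is a local service.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(f"{base_url}/api/tags", timeout=5) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        # Server not reachable, or the response was not valid JSON.
        return False
    if not isinstance(payload, dict):
        return False
    # Each entry in "models" is expected to carry the model tag under "name".
    names = [entry.get("name", "") for entry in payload.get("models") or []]
    return model in names
```

Calling `ollama_has_model("llama3.2:1b")` from the activated virtual environment before starting Uvicorn would show whether both the service and the model pulled earlier are in place.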
diff --git a/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/firewall.md b/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/firewall.md
@@ -28,22 +28,20 @@ To configure a firewall rule for the LlamaIndex browser-based RAG application:
 
 5. Under **Protocols and ports**, select **Specified protocols and ports**.
 
-6. Select the **TCP** checkbox and enter:
+6. Select the **TCP** checkbox. Port **8000** is used by the FastAPI server that backs the browser-based LlamaIndex RAG application. Enter:
 
 ```text
 8000
-````
-
-Use port mapping **8000** for the browser-based LlamaIndex RAG application running with FastAPI.
+```
 
 ![Google Cloud Console Protocols and ports section with TCP selected and port 8000 entered#center](images/network-port.png "Setting the LlamaIndex browser application port in the firewall rule")
 
-7. Also add port 22 in **TCP** checkbox for ssh access.
+7. In the same **TCP** field, also add port `22` to allow SSH access to the VM.
 
 8. Select **Create**.
 
 ## What you've accomplished and what's next
 
-You've created a firewall rule to expose the browser-based LlamaIndex RAG application. You also enabled external access to query documents and interact with the application directly from a web browser.
+You've created a firewall rule that exposes port 8000 for the browser-based LlamaIndex RAG application and port 22 for SSH. The firewall rule uses the network tag `allow-llamaindex-port`, which you'll attach to your virtual machine in the next step.
 
-Next, you'll access the browser-based RAG application using the external IP address of your Google Cloud Axion virtual machine.
+Next, you'll create a Google Cloud Axion C4A virtual machine and connect to it using SSH.
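Note on verifying the rule described in this file: once the VM carries the `allow-llamaindex-port` tag, one way to confirm the rule is to test whether ports 8000 and 22 accept TCP connections from your local machine. The sketch below uses only the Python standard library. The helper name `port_is_open` is hypothetical and not part of the Learning Path.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

For example, calling `port_is_open(vm_ip, 8000)` and `port_is_open(vm_ip, 22)` with `vm_ip` set to the VM's external IP address checks both ports named in the rule. Port 8000 only reports as open while Uvicorn is listening on it, so a `False` result can mean either that the firewall blocked the connection or that the server is not running.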
diff --git a/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/instance.md b/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/instance.md
@@ -24,7 +24,7 @@ To create a virtual machine based on the C4A instance type in the console:
 5. Under **OS and storage**, select **Change** and then choose an Arm64-based operating system image. For this Learning Path, select **SUSE Linux Enterprise Server**.
 6. For the license type, choose **Pay as you go**.
 7. Increase **Size (GB)** from **10** to **100** to allocate sufficient disk space, and then click **Select**.
-8. Select **Networking** from column on the left
+8. Select **Networking** from the column on the left.
 9. Under **Network tags**, enter `allow-llamaindex-port` to link the VM to the firewall rule from the previous step and allow inbound access to port 8000 for the browser-based LlamaIndex RAG application and port 22 for ssh access.
 10. Select **Create** to launch the virtual machine.
 
diff --git a/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/setup-llamaindex-rag.md b/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/setup-llamaindex-rag.md
@@ -6,18 +6,16 @@ weight: 5
 layout: learningpathall
 ---
 
-## Install and Configure LlamaIndex on Google Cloud Axion
+## Prepare the environment
 
 In this section, you will prepare a Google Cloud Axion Arm64 VM for running a browser-based RAG application using LlamaIndex.
 
 You will:
 
 - Verify the VM architecture
 - Install required system packages
-- Install Docker
 - Install Python 3.11
-- Install Ollama
-- Pull a lightweight LLM model
+- Install Ollama and pull a lightweight LLM model
 - Install LlamaIndex and required Python packages
 
 
@@ -26,16 +24,11 @@ You will:
 ```text
 Cloud: Google Cloud Platform
 VM Type: C4A Axion ARM64
-OS: SUSE Linux Enterprise Server 15 SP6
+OS: SUSE Linux Enterprise Server 15 SP5
 Architecture: aarch64
 RAM: 16 GB or higher recommended
 ```
 
-## Terminal usage You'll use
-
-- **Terminal A** → setup, package installation, FastAPI, and testing
-- **Terminal B** → Ollama server Open both terminals connected to the VM before starting.
-
 ## Verify VM architecture
 
 ```bash
@@ -70,8 +63,9 @@ sudo zypper update -y
 
 This ensures your system is up to date before installing anything.
 
-## Install required packages:
-Now install Python 3.11 and other tools:
+## Install required packages
+
+Install Python 3.11 and the build tools needed to compile Python packages with native extensions:
 
 ```bash
 sudo zypper install -y \
@@ -92,7 +86,7 @@ python311-setuptools \
 python311-wheel
 ```
 
-**Verify Python:**
+Verify Python is installed correctly:
 
 ```bash
 python3.11 --version
@@ -105,15 +99,17 @@ Python 3.11.10
 pip 22.3.1 from /usr/lib/python3.11/site-packages/pip (python 3.11)
 ```
 
-## Install Docker and Add current user to Docker group
+## Install Docker
+
+Docker is installed here so that you can run containerized workloads alongside the RAG pipeline if needed. For this Learning Path, ChromaDB and Ollama run natively, but Docker is available for extended use.
 
 ```bash
 sudo zypper install -y docker
 sudo systemctl enable docker
 sudo systemctl start docker
 ```
 
-**Check Docker Add current user to Docker group:**
+Verify Docker is running and add your user to the `docker` group so you don't need `sudo` for Docker commands:
 
 ```bash
 sudo systemctl status docker
@@ -136,19 +132,21 @@ This message shows that your installation appears to be working correctly.
 
 ## Create project directory
 
+Create a project directory and a Python virtual environment. The virtual environment isolates the Python packages for this project from your system packages:
+
 ```bash
 mkdir -p ~/llamaindex-rag/data
 cd ~/llamaindex-rag
 ```
 
-**Create and Activate Python virtual environment:**
+Create and activate the Python virtual environment:
 
 ```bash
 python3.11 -m venv rag-env
 source rag-env/bin/activate
 ```
 
-**Upgrade pip:**
+Upgrade pip, setuptools, and wheel to their latest versions:
 
 ```bash
 pip install --upgrade pip setuptools wheel
@@ -160,7 +158,7 @@ pip install --upgrade pip setuptools wheel
 curl -fsSL https://ollama.com/install.sh | sh
 ```
 
-**Verify:**
+Verify the Ollama version:
 
 ```bash
 ollama -v
@@ -172,32 +170,29 @@ The output is similar to:
 ollama version is 0.24.0
 ```
 
-## Start Ollama
+## Check Ollama is running
 
-When Ollama is installed via the official script, it sets up a systemd background service and automatically starts the service. Use the following command to check the status of ollama service.
+When installed using the official script, Ollama registers itself as a systemd service and starts automatically. Verify it is running:
 
 ```bash
 sudo systemctl status ollama
 ```
 
-Leave Terminal B open and don't run any other commands in it. Ollama must stay running throughout the rest of this Learning Path.
-
-## Open a new terminal
-
-Open a second SSH terminal and run:
+If the service is not running, start it:
 
 ```bash
-cd ~/llamaindex-rag
-source rag-env/bin/activate
+sudo systemctl start ollama
 ```
 
 ## Pull an LLM model
 
+With Ollama running, pull the `llama3.2:1b` model. This is a lightweight 1-billion-parameter model suitable for local inference on a 16 GB VM:
+
 ```bash
 ollama pull llama3.2:1b
 ```
 
-**Test the model:**
+Test that the model responds correctly:
 
 ```bash
 ollama run llama3.2:1b "Explain RAG in one sentence."
@@ -206,12 +201,13 @@ ollama run llama3.2:1b "Explain RAG in one sentence."
 The output is similar to:
 
 ```output
-RAG (Resource Allocation Group) is a method of allocating resources, such as people or equipment, to tasks based on their criticality and urgency,
-prioritizing high-priority tasks that have significant consequences if not completed on time.
+Retrieval-Augmented Generation (RAG) is a technique that combines a retrieval step, which fetches relevant documents from a knowledge base, with a generation step, where a large language model uses those documents to produce a grounded, context-aware response.
 ```
 
 ## Install LlamaIndex packages
 
+Install the LlamaIndex core library along with the integrations needed for Ollama, HuggingFace embeddings, and ChromaDB. FastAPI and Uvicorn are also installed here because the browser-based application you'll build in the next section uses them as the web server:
+
 ```bash
 pip install llama-index
 pip install llama-index-llms-ollama