first pass

anupras-mohapatra-arm · anupras-mohapatra-arm · commit fc23ef5ed8e1 · 2026-06-09T11:23:59.000-05:00
diff --git a/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/_index.md b/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/_index.md
@@ -1,9 +1,5 @@
 ---
-title: Build RAG applications with LlamaIndex on Google Cloud C4A Axion VM
-
-draft: true
-cascade:
-    draft: true
+title: Build RAG applications with LlamaIndex on a Google Cloud C4A Axion virtual machine
 
 description: Set up LlamaIndex on Google Cloud C4A Axion Arm VMs running SUSE Linux to build browser-based Retrieval-Augmented Generation (RAG) applications using local LLMs, vector databases, and FastAPI.
 
diff --git a/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/background.md b/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/background.md
@@ -13,7 +13,7 @@ The C4A series provides a cost-effective alternative to x86 virtual machines whi
 
 ## LlamaIndex for RAG and context-aware AI applications on Arm
 
-LlamaIndex is an open-source framework designed to build context-aware AI applications using Large Language Models (LLMs). It's widely used for Retrieval-Augmented Generation (RAG), document indexing, vector search, semantic retrieval, and integrating custom data sources with LLMs.
+LlamaIndex is an open-source framework designed to build context-aware AI applications using Large Language Models (LLMs). It's widely used for RAG, document indexing, vector search, semantic retrieval, and integrating custom data sources with LLMs.
 
 LlamaIndex provides a unified framework with components such as:
 
@@ -31,4 +31,4 @@ Common use cases include browser-based AI assistants, document search applicatio
 
 You've now learned about Google Axion C4A Arm-based virtual machines and their performance advantages for AI and RAG workloads. You were also introduced to core LlamaIndex components including document ingestion, indexing pipelines, query engines, vector stores, and LLM integrations.
 
-Next, you'll create a firewall rule in Google Cloud Console to enable remote access to the browser-based LlamaIndex RAG application used in this Learning Path.
+Next, you'll create a firewall rule in Google Cloud Console to enable remote access to the browser-based LlamaIndex RAG application that you'll create in this Learning Path.
diff --git a/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/build-browser-rag-app.md b/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/build-browser-rag-app.md
@@ -1,14 +1,14 @@
 ---
-title: Build a Browser-Based RAG Application with LlamaIndex
+title: Build and test a browser-based RAG application with LlamaIndex
 weight: 6
 
 ### FIXED, DO NOT MODIFY
 layout: learningpathall
 ---
 
-## Build a Browser-Based RAG Application with LlamaIndex
+## Build a browser-based RAG application
 
-In this section, you'll build and test a browser-based Retrieval-Augmented Generation (RAG) application using LlamaIndex on Google Cloud Axion Arm64.
+In this section, you'll build and test a browser-based Retrieval-Augmented Generation (RAG) application using LlamaIndex.
 
 You'll:
 
@@ -19,9 +19,9 @@ You'll:
 - Create a FastAPI backend
 - Query documents directly from a web browser
 
-## Architecture
+### Application architecture
 
-The following diagram shows how the components interact. A request from the browser reaches FastAPI, which calls LlamaIndex to retrieve relevant chunks from ChromaDB and passes them to the Ollama local LLM for answer generation:
+The following flow shows how the application components interact. A request from the browser reaches FastAPI, which calls LlamaIndex to retrieve relevant chunks from ChromaDB and passes them to the Ollama local LLM for answer generation:
 
 ```text
 Browser UI
@@ -37,7 +37,7 @@ Ollama Local LLM
 Documents
 ```
 
-## Activate the Python environment
+### Activate the Python environment
 
 Activate the Python virtual environment:
 
@@ -46,7 +46,7 @@ cd ~/llamaindex-rag
 source rag-env/bin/activate
 ```
 
-## Create sample documents
+### Create sample documents
 
 Create the first document:
 
@@ -72,7 +72,7 @@ LlamaIndex is a framework for building context-aware LLM applications using inde
 EOF
 ```
 
-## Create the RAG engine
+### Create the RAG engine
 
 Create the main LlamaIndex application:
 
@@ -151,7 +151,7 @@ def build_query_engine():
 EOF
 ```
 
-## Create browser UI
+### Create browser UI
 
 Create a browser-based interface for asking questions:
 
@@ -268,7 +268,7 @@ async function askQuestion() {
 EOF
 ```
 
-## Create FastAPI backend
+### Create FastAPI backend
 
 Create the FastAPI backend application:
 
@@ -341,8 +341,11 @@ INFO:     Application startup complete.
 INFO:     Uvicorn running on http://0.0.0.0:8000
 ```
 
+## Test the browser-based RAG application
 
-## Open browser application
+After starting the application, open its UI in a browser and run a few test queries to confirm that it works.
+
+### Open browser application UI
 
 Open a browser and navigate to:
 
@@ -354,7 +357,7 @@ This opens the browser-based RAG application UI.
 
 ![Browser-based RAG application showing a question input box and generated response using LlamaIndex and Ollama#center](images/rag-browser.png "Browser-based LlamaIndex RAG application")
 
-## Test browser-based Q&A
+### Test browser-based Q&A
 
 Ask the following questions in the browser UI:
 
@@ -380,7 +383,9 @@ The answers will appear directly in the browser interface.
 
 ## Add your own documents
 
-Copy your own files into the data directory:
+After confirming that the application works, you can try adding your own documents.
+
+Copy your own files into the data directory. For example:
 
 ```bash
 cp yourfile.txt ~/llamaindex-rag/data/
@@ -397,3 +402,5 @@ The `build_query_engine()` function runs on startup and reads all documents from
 ## What you've accomplished
 
 You've successfully built a browser-based RAG application using LlamaIndex on a Google Cloud Axion Arm64 VM. You created sample documents, generated embeddings using HuggingFace models, stored vectors in ChromaDB, exposed the backend using FastAPI, and queried custom documents directly from a browser using Ollama.
+
+You can extend this workflow for your own LlamaIndex RAG applications on Arm-based cloud infrastructure.
diff --git a/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/firewall.md b/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/firewall.md
@@ -10,24 +10,21 @@ layout: learningpathall
 
 Create a firewall rule in Google Cloud Console to expose the required port for the browser-based LlamaIndex RAG application.
 
-## Configure the firewall rule in Google Cloud Console
+### Configure the firewall rule in Google Cloud Console
 
-To configure a firewall rule for the LlamaIndex browser-based RAG application:
+To configure a firewall rule:
 
-1. Navigate to the [Google Cloud Console](https://console.cloud.google.com/), go to **VPC Network > Firewall**, and select **Create firewall rule**.
+1. Navigate to the [Google Cloud Console](https://console.cloud.google.com/).
+2. Go to **VPC Network > Firewall**, and select **Create firewall rule**.
 
 ![Google Cloud Console VPC Network Firewall page showing the Create firewall rule button in the top menu bar#center](images/firewall-rule.png "Create a firewall rule in Google Cloud Console")
 
-2. Create a firewall rule that exposes the port required for the LlamaIndex browser application.
-
 3. Set **Name** to `allow-llamaindex-port`, then select the network you want to bind to your virtual machine.
-
 4. Set **Direction of traffic** to **Ingress**, set **Action on match** to **Allow**, set **Targets** to **All instances in the network**, and set **Source IPv4 ranges** to **0.0.0.0/0**.
 
 ![Google Cloud Console Create firewall rule form with Name set to allow-llamaindex-port and Direction of traffic set to Ingress#center](images/network-rule.png "Configuring the allow-llamaindex-port firewall rule")
 
 5. Under **Protocols and ports**, select **Specified protocols and ports**.
-
 6. Select the **TCP** checkbox. Port **8000** is used by the FastAPI server that backs the browser-based LlamaIndex RAG application. Enter:
 
 ```text
@@ -37,11 +34,10 @@ To configure a firewall rule for the LlamaIndex browser-based RAG application:
 ![Google Cloud Console Protocols and ports section with TCP selected and port 8000 entered#center](images/network-port.png "Setting the LlamaIndex browser application port in the firewall rule")
 
 7. In the same **TCP** field, also add port `22` to allow SSH access to the VM.
-
 8. Select **Create**.
 
 ## What you've accomplished and what's next
 
-You've created a firewall rule that exposes port 8000 for the browser-based LlamaIndex RAG application and port 22 for SSH. The firewall rule uses the network tag `allow-llamaindex-port`, which you'll attach to your virtual machine in the next step.
+You've now created a firewall rule named `allow-llamaindex-port` that exposes port 8000 for the browser-based LlamaIndex RAG application and port 22 for SSH. You'll enter the same name as a network tag on your virtual machine in the next section.
 
 Next, you'll create a Google Cloud Axion C4A virtual machine and connect to it using SSH.
diff --git a/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/instance.md b/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/instance.md
@@ -8,9 +8,9 @@ layout: learningpathall
 
 ## Set up the virtual machine
 
-In this section, you'll create a Google Axion C4A Arm-based virtual machine (VM) on Google Cloud Platform (GCP). You'll use the `c4a-standard-4` machine type, which provides 4 vCPUs and 16 GB of memory. This VM will host your browser-based LlamaIndex RAG application.
+In this section, you'll create a Google Axion C4A Arm-based virtual machine (VM). You'll use the `c4a-standard-4` machine type, which provides 4 vCPUs and 16 GB of memory. This VM will host your browser-based LlamaIndex RAG application.
 
-## Configure the C4A virtual machine in Google Cloud Console
+### Configure the C4A virtual machine in Google Cloud Console
 
 To create a virtual machine based on the C4A instance type in the console:
 
@@ -25,7 +25,7 @@ To create a virtual machine based on the C4A instance type in the console:
 6. For the license type, choose **Pay as you go**.
 7. Increase **Size (GB)** from **10** to **100** to allocate sufficient disk space, and then click **Select**.
 8. Select **Networking** from the column on the left.
-9. Under **Network tags**, enter `allow-llamaindex-port` to link the VM to the firewall rule from the previous step and allow inbound access to port 8000 for the browser-based LlamaIndex RAG application and port 22 for ssh access.
+9. Under **Network tags**, enter `allow-llamaindex-port` to link the VM to the firewall rule from the previous section and allow inbound access to port `8000` for the browser-based LlamaIndex RAG application and port `22` for SSH access.
 10. Select **Create** to launch the virtual machine.
 
 After the instance starts, select **SSH** next to the VM in the instance list to open a browser-based terminal session.
diff --git a/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/setup-llamaindex-rag.md b/content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/setup-llamaindex-rag.md
@@ -1,5 +1,5 @@
 ---
-title: Install and Configure LlamaIndex on Google Cloud Axion
+title: Install and configure LlamaIndex on Google Cloud Axion
 weight: 5
 
 ### FIXED, DO NOT MODIFY
@@ -8,52 +8,17 @@ layout: learningpathall
 
 ## Prepare the environment
 
-In this section, you will prepare a Google Cloud Axion Arm64 VM for running a browser-based RAG application using LlamaIndex.
+In this section, you'll prepare a Google Cloud Axion Arm64 VM for running a browser-based RAG application using LlamaIndex.
 
-You will:
+You'll install:
 
-- Verify the VM architecture
-- Install required system packages
-- Install Python 3.11
-- Install Ollama and pull a lightweight LLM model
-- Install LlamaIndex and required Python packages
+- Required system packages
+- Python 3.11
+- Ollama
+- LlamaIndex and required Python packages
 
+### Update the virtual machine
 
-## Target environment
-
-```text
-Cloud: Google Cloud Platform
-VM Type: C4A Axion ARM64
-OS: SUSE Linux Enterprise Server 15 SP5
-Architecture: aarch64
-RAM: 16 GB or higher recommended
-```
-
-## Verify VM architecture
-
-```bash
-uname -m
-cat /etc/os-release
-```
-
-The output is similar to:
-
-```output
-aarch64
-NAME="SLES"
-VERSION="15-SP5"
-VERSION_ID="15.5"
-PRETTY_NAME="SUSE Linux Enterprise Server 15 SP5"
-ID="sles"
-ID_LIKE="suse"
-ANSI_COLOR="0;32"
-CPE_NAME="cpe:/o:suse:sles:15:sp5"
-DOCUMENTATION_URL="https://documentation.suse.com/"
-```
-
-This confirms you are on an Arm-based VM.
-
-## Update the VM
 Update all system packages:
 
 ```bash
@@ -63,7 +28,7 @@ sudo zypper update -y
 
 This ensures your system is up to date before installing anything.
 
-## Install required packages
+### Install required packages
 
 Install Python 3.11 and the build tools needed to compile Python packages with native extensions:
 
@@ -99,9 +64,9 @@ Python 3.11.10
 pip 22.3.1 from /usr/lib/python3.11/site-packages/pip (python 3.11)
 ```
 
-## Install Docker
+### (Optional) Install Docker
 
-Docker is installed here so that you can run containerized workloads alongside the RAG pipeline if needed. For this Learning Path, ChromaDB and Ollama run natively, but Docker is available for extended use.
+For this Learning Path, ChromaDB and Ollama run natively, so Docker isn't required. If you want to run containerized workloads alongside the RAG pipeline, you can install Docker:
 
 ```bash
 sudo zypper install -y docker
@@ -117,7 +82,7 @@ sudo usermod -aG docker $USER
 newgrp docker
 ```
 
-**Test Docker:**
+Test Docker:
 
 ```bash
 docker run hello-world
@@ -130,7 +95,7 @@ Hello from Docker!
 This message shows that your installation appears to be working correctly.
 ```
 
-## Create project directory
+### Create project directory
 
 Create a project directory and a Python virtual environment. The virtual environment isolates the Python packages for this project from your system packages:
 
@@ -152,7 +117,9 @@ Upgrade pip to the latest version:
 pip install --upgrade pip setuptools wheel
 ```
 
-## Install Ollama
+### Install Ollama
+
+Install Ollama using the official install script:
 
 ```bash
 curl -fsSL https://ollama.com/install.sh | sh
@@ -170,7 +137,7 @@ The output is similar to:
 ollama version is 0.24.0
 ```
 
-## Check Ollama is running
+### Check Ollama is running
 
 When installed using the official script, Ollama registers itself as a systemd service and starts automatically. Verify it is running:
 
@@ -184,7 +151,7 @@ If the service is not running, start it:
 sudo systemctl start ollama
 ```
 
-## Pull an LLM model
+### Pull an LLM model
 
 With Ollama running, pull the `llama3.2:1b` model. This is a lightweight 1-billion parameter model suitable for local inference on a 16 GB VM:
 
@@ -204,9 +171,9 @@ The output is similar to:
 Retrieval-Augmented Generation (RAG) is a technique that combines a retrieval step, which fetches relevant documents from a knowledge base, with a generation step, where a large language model uses those documents to produce a grounded, context-aware response.
 ```
 
-## Install LlamaIndex packages
+### Install LlamaIndex packages
 
-Install the LlamaIndex core library along with the integrations needed for Ollama, HuggingFace embeddings, and ChromaDB. FastAPI and Uvicorn are also installed here because the browser-based application you'll build in the next section uses them as the web server:
+Install the LlamaIndex core library along with the integrations needed for Ollama, HuggingFace embeddings, and ChromaDB. You'll also install FastAPI and Uvicorn here because the browser-based application you'll build in the next section uses them as its web framework and server:
 
 ```bash
 pip install llama-index
@@ -221,6 +188,6 @@ pip install uvicorn
 
 ## What you've accomplished and what's next
 
-You've successfully installed and configured LlamaIndex on a Google Cloud Axion Arm64 VM running SUSE Linux with Python 3.11. You installed Docker, configured Ollama for local LLM inference, and prepared the environment for building browser-based RAG applications using LlamaIndex and ChromaDB.
+You've now successfully installed and configured LlamaIndex on a Google Cloud Axion Arm64 VM running SUSE Linux with Python 3.11. You optionally installed Docker, configured Ollama for local LLM inference, and prepared the environment for building browser-based RAG applications using LlamaIndex and ChromaDB.
 
 Next, you'll build the RAG engine, create the browser UI, and query custom documents using a local large language model.