Commit fc23ef5

first pass
1 parent 8d7a85e commit fc23ef5

6 files changed

Lines changed: 52 additions & 86 deletions

content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/_index.md

Lines changed: 1 addition & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -1,9 +1,5 @@
11
---
2-
title: Build RAG applications with LlamaIndex on Google Cloud C4A Axion VM
3-
4-
draft: true
5-
cascade:
6-
draft: true
2+
title: Build RAG applications with LlamaIndex on a Google Cloud C4A Axion virtual machine
73

84
description: Set up LlamaIndex on Google Cloud C4A Axion Arm VMs running SUSE Linux to build browser-based Retrieval-Augmented Generation (RAG) applications using local LLMs, vector databases, and FastAPI.
95

content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/background.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -13,7 +13,7 @@ The C4A series provides a cost-effective alternative to x86 virtual machines whi
1313

1414
## LlamaIndex for RAG and context-aware AI applications on Arm
1515

16-
LlamaIndex is an open-source framework designed to build context-aware AI applications using Large Language Models (LLMs). It's widely used for Retrieval-Augmented Generation (RAG), document indexing, vector search, semantic retrieval, and integrating custom data sources with LLMs.
16+
LlamaIndex is an open-source framework designed to build context-aware AI applications using Large Language Models (LLMs). It's widely used for RAG, document indexing, vector search, semantic retrieval, and integrating custom data sources with LLMs.
1717

1818
LlamaIndex provides a unified framework with components such as:
1919

@@ -31,4 +31,4 @@ Common use cases include browser-based AI assistants, document search applicatio
3131

3232
You've now learned about Google Axion C4A Arm-based virtual machines and their performance advantages for AI and RAG workloads. You were also introduced to core LlamaIndex components including document ingestion, indexing pipelines, query engines, vector stores, and LLM integrations.
3333

34-
Next, you'll create a firewall rule in Google Cloud Console to enable remote access to the browser-based LlamaIndex RAG application used in this Learning Path.
34+
Next, you'll create a firewall rule in Google Cloud Console to enable remote access to the browser-based LlamaIndex RAG application that you'll create in this Learning Path.

content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/build-browser-rag-app.md

Lines changed: 20 additions & 13 deletions
Original file line numberDiff line numberDiff line change
@@ -1,14 +1,14 @@
11
---
2-
title: Build a Browser-Based RAG Application with LlamaIndex
2+
title: Build and test a browser-based RAG application with LlamaIndex
33
weight: 6
44

55
### FIXED, DO NOT MODIFY
66
layout: learningpathall
77
---
88

9-
## Build a Browser-Based RAG Application with LlamaIndex
9+
## Build a browser-based RAG application
1010

11-
In this section, you'll build and test a browser-based Retrieval-Augmented Generation (RAG) application using LlamaIndex on Google Cloud Axion Arm64.
11+
In this section, you'll build and test a browser-based Retrieval-Augmented Generation (RAG) application using LlamaIndex.
1212

1313
You'll:
1414

@@ -19,9 +19,9 @@ You'll:
1919
- Create a FastAPI backend
2020
- Query documents directly from a web browser
2121

22-
## Architecture
22+
### Application architecture
2323

24-
The following diagram shows how the components interact. A request from the browser reaches FastAPI, which calls LlamaIndex to retrieve relevant chunks from ChromaDB and passes them to the Ollama local LLM for answer generation:
24+
The following flow shows how the application components interact. A request from the browser reaches FastAPI, which calls LlamaIndex to retrieve relevant chunks from ChromaDB and passes them to the Ollama local LLM for answer generation:
2525

2626
```text
2727
Browser UI
@@ -37,7 +37,7 @@ Ollama Local LLM
3737
Documents
3838
```
3939

40-
## Activate the Python environment
40+
### Activate the Python environment
4141

4242
Activate the Python virtual environment:
4343

@@ -46,7 +46,7 @@ cd ~/llamaindex-rag
4646
source rag-env/bin/activate
4747
```
4848

49-
## Create sample documents
49+
### Create sample documents
5050

5151
Create the first document:
5252

@@ -72,7 +72,7 @@ LlamaIndex is a framework for building context-aware LLM applications using inde
7272
EOF
7373
```
7474

75-
## Create the RAG engine
75+
### Create the RAG engine
7676

7777
Create the main LlamaIndex application:
7878

@@ -151,7 +151,7 @@ def build_query_engine():
151151
EOF
152152
```
153153

154-
## Create browser UI
154+
### Create browser UI
155155

156156
Create a browser-based interface for asking questions:
157157

@@ -268,7 +268,7 @@ async function askQuestion() {
268268
EOF
269269
```
270270

271-
## Create FastAPI backend
271+
### Create FastAPI backend
272272

273273
Create the FastAPI backend application:
274274

@@ -341,8 +341,11 @@ INFO: Application startup complete.
341341
INFO: Uvicorn running on http://0.0.0.0:8000
342342
```
343343

344+
## Test the browser-based RAG application
344345

345-
## Open browser application
346+
After starting the application, open the application UI and test the application to make sure it works.
347+
348+
### Open browser application UI
346349

347350
Open a browser and navigate to:
348351

@@ -354,7 +357,7 @@ This opens the browser-based RAG application UI.
354357

355358
![Browser-based RAG application showing a question input box and generated response using LlamaIndex and Ollama#center](images/rag-browser.png "Browser-based LlamaIndex RAG application")
356359

357-
## Test browser-based Q&A
360+
### Test browser-based Q&A
358361

359362
Ask the following questions in the browser UI:
360363

@@ -380,7 +383,9 @@ The answers will appear directly in the browser interface.
380383

381384
## Add your own documents
382385

383-
Copy your own files into the data directory:
386+
After confirming that the application works, you can try adding your own documents.
387+
388+
Copy your own files into the data directory. For example:
384389

385390
```bash
386391
cp yourfile.txt ~/llamaindex-rag/data/
@@ -397,3 +402,5 @@ The `build_query_engine()` function runs on startup and reads all documents from
397402
## What you've accomplished
398403

399404
You've successfully built a browser-based RAG application using LlamaIndex on a Google Cloud Axion Arm64 VM. You created sample documents, generated embeddings using HuggingFace models, stored vectors in ChromaDB, exposed the backend using FastAPI, and queried custom documents directly from a browser using Ollama.
405+
406+
You can extend this workflow for your own LlamaIndex RAG applications on Arm-based cloud infrastructure.
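
The application architecture described in this file follows a retrieve-then-generate flow: the question is matched against stored document chunks, and the best matches are handed to the LLM as context. The sketch below illustrates only that flow, using the Python standard library. It is not the code from this commit: word-overlap scoring stands in for the embedding search that ChromaDB performs, the prompt string stands in for the call to Ollama, and the names `tokenize`, `retrieve`, and `build_prompt` are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(question, chunks, top_k=2):
    """Rank chunks by word overlap with the question.

    Assumption: this is a stand-in for vector similarity search,
    not how ChromaDB or LlamaIndex score documents.
    """
    question_counts = Counter(tokenize(question))
    scored = []
    for chunk in chunks:
        overlap = sum((question_counts & Counter(tokenize(chunk))).values())
        scored.append((overlap, chunk))
    # sort() is stable, so ties keep their original document order
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for overlap, chunk in scored[:top_k] if overlap > 0]


def build_prompt(question, context_chunks):
    """Assemble the text that would be sent to the LLM for answer generation."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    # Hypothetical sample chunks, not the documents created in the Learning Path
    sample_chunks = [
        "LlamaIndex is a framework for building context-aware LLM applications.",
        "FastAPI serves the browser UI and the query endpoint.",
        "ChromaDB stores document embeddings for retrieval.",
    ]
    question = "What is LlamaIndex?"
    print(build_prompt(question, retrieve(question, sample_chunks, top_k=1)))
```

In the real application, LlamaIndex handles both steps through its query engine, so this sketch is useful mainly for reasoning about what each component in the flow contributes.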

content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/firewall.md

Lines changed: 5 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -10,24 +10,21 @@ layout: learningpathall
1010

1111
Create a firewall rule in Google Cloud Console to expose the required port for the browser-based LlamaIndex RAG application.
1212

13-
## Configure the firewall rule in Google Cloud Console
13+
### Configure the firewall rule in Google Cloud Console
1414

15-
To configure a firewall rule for the LlamaIndex browser-based RAG application:
15+
To configure a firewall rule:
1616

17-
1. Navigate to the [Google Cloud Console](https://console.cloud.google.com/), go to **VPC Network > Firewall**, and select **Create firewall rule**.
17+
1. Navigate to the [Google Cloud Console](https://console.cloud.google.com/).
18+
2. Go to **VPC Network > Firewall**, and select **Create firewall rule**.
1819

1920
![Google Cloud Console VPC Network Firewall page showing the Create firewall rule button in the top menu bar#center](images/firewall-rule.png "Create a firewall rule in Google Cloud Console")
2021

21-
2. Create a firewall rule that exposes the port required for the LlamaIndex browser application.
22-
2322
3. Set **Name** to `allow-llamaindex-port`, then select the network you want to bind to your virtual machine.
24-
2523
4. Set **Direction of traffic** to **Ingress**, set **Action on match** to **Allow**, set **Targets** to **All instances in the network**, and set **Source IPv4 ranges** to **0.0.0.0/0**.
2624

2725
![Google Cloud Console Create firewall rule form with Name set to allow-llamaindex-port and Direction of traffic set to Ingress#center](images/network-rule.png "Configuring the allow-llamaindex-port firewall rule")
2826

2927
5. Under **Protocols and ports**, select **Specified protocols and ports**.
30-
3128
6. Select the **TCP** checkbox. Port **8000** is used by the FastAPI server that backs the browser-based LlamaIndex RAG application. Enter:
3229

3330
```text
@@ -37,11 +34,10 @@ To configure a firewall rule for the LlamaIndex browser-based RAG application:
3734
![Google Cloud Console Protocols and ports section with TCP selected and port 8000 entered#center](images/network-port.png "Setting the LlamaIndex browser application port in the firewall rule")
3835

3936
7. In the same **TCP** field, also add port `22` to allow SSH access to the VM.
40-
4137
8. Select **Create**.
4238

4339
## What you've accomplished and what's next
4440

45-
You've created a firewall rule that exposes port 8000 for the browser-based LlamaIndex RAG application and port 22 for SSH. The firewall rule uses the network tag `allow-llamaindex-port`, which you'll attach to your virtual machine in the next step.
41+
You've now created a firewall rule that exposes port 8000 for the browser-based LlamaIndex RAG application and port 22 for SSH. The firewall rule uses the network tag `allow-llamaindex-port`, which you'll attach to your virtual machine in the next section.
4642

4743
Next, you'll create a Google Cloud Axion C4A virtual machine and connect to it using SSH.
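
The firewall steps in this file open TCP ports 8000 and 22. Once a VM is running, one way to confirm that the rule took effect is to attempt a TCP connection from another machine. The following is a minimal sketch under stated assumptions: `port_is_open` is a hypothetical helper, and the host defaults to `127.0.0.1` only so that the script runs anywhere; you would pass your VM's external IP address instead.

```python
import socket
import sys


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts
        return False


if __name__ == "__main__":
    # Assumption: the first argument is the VM's external IP address
    host = sys.argv[1] if len(sys.argv) > 1 else "127.0.0.1"
    for port in (22, 8000):
        state = "open" if port_is_open(host, port) else "closed or filtered"
        print(f"{host}:{port} is {state}")
```

A closed result for port 8000 does not necessarily mean the firewall rule is wrong: the port also appears closed until the FastAPI server is listening on it.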

content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/instance.md

Lines changed: 3 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -8,9 +8,9 @@ layout: learningpathall
88

99
## Set up the virtual machine
1010

11-
In this section, you'll create a Google Axion C4A Arm-based virtual machine (VM) on Google Cloud Platform (GCP). You'll use the `c4a-standard-4` machine type, which provides 4 vCPUs and 16 GB of memory. This VM will host your browser-based LlamaIndex RAG application.
11+
In this section, you'll create a Google Axion C4A Arm-based virtual machine (VM). You'll use the `c4a-standard-4` machine type, which provides 4 vCPUs and 16 GB of memory. This VM will host your browser-based LlamaIndex RAG application.
1212

13-
## Configure the C4A virtual machine in Google Cloud Console
13+
### Configure the C4A virtual machine in Google Cloud Console
1414

1515
To create a virtual machine based on the C4A instance type in the console:
1616

@@ -25,7 +25,7 @@ To create a virtual machine based on the C4A instance type in the console:
2525
6. For the license type, choose **Pay as you go**.
2626
7. Increase **Size (GB)** from **10** to **100** to allocate sufficient disk space, and then click **Select**.
2727
8. Select **Networking** from the column on the left.
28-
9. Under **Network tags**, enter `allow-llamaindex-port` to link the VM to the firewall rule from the previous step and allow inbound access to port 8000 for the browser-based LlamaIndex RAG application and port 22 for ssh access.
28+
9. Under **Network tags**, enter `allow-llamaindex-port` to link the VM to the firewall rule from the previous section and allow inbound access to port `8000` for the browser-based LlamaIndex RAG application and port `22` for ssh access.
2929
10. Select **Create** to launch the virtual machine.
3030

3131
After the instance starts, select **SSH** next to the VM in the instance list to open a browser-based terminal session.
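
This commit removes the `uname -m` verification step from the setup file, so a reader might still want a quick way to confirm that the SSH session is on an Arm-based VM. The sketch below shows one possible check using Python's standard `platform` module. The helper name `is_arm64` and the set of accepted names are assumptions made for illustration.

```python
import platform

# Assumption: Linux reports "aarch64"; "arm64" is included for other platforms
ARM64_NAMES = {"aarch64", "arm64"}


def is_arm64(machine=None):
    """Return True if the machine name identifies a 64-bit Arm system.

    When no name is passed, the current system's name is used.
    """
    name = (machine or platform.machine()).lower()
    return name in ARM64_NAMES


if __name__ == "__main__":
    print(f"Machine: {platform.machine()}")
    print("Arm64 detected" if is_arm64() else "Not an Arm64 system")
```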

content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/setup-llamaindex-rag.md

Lines changed: 21 additions & 54 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,5 @@
11
---
2-
title: Install and Configure LlamaIndex on Google Cloud Axion
2+
title: Install and configure LlamaIndex on Google Cloud Axion
33
weight: 5
44

55
### FIXED, DO NOT MODIFY
@@ -8,52 +8,17 @@ layout: learningpathall
88

99
## Prepare the environment
1010

11-
In this section, you will prepare a Google Cloud Axion Arm64 VM for running a browser-based RAG application using LlamaIndex.
11+
In this section, you'll prepare a Google Cloud Axion Arm64 VM for running a browser-based RAG application using LlamaIndex.
1212

13-
You will:
13+
You'll install:
1414

15-
- Verify the VM architecture
16-
- Install required system packages
17-
- Install Python 3.11
18-
- Install Ollama and pull a lightweight LLM model
19-
- Install LlamaIndex and required Python packages
15+
- required system packages
16+
- Python 3.11
17+
- Ollama
18+
- LlamaIndex and required Python packages
2019

20+
### Update the virtual machine
2121

22-
## Target environment
23-
24-
```text
25-
Cloud: Google Cloud Platform
26-
VM Type: C4A Axion ARM64
27-
OS: SUSE Linux Enterprise Server 15 SP5
28-
Architecture: aarch64
29-
RAM: 16 GB or higher recommended
30-
```
31-
32-
## Verify VM architecture
33-
34-
```bash
35-
uname -m
36-
cat /etc/os-release
37-
```
38-
39-
The output is similar to:
40-
41-
```output
42-
aarch64
43-
NAME="SLES"
44-
VERSION="15-SP5"
45-
VERSION_ID="15.5"
46-
PRETTY_NAME="SUSE Linux Enterprise Server 15 SP5"
47-
ID="sles"
48-
ID_LIKE="suse"
49-
ANSI_COLOR="0;32"
50-
CPE_NAME="cpe:/o:suse:sles:15:sp5"
51-
DOCUMENTATION_URL="https://documentation.suse.com/"
52-
```
53-
54-
This confirms you are on an Arm-based VM.
55-
56-
## Update the VM
5722
Update all system packages:
5823

5924
```bash
@@ -63,7 +28,7 @@ sudo zypper update -y
6328

6429
This ensures your system is up to date before installing anything.
6530

66-
## Install required packages
31+
### Install required packages
6732

6833
Install Python 3.11 and the build tools needed to compile Python packages with native extensions:
6934

@@ -99,9 +64,9 @@ Python 3.11.10
9964
pip 22.3.1 from /usr/lib/python3.11/site-packages/pip (python 3.11)
10065
```
10166

102-
## Install Docker
67+
### (Optional) Install Docker
10368

104-
Docker is installed here so that you can run containerized workloads alongside the RAG pipeline if needed. For this Learning Path, ChromaDB and Ollama run natively, but Docker is available for extended use.
69+
For this Learning Path, ChromaDB and Ollama run natively. For extended use, you can install Docker so that you can run containerized workloads alongside the RAG pipeline if needed:
10570

10671
```bash
10772
sudo zypper install -y docker
@@ -117,7 +82,7 @@ sudo usermod -aG docker $USER
11782
newgrp docker
11883
```
11984

120-
**Test Docker:**
85+
Test Docker:
12186

12287
```bash
12388
docker run hello-world
@@ -130,7 +95,7 @@ Hello from Docker!
13095
This message shows that your installation appears to be working correctly.
13196
```
13297

133-
## Create project directory
98+
### Create project directory
13499

135100
Create a project directory and a Python virtual environment. The virtual environment isolates the Python packages for this project from your system packages:
136101

@@ -152,7 +117,9 @@ Upgrade pip to the latest version:
152117
pip install --upgrade pip setuptools wheel
153118
```
154119

155-
## Install Ollama
120+
### Install Ollama
121+
122+
Run the following command:
156123

157124
```bash
158125
curl -fsSL https://ollama.com/install.sh | sh
@@ -170,7 +137,7 @@ The output is similar to:
170137
ollama version is 0.24.0
171138
```
172139

173-
## Check Ollama is running
140+
### Check Ollama is running
174141

175142
When installed using the official script, Ollama registers itself as a systemd service and starts automatically. Verify it is running:
176143

@@ -184,7 +151,7 @@ If the service is not running, start it:
184151
sudo systemctl start ollama
185152
```
186153

187-
## Pull an LLM model
154+
### Pull an LLM model
188155

189156
With Ollama running, pull the `llama3.2:1b` model. This is a lightweight 1-billion parameter model suitable for local inference on a 16 GB VM:
190157

@@ -204,9 +171,9 @@ The output is similar to:
204171
Retrieval-Augmented Generation (RAG) is a technique that combines a retrieval step, which fetches relevant documents from a knowledge base, with a generation step, where a large language model uses those documents to produce a grounded, context-aware response.
205172
```
206173

207-
## Install LlamaIndex packages
174+
### Install LlamaIndex packages
208175

209-
Install the LlamaIndex core library along with the integrations needed for Ollama, HuggingFace embeddings, and ChromaDB. FastAPI and Uvicorn are also installed here because the browser-based application you'll build in the next section uses them as the web server:
176+
Install the LlamaIndex core library along with the integrations needed for Ollama, HuggingFace embeddings, and ChromaDB. You'll also install FastAPI and Uvicorn here because the browser-based application you'll build in the next section uses them as the web server:
210177

211178
```bash
212179
pip install llama-index
@@ -221,6 +188,6 @@ pip install uvicorn
221188

222189
## What you've accomplished and what's next
223190

224-
You've successfully installed and configured LlamaIndex on a Google Cloud Axion Arm64 VM running SUSE Linux with Python 3.11. You installed Docker, configured Ollama for local LLM inference, and prepared the environment for building browser-based RAG applications using LlamaIndex and ChromaDB.
191+
You've now successfully installed and configured LlamaIndex on a Google Cloud Axion Arm64 VM running SUSE Linux with Python 3.11. You optionally installed Docker, configured Ollama for local LLM inference, and prepared the environment for building browser-based RAG applications using LlamaIndex and ChromaDB.
225192

226193
Next, you'll build the RAG engine, create the browser UI, and query custom documents using a local large language model.
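
The setup file checks that Ollama is running and then pulls the `llama3.2:1b` model. Both conditions can also be checked from Python before starting the application. The following is a minimal sketch, assuming Ollama is listening on its default local address `http://localhost:11434` and using its `/api/tags` endpoint, which lists the models available locally. The function names are hypothetical.

```python
import json
import urllib.request


def model_names(tags_payload):
    """Extract model names from a parsed /api/tags response."""
    return [model.get("name", "") for model in tags_payload.get("models", [])]


def list_ollama_models(base_url="http://localhost:11434", timeout=5.0):
    """Return the locally available model names, or None if Ollama is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            return model_names(json.load(resp))
    except (OSError, ValueError):
        # OSError covers connection failures; ValueError covers invalid JSON
        return None


if __name__ == "__main__":
    names = list_ollama_models()
    if names is None:
        print("Ollama is not reachable; try: sudo systemctl start ollama")
    elif "llama3.2:1b" in names:
        print("Ollama is running and llama3.2:1b is available")
    else:
        print(f"Ollama is running, but llama3.2:1b was not found in: {names}")
```

Separating the parsing from the network call keeps the check easy to test without a running Ollama service.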
