Commit 103e2d2

Merge pull request #3372 from pareenaverma/content_review
Tech review llama index LP
2 parents 2e38395 + 946379f commit 103e2d2

5 files changed

Lines changed: 53 additions & 62 deletions

content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/_index.md

Lines changed: 9 additions & 9 deletions
Original file line number | Diff line number | Diff line change
@@ -44,17 +44,17 @@ operatingsystems:
4444

4545
further_reading:
4646
- resource:
47-
title: LlamaIndex official documentation
48-
link: https://docs.llamaindex.ai/en/stable/
49-
type: documentation
47+
title: LlamaIndex official documentation
48+
link: https://docs.llamaindex.ai/en/stable/
49+
type: documentation
5050
- resource:
51-
title: LlamaIndex GitHub repository
52-
link: https://github.com/run-llama/llama_index
53-
type: documentation
51+
title: LlamaIndex GitHub repository
52+
link: https://github.com/run-llama/llama_index
53+
type: documentation
5454
- resource:
55-
title: Ollama documentation
56-
link: https://ollama.com/library
57-
type: documentation
55+
title: Ollama documentation
56+
link: https://ollama.com/library
57+
type: documentation
5858
- resource:
5959
title: Introducing Google Axion Processors, our new Arm-based CPUs
6060
link: https://cloud.google.com/blog/products/compute/introducing-googles-new-arm-based-cpu

content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/build-browser-rag-app.md

Lines changed: 12 additions & 15 deletions
@@ -19,17 +19,10 @@ You'll:
1919
- Create a FastAPI backend
2020
- Query documents directly from a web browser
2121

22-
## Terminal usage
23-
24-
You'll use:
25-
26-
- **Terminal A** → FastAPI, file creation, and testing
27-
- **Terminal B** → Ollama server
28-
29-
Leave Terminal B running throughout the rest of this Learning Path.
30-
3122
## Architecture
3223

24+
The following diagram shows how the components interact. A request from the browser reaches FastAPI, which calls LlamaIndex to retrieve relevant chunks from ChromaDB and passes them to the Ollama local LLM for answer generation:
25+
3326
```text
3427
Browser UI
3528
@@ -42,11 +35,11 @@ ChromaDB Vector Store
4235
Ollama Local LLM
4336
4437
Documents
45-
````
38+
```
4639

4740
## Activate the Python environment
4841

49-
Open Terminal A and activate the Python virtual environment:
42+
Activate the Python virtual environment:
5043

5144
```bash
5245
cd ~/llamaindex-rag
@@ -320,9 +313,13 @@ EOF
320313

321314
## Start the browser-based RAG application
322315

323-
Make sure Ollama is still running in Terminal B.
316+
Verify that Ollama is still running before starting the application:
317+
318+
```bash
319+
sudo systemctl status ollama
320+
```
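The same check can also be made from Python, which is handy if the application should refuse to start when Ollama is down. The following is a minimal sketch, not part of the Learning Path: the function name `ollama_is_running` is hypothetical, and it assumes Ollama is listening on its default address `http://localhost:11434`, where the root endpoint answers with HTTP 200.

```python
import http.client
import urllib.request


def ollama_is_running(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if an HTTP server answers at base_url with status 200.

    Assumption: Ollama uses its default port 11434 on this machine.
    """
    # Ignore proxy environment variables; the target is a local service.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(base_url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, or a malformed reply.
        return False
```

Calling `ollama_is_running()` before building the query engine gives a clear failure early instead of an error on the first query. A `True` result only shows that something answers on that port; it does not confirm that the `llama3.2:1b` model has been pulled.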
324321

325-
In Terminal A run:
322+
Activate the virtual environment and navigate to the project directory:
326323

327324
```bash
328325
cd ~/llamaindex-rag
@@ -389,13 +386,13 @@ Copy your own files into the data directory:
389386
cp yourfile.txt ~/llamaindex-rag/data/
390387
```
391388

392-
First stop the server and then restart FastAPI:
389+
Stop the running FastAPI server by pressing `Ctrl+C` in the terminal where Uvicorn is running. Then restart it:
393390

394391
```bash
395392
uvicorn api:app --host 0.0.0.0 --port 8000
396393
```
397394

398-
The application automatically indexes the new documents and makes them searchable through the browser UI.
395+
The `build_query_engine()` function runs on startup and reads all documents from the `data/` directory each time the server starts. Restarting the server causes LlamaIndex to ingest the new file, generate its embeddings, and store them in ChromaDB, making the new document searchable through the browser UI.
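Why a restart is needed can be illustrated with a plain-Python stand-in. This is a sketch under stated assumptions, not the Learning Path's code: `load_documents` and `StartupIndex` are hypothetical names, only `.txt` files are read, and a simple substring match replaces the embedding search that LlamaIndex and ChromaDB perform. The point it shows is that an index built once at startup is a snapshot of the `data/` directory at that moment.

```python
from pathlib import Path


def load_documents(data_dir: str) -> dict[str, str]:
    """Read every .txt file in data_dir once and return {filename: text}."""
    return {
        path.name: path.read_text(encoding="utf-8")
        for path in sorted(Path(data_dir).glob("*.txt"))
    }


class StartupIndex:
    """Hypothetical stand-in for an index that is built once at server start."""

    def __init__(self, data_dir: str) -> None:
        # The snapshot is taken here and never refreshed afterwards.
        self.documents = load_documents(data_dir)

    def search(self, term: str) -> list[str]:
        """Return the names of documents whose text contains term."""
        needle = term.lower()
        return [
            name
            for name, text in self.documents.items()
            if needle in text.lower()
        ]
```

A file copied into the directory after `StartupIndex` is constructed does not appear in `search()` results until a new instance is created, which corresponds to restarting Uvicorn in the real application.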
399396

400397
## What you've accomplished
401398

content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/firewall.md

Lines changed: 5 additions & 7 deletions
@@ -28,22 +28,20 @@ To configure a firewall rule for the LlamaIndex browser-based RAG application:
2828

2929
5. Under **Protocols and ports**, select **Specified protocols and ports**.
3030

31-
6. Select the **TCP** checkbox and enter:
31+
6. Select the **TCP** checkbox. Port **8000** is used by the FastAPI server that backs the browser-based LlamaIndex RAG application. Enter:
3232

3333
```text
3434
8000
35-
````
36-
37-
Use port mapping **8000** for the browser-based LlamaIndex RAG application running with FastAPI.
35+
```
3836

3937
![Google Cloud Console Protocols and ports section with TCP selected and port 8000 entered#center](images/network-port.png "Setting the LlamaIndex browser application port in the firewall rule")
4038

41-
7. Also add port 22 in **TCP** checkbox for ssh access.
39+
7. In the same **TCP** field, also add port `22` to allow SSH access to the VM.
4240

4341
8. Select **Create**.
4442

4543
## What you've accomplished and what's next
4644

47-
You've created a firewall rule to expose the browser-based LlamaIndex RAG application. You also enabled external access to query documents and interact with the application directly from a web browser.
45+
You've created a firewall rule that exposes port 8000 for the browser-based LlamaIndex RAG application and port 22 for SSH. The firewall rule uses the network tag `allow-llamaindex-port`, which you'll attach to your virtual machine in the next step.
4846

49-
Next, you'll access the browser-based RAG application using the external IP address of your Google Cloud Axion virtual machine.
47+
Next, you'll create a Google Cloud Axion C4A virtual machine and connect to it using SSH.
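Later, once the VM exists and carries the network tag, the firewall rule can be checked from your local machine with a plain TCP connection test. This is a minimal sketch using only the Python standard library; the function name `port_is_open` is hypothetical and the host you pass in is the external IP address of your own VM.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, timed out, or unreachable.
        return False
```

For example, `port_is_open("<EXTERNAL_IP>", 22)` tests SSH and `port_is_open("<EXTERNAL_IP>", 8000)` tests the application port. A `False` result for port 8000 does not by itself mean the firewall rule is wrong, because the connection also fails when the FastAPI server is not yet running on the VM.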

content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/instance.md

Lines changed: 1 addition & 1 deletion
@@ -24,7 +24,7 @@ To create a virtual machine based on the C4A instance type in the console:
2424
5. Under **OS and storage**, select **Change** and then choose an Arm64-based operating system image. For this Learning Path, select **SUSE Linux Enterprise Server**.
2525
6. For the license type, choose **Pay as you go**.
2626
7. Increase **Size (GB)** from **10** to **100** to allocate sufficient disk space, and then click **Select**.
27-
8. Select **Networking** from column on the left
27+
8. Select **Networking** from the column on the left.
2828
9. Under **Network tags**, enter `allow-llamaindex-port` to link the VM to the firewall rule from the previous step and allow inbound access to port 8000 for the browser-based LlamaIndex RAG application and port 22 for ssh access.
2929
10. Select **Create** to launch the virtual machine.
3030

content/learning-paths/servers-and-cloud-computing/llamaindex-rag-axion/setup-llamaindex-rag.md

Lines changed: 26 additions & 30 deletions
@@ -6,18 +6,16 @@ weight: 5
66
layout: learningpathall
77
---
88

9-
## Install and Configure LlamaIndex on Google Cloud Axion
9+
## Prepare the environment
1010

1111
In this section, you will prepare a Google Cloud Axion Arm64 VM for running a browser-based RAG application using LlamaIndex.
1212

1313
You will:
1414

1515
- Verify the VM architecture
1616
- Install required system packages
17-
- Install Docker
1817
- Install Python 3.11
19-
- Install Ollama
20-
- Pull a lightweight LLM model
18+
- Install Ollama and pull a lightweight LLM model
2119
- Install LlamaIndex and required Python packages
2220

2321

@@ -26,16 +24,11 @@ You will:
2624
```text
2725
Cloud: Google Cloud Platform
2826
VM Type: C4A Axion ARM64
29-
OS: SUSE Linux Enterprise Server 15 SP6
27+
OS: SUSE Linux Enterprise Server 15 SP5
3028
Architecture: aarch64
3129
RAM: 16 GB or higher recommended
3230
```
3331

34-
## Terminal usage You'll use
35-
36-
- **Terminal A** → setup, package installation, FastAPI, and testing
37-
- **Terminal B** → Ollama server Open both terminals connected to the VM before starting.
38-
3932
## Verify VM architecture
4033

4134
```bash
@@ -70,8 +63,9 @@ sudo zypper update -y
7063

7164
This ensures your system is up to date before installing anything.
7265

73-
## Install required packages:
74-
Now install Python 3.11 and other tools:
66+
## Install required packages
67+
68+
Install Python 3.11 and the build tools needed to compile Python packages with native extensions:
7569

7670
```bash
7771
sudo zypper install -y \
@@ -92,7 +86,7 @@ python311-setuptools \
9286
python311-wheel
9387
```
9488

95-
**Verify Python:**
89+
Verify Python is installed correctly:
9690

9791
```bash
9892
python3.11 --version
@@ -105,15 +99,17 @@ Python 3.11.10
10599
pip 22.3.1 from /usr/lib/python3.11/site-packages/pip (python 3.11)
106100
```
107101

108-
## Install Docker and Add current user to Docker group
102+
## Install Docker
103+
104+
Docker is installed here so that you can run containerized workloads alongside the RAG pipeline if needed. For this Learning Path, ChromaDB and Ollama run natively, but Docker is available for extended use.
109105

110106
```bash
111107
sudo zypper install -y docker
112108
sudo systemctl enable docker
113109
sudo systemctl start docker
114110
```
115111

116-
**Check Docker Add current user to Docker group:**
112+
Verify Docker is running and add your user to the `docker` group so you don't need `sudo` for Docker commands:
117113

118114
```bash
119115
sudo systemctl status docker
@@ -136,19 +132,21 @@ This message shows that your installation appears to be working correctly.
136132

137133
## Create project directory
138134

135+
Create a project directory and a Python virtual environment. The virtual environment isolates the Python packages for this project from your system packages:
136+
139137
```bash
140138
mkdir -p ~/llamaindex-rag/data
141139
cd ~/llamaindex-rag
142140
```
143141

144-
**Create and Activate Python virtual environment:**
142+
Create and activate the Python virtual environment:
145143

146144
```bash
147145
python3.11 -m venv rag-env
148146
source rag-env/bin/activate
149147
```
150148

151-
**Upgrade pip:**
149+
Upgrade pip to the latest version:
152150

153151
```bash
154152
pip install --upgrade pip setuptools wheel
@@ -160,7 +158,7 @@ pip install --upgrade pip setuptools wheel
160158
curl -fsSL https://ollama.com/install.sh | sh
161159
```
162160

163-
**Verify:**
161+
Verify the Ollama version:
164162

165163
```bash
166164
ollama -v
@@ -172,32 +170,29 @@ The output is similar to:
172170
ollama version is 0.24.0
173171
```
174172

175-
## Start Ollama
173+
## Check Ollama is running
176174

177-
When Ollama is installed via the official script, it sets up a systemd background service and automatically starts the service. Use the following command to check the status of ollama service.
175+
When installed using the official script, Ollama registers itself as a systemd service and starts automatically. Verify it is running:
178176

179177
```bash
180178
sudo systemctl status ollama
181179
```
182180

183-
Leave Terminal B open and don't run any other commands in it. Ollama must stay running throughout the rest of this Learning Path.
184-
185-
## Open a new terminal
186-
187-
Open a second SSH terminal and run:
181+
If the service is not running, start it:
188182

189183
```bash
190-
cd ~/llamaindex-rag
191-
source rag-env/bin/activate
184+
sudo systemctl start ollama
192185
```
193186

194187
## Pull an LLM model
195188

189+
With Ollama running, pull the `llama3.2:1b` model. This is a lightweight 1-billion parameter model suitable for local inference on a 16 GB VM:
190+
196191
```bash
197192
ollama pull llama3.2:1b
198193
```
199194

200-
**Test the model:**
195+
Test that the model responds correctly:
201196

202197
```bash
203198
ollama run llama3.2:1b "Explain RAG in one sentence."
@@ -206,12 +201,13 @@ ollama run llama3.2:1b "Explain RAG in one sentence."
206201
The output is similar to:
207202

208203
```output
209-
RAG (Resource Allocation Group) is a method of allocating resources, such as people or equipment, to tasks based on their criticality and urgency,
210-
prioritizing high-priority tasks that have significant consequences if not completed on time.
204+
Retrieval-Augmented Generation (RAG) is a technique that combines a retrieval step, which fetches relevant documents from a knowledge base, with a generation step, where a large language model uses those documents to produce a grounded, context-aware response.
211205
```
212206

213207
## Install LlamaIndex packages
214208

209+
Install the LlamaIndex core library along with the integrations needed for Ollama, HuggingFace embeddings, and ChromaDB. FastAPI and Uvicorn are also installed here because the browser-based application you'll build in the next section uses them as the web server:
210+
215211
```bash
216212
pip install llama-index
217213
pip install llama-index-llms-ollama
