
PathRAG Agent with BigQuery Graph

This project demonstrates how to implement a PathRAG (Path-based Retrieval Augmented Generation) agent using the Agent Development Kit (ADK) with Google Cloud BigQuery as the storage backend.

It leverages the PathRAG library with the pathrag-bigquery storage plugin and LiteLLM for Gemini model integration.

Architecture

Image Source: "PathRAG: Pruning Graph-based Retrieval Augmented Generation with Relational Paths"

User Query
    |
    v
ADK Agent (Gemini 2.5 Flash)
    |  tool call
    v
pathrag_tool(query)
    |
    v
PathRAG.aquery(only_need_context=True)
    |-- Keyword Extraction (LLM)
    |-- Graph Search (BigQuery Property Graph)
    |-- Vector Search (BigQuery Vector Search)
    +-- Context assembly and return
    |
    v
ADK Agent generates final answer based on context

QueryParam(only_need_context=True) skips answer generation inside PathRAG, letting the ADK Agent's LLM generate the final answer from the retrieved context.
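The split of responsibilities described above can be sketched with stand-ins. `StubRAG`, `StubQueryParam`, and `make_pathrag_tool` are hypothetical names introduced here so the example runs without the PathRAG library or BigQuery; the project's real `tools.py` may be structured differently.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class StubQueryParam:
    """Stand-in for PathRAG's QueryParam; models only the flag discussed above."""
    only_need_context: bool = False


class StubRAG:
    """Hypothetical retriever used for illustration; NOT the PathRAG class."""

    async def aquery(self, query: str, param: StubQueryParam) -> str:
        context = f"[context for: {query}]"
        if param.only_need_context:
            # Retrieval only: hand the raw context back to the caller.
            return context
        # Otherwise the retriever would also generate the answer itself.
        return f"answer generated from {context}"


def make_pathrag_tool(rag):
    """Build a tool function that always asks the retriever for context only."""

    async def pathrag_tool(query: str) -> dict:
        context = await rag.aquery(query, StubQueryParam(only_need_context=True))
        return {"status": "success", "context": context}

    return pathrag_tool


pathrag_tool = make_pathrag_tool(StubRAG())
result = asyncio.run(pathrag_tool("Who founded Apple?"))
print(result["context"])
```

Because the tool returns context rather than an answer, the agent's own model stays responsible for the wording of the final reply.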

How It Works

1. User sends a query to the ADK Agent.
2. Agent calls pathrag_tool with the query.
3. PathRAG processes the query:
   - Extracts keywords (high-level & low-level) using LLM.
   - Searches the BigQuery Property Graph (entities, relationships, paths).
   - Searches the BigQuery Vector Store (semantic similarity).
   - Combines results into structured context.
4. Context is returned to the Agent (no LLM answer generation inside PathRAG).
5. Agent generates the final answer using the retrieved context.
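The retrieval steps above can be illustrated with a toy, in-memory version. Everything in this sketch is an assumption made for illustration: the edge list, the text chunks, the string-matching keyword step, and the word-overlap ranking stand in for the LLM, the BigQuery Property Graph, and BigQuery Vector Search. It is not PathRAG's actual algorithm.

```python
from typing import NamedTuple


class Edge(NamedTuple):
    source: str
    relation: str
    target: str


# Toy graph and chunks; placeholders, not the project's sample dataset.
EDGES = [
    Edge("Steve Jobs", "co-founded", "Apple"),
    Edge("Apple", "develops", "iPhone"),
]
CHUNKS = [
    "Steve Jobs co-founded Apple in 1976.",
    "Google was founded by Larry Page and Sergey Brin.",
]


def extract_keywords(query: str) -> list[str]:
    """Stand-in for the LLM keyword step: keep entity names found in the query."""
    entities = {e.source for e in EDGES} | {e.target for e in EDGES}
    return sorted(name for name in entities if name.lower() in query.lower())


def graph_search(keywords: list[str]) -> list[Edge]:
    """Return every edge that touches one of the keywords."""
    return [e for e in EDGES if e.source in keywords or e.target in keywords]


def vector_search(query: str, top_k: int = 1) -> list[str]:
    """Stand-in for vector search: rank chunks by shared lowercase words."""
    words = set(query.lower().split())
    ranked = sorted(
        CHUNKS,
        key=lambda chunk: len(words & set(chunk.lower().split())),
        reverse=True,
    )
    return ranked[:top_k]


def build_context(query: str) -> str:
    """Combine graph and vector results into one structured text block."""
    edges = graph_search(extract_keywords(query))
    lines = ["-- Relationships --"]
    lines += [f"{e.source} -[{e.relation}]-> {e.target}" for e in edges]
    lines += ["-- Sources --"] + vector_search(query)
    return "\n".join(lines)


context = build_context("What did Apple develop?")
print(context)
```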

Project Structure

pathrag-with-bigquery/
├── pathrag_with_bigquery/            # ADK Agent directory
│   ├── __init__.py
│   ├── agent.py                     # ADK Agent definition (root_agent)
│   ├── prompt.py                    # Agent system instructions
│   ├── tools.py                     # pathrag_tool - context retrieval via PathRAG
│   └── .env.example                 # Environment variables template
├── data_ingestion/                  # Data ingestion directory
│   └── insert.py                    # Script to ingest documents
├── requirements.txt                 # Project dependencies
└── README.md

Key Files

File	Description
`pathrag_with_bigquery/agent.py`	`root_agent` definition using Gemini 2.5 Flash and `pathrag_tool`
`pathrag_with_bigquery/tools.py`	`pathrag_tool` function, extracts context from PathRAG
`pathrag_with_bigquery/prompt.py`	System instruction guiding the Agent to answer based on tool-retrieved context
`data_ingestion/insert.py`	Script to ingest documents into the PathRAG Knowledge Graph

Storage Backend

This project uses Google Cloud BigQuery for scalable, serverless storage. The tables and the Property Graph are created automatically by pathrag-bigquery on first use, through lazy initialization (_ensure_schema()).

Component	Backend
KV Storage	`BigQueryKVStorage`
Vector Storage	`BigQueryVectorDBStorage`
Graph Storage	`BigQueryGraphStorage`
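Lazy initialization means the schema is created the first time a storage object is actually used, not when it is constructed. The sketch below shows the general pattern only. `LazySchemaStorage` and its `create_schema` callback are hypothetical and keep data in a dictionary; the method name `_ensure_schema` is borrowed from the text above, but this is not the plugin's code.

```python
class LazySchemaStorage:
    """Illustrative storage that defers schema creation until first use."""

    def __init__(self, create_schema):
        # create_schema is a callable that would issue the DDL statements.
        self._create_schema = create_schema
        self._schema_ready = False
        self._rows = {}

    def _ensure_schema(self):
        # Run the (possibly expensive) schema setup at most once.
        if not self._schema_ready:
            self._create_schema()
            self._schema_ready = True

    def upsert(self, key, value):
        self._ensure_schema()
        self._rows[key] = value

    def get(self, key):
        self._ensure_schema()
        return self._rows.get(key)


calls = []
storage = LazySchemaStorage(create_schema=lambda: calls.append("create"))
calls_before_use = len(calls)  # nothing has been created yet

storage.upsert("doc-1", "text")
value = storage.get("doc-1")
print(calls_before_use, calls, value)
```

Constructing the object costs nothing, and repeated reads and writes trigger the setup only once.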

Prerequisites

Before you begin, ensure you have the following tools installed:

- uv (for Python package management)
- Google Cloud SDK (gcloud)

1. Configure your Google Cloud project

First, authenticate with Google Cloud:

gcloud auth application-default login

Next, set up your project and enable the necessary APIs:

export PROJECT_ID=$(gcloud config get-value project)

gcloud services enable \
  bigquery.googleapis.com \
  aiplatform.googleapis.com

2. Create a BigQuery Dataset

Create a BigQuery dataset using the bq command-line tool, which is installed with the Google Cloud SDK.

# Set environment variables
export BIGQUERY_PROJECT=$PROJECT_ID
export BIGQUERY_DATASET="pathrag"
export BIGQUERY_LOCATION="us-central1"

# Create the BigQuery dataset
bq --location=$BIGQUERY_LOCATION mk \
  --dataset \
  --description="PathRAG Dataset" \
  ${BIGQUERY_PROJECT}:${BIGQUERY_DATASET}
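If you prefer to create the dataset from Python, the following is a minimal sketch using the google-cloud-bigquery client library. It assumes that library is installed, which this README does not state, and the helper names `dataset_id` and `create_dataset` are introduced here for illustration. Note that the Python client identifies a dataset as `project.dataset`, while bq uses `project:dataset`.

```python
def dataset_id(project: str, dataset: str) -> str:
    """Build the 'project.dataset' identifier expected by the Python client."""
    return f"{project}.{dataset}"


def create_dataset(project: str, dataset: str, location: str):
    """Create the dataset if it does not exist yet and return it."""
    # Imported lazily so dataset_id() works without the client library.
    from google.cloud import bigquery

    client = bigquery.Client(project=project)
    ds = bigquery.Dataset(dataset_id(project, dataset))
    ds.location = location
    ds.description = "PathRAG Dataset"
    # exists_ok=True makes the call safe to repeat.
    return client.create_dataset(ds, exists_ok=True)


print(dataset_id("your-project-id", "pathrag"))
```

Calling `create_dataset("your-project-id", "pathrag", "us-central1")` with application-default credentials in place should have the same effect as the bq command above.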

3. Set Environment Variables

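Copy the example file:

cp pathrag_with_bigquery/.env.example pathrag_with_bigquery/.env

Then edit pathrag_with_bigquery/.env so that it sets the following variables, replacing your-project-id with your own project ID:

export GOOGLE_CLOUD_PROJECT="your-project-id"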
export GOOGLE_CLOUD_LOCATION="us-central1"
export GOOGLE_GENAI_USE_VERTEXAI="1"
export BIGQUERY_PROJECT="your-project-id"
export BIGQUERY_DATASET="pathrag"
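A small check can catch a missing variable before the agent or the ingestion script starts. This is an optional sketch, not part of the project; `REQUIRED_VARS` and `missing_vars` are names introduced here, and the list simply mirrors the variables shown above.

```python
import os

# Mirrors the variables listed above; adjust if your .env differs.
REQUIRED_VARS = (
    "GOOGLE_CLOUD_PROJECT",
    "GOOGLE_CLOUD_LOCATION",
    "GOOGLE_GENAI_USE_VERTEXAI",
    "BIGQUERY_PROJECT",
    "BIGQUERY_DATASET",
)


def missing_vars(env=None) -> list[str]:
    """Return the required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


example_env = {
    "GOOGLE_CLOUD_PROJECT": "your-project-id",
    "GOOGLE_CLOUD_LOCATION": "us-central1",
    "GOOGLE_GENAI_USE_VERTEXAI": "1",
    "BIGQUERY_PROJECT": "your-project-id",
    "BIGQUERY_DATASET": "",  # empty values count as missing
}
print(missing_vars(example_env))
```

Passing a dictionary keeps the function easy to test; calling `missing_vars()` with no argument checks the real process environment.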

Setup

1. Install Dependencies

This project uses uv to manage the Python virtual environment and package dependencies.

Create and activate the virtual environment:

# Create the virtual environment
uv venv

# Activate the virtual environment
source .venv/bin/activate

Install dependencies:

uv pip install -r requirements.txt

2. Data Ingestion

First, load the environment variables from the .env file:

source pathrag_with_bigquery/.env

Ingest documents into the PathRAG Knowledge Graph.

# Ingest sample documents (Apple, Steve Jobs, Google)
python data_ingestion/insert.py --sample

# Or ingest your own document
python data_ingestion/insert.py --file your_document.txt
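For orientation, a command-line interface with these two flags could be wired up roughly as follows. This is a hypothetical reconstruction, not the contents of `insert.py`: the placeholder sample texts, the mutually exclusive flags, and the injected `insert` callable (which would be the PathRAG insert call in the real script) are all assumptions.

```python
import argparse
from pathlib import Path

# Placeholder texts; the real sample documents are not reproduced here.
SAMPLE_DOCUMENTS = [
    "<sample text about Apple>",
    "<sample text about Steve Jobs>",
    "<sample text about Google>",
]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Ingest documents into PathRAG")
    # Assumption: exactly one of --sample or --file must be given.
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--sample", action="store_true", help="ingest the bundled samples")
    group.add_argument("--file", type=Path, help="path to a text file to ingest")
    return parser


def load_documents(args: argparse.Namespace) -> list[str]:
    """Return the texts selected by the command-line flags."""
    if args.sample:
        return list(SAMPLE_DOCUMENTS)
    return [args.file.read_text(encoding="utf-8")]


def main(argv=None, insert=print) -> int:
    """Parse flags and pass each document to `insert`; return the count."""
    args = build_parser().parse_args(argv)
    documents = load_documents(args)
    for document in documents:
        insert(document)
    return len(documents)


collected = []
count = main(["--sample"], insert=collected.append)
print(count)
```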

3. Run the Agent

You can run the agent using either the command-line interface or a web-based interface.

Using the Command-Line Interface (CLI)

adk run pathrag_with_bigquery

Using the Web Interface

adk web

Screenshots:

Figure 1. PathRAG with BigQuery - ADK Web UI
Figure 2. PathRAG with BigQuery - Storages
Figure 3. PathRAG with BigQuery - ADK Log

References

- PathRAG GitHub: Knowledge Graph-based RAG system that uses path-based retrieval through knowledge graphs for more accurate, explainable, and context-aware LLM responses.
- pathrag-bigquery GitHub: Google Cloud BigQuery storage backend for PathRAG.
- Intro to GraphRAG: A dive into GraphRAG pattern details.
- Google ADK Documentation
- BigQuery Graph
- Vertex AI Gemini
