This project demonstrates how to implement a PathRAG (Path-based Retrieval Augmented Generation) agent using the Agent Development Kit (ADK) with Google Cloud BigQuery as the storage backend.
It leverages the PathRAG library with the pathrag-bigquery storage plugin and LiteLLM for Gemini model integration.
Image Source: "PathRAG: Pruning Graph-based Retrieval Augmented Generation with Relational Paths"
User Query
|
v
ADK Agent (Gemini 2.5 Flash)
| tool call
v
pathrag_tool(query)
|
v
PathRAG.aquery(only_need_context=True)
|-- Keyword Extraction (LLM)
|-- Graph Search (BigQuery Property Graph)
|-- Vector Search (BigQuery Vector Search)
+-- Context assembly and return
|
v
ADK Agent generates final answer based on context
- The user sends a query to the ADK Agent.
- The Agent calls pathrag_tool with the query.
- PathRAG processes the query:
  - Extracts keywords (high-level and low-level) using the LLM.
  - Searches the BigQuery Property Graph (entities, relationships, paths).
  - Searches the BigQuery Vector Store (semantic similarity).
  - Combines the results into structured context.
- The context is returned to the Agent (no LLM answer generation happens inside PathRAG).
- The Agent generates the final answer using the retrieved context.
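The retrieval steps above can be illustrated with a minimal, self-contained sketch. It does not call PathRAG, BigQuery, or an LLM: every function name here is hypothetical, keyword extraction is a toy length-based rule, the graph is a plain dictionary, and "vector search" is simple word overlap. It only shows the shape of the flow, where the tool returns assembled context and leaves answer generation to the agent.

```python
from dataclasses import dataclass


@dataclass
class Keywords:
    high_level: list
    low_level: list


def _tokens(text):
    """Lowercase words with surrounding punctuation stripped."""
    return [w.strip("?.,!").lower() for w in text.split() if w.strip("?.,!")]


def extract_keywords(query):
    """Stand-in for LLM keyword extraction (toy rule, an assumption)."""
    words = _tokens(query)
    return Keywords(
        high_level=[w for w in words if len(w) > 6],
        low_level=[w for w in words if 3 < len(w) <= 6],
    )


def search_graph(keywords, graph):
    """Stand-in for the property graph search: follow edges from matched entities."""
    paths = []
    for entity in keywords.low_level + keywords.high_level:
        for neighbour in graph.get(entity, []):
            paths.append(f"{entity} -> {neighbour}")
    return paths


def search_vectors(query, chunks, top_k=2):
    """Stand-in for vector search: rank chunks by word overlap, not embeddings."""
    query_words = set(_tokens(query))
    ranked = sorted(
        chunks,
        key=lambda chunk: len(query_words & set(_tokens(chunk))),
        reverse=True,
    )
    return ranked[:top_k]


def retrieve_context(query, graph, chunks):
    """Combine graph paths and passages into one context string (no answer generation)."""
    keywords = extract_keywords(query)
    paths = search_graph(keywords, graph)
    passages = search_vectors(query, chunks)
    return "\n".join(["[paths]", *paths, "[passages]", *passages])
```

In the real project, the equivalent of retrieve_context is what pathrag_tool would hand back to the ADK Agent, which then writes the final answer from that context.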
pathrag-with-bigquery/
├── pathrag_with_bigquery/ # ADK Agent directory
│ ├── __init__.py
│ ├── agent.py # ADK Agent definition (root_agent)
│ ├── prompt.py # Agent system instructions
│ ├── tools.py # pathrag_tool - context retrieval via PathRAG
│ └── .env.example # Environment variables template
├── data_ingestion/ # Data ingestion directory
│ └── insert.py # Script to ingest documents
├── requirements.txt # Project dependencies
└── README.md
| File | Description |
|---|---|
| pathrag_with_bigquery/agent.py | root_agent definition using Gemini 2.5 Flash and pathrag_tool |
| pathrag_with_bigquery/tools.py | pathrag_tool function, extracts context from PathRAG |
| pathrag_with_bigquery/prompt.py | System instruction guiding the Agent to answer based on tool-retrieved context |
| data_ingestion/insert.py | Script to ingest documents into the PathRAG Knowledge Graph |
This project uses Google Cloud BigQuery for scalable, serverless storage. The tables and the Property Graph are created automatically by pathrag-bigquery on first use through lazy initialization (_ensure_schema()).
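As a rough illustration of the lazy initialization pattern, the sketch below defers schema creation until the first read or write. It is not the pathrag-bigquery implementation: the class name, the create_schema callback, and the in-memory dictionary are assumptions made for the example, and only the method name _ensure_schema() comes from the text above.

```python
class LazySchemaStorage:
    """Hypothetical storage wrapper that creates its schema on first use."""

    def __init__(self, create_schema):
        # create_schema stands in for the DDL that would create tables or a graph.
        self._create_schema = create_schema
        self._schema_ready = False
        self._rows = {}  # in-memory stand-in for a BigQuery table

    def _ensure_schema(self):
        # Run the schema creation once, then skip it on later calls.
        if not self._schema_ready:
            self._create_schema()
            self._schema_ready = True

    def upsert(self, key, value):
        self._ensure_schema()
        self._rows[key] = value

    def get(self, key):
        self._ensure_schema()
        return self._rows.get(key)
```

Constructing the object is cheap and touches nothing remote; the cost of creating the schema is paid once, by whichever operation runs first.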
| Component | Backend |
|---|---|
| KV Storage | BigQueryKVStorage |
| Vector Storage | BigQueryVectorDBStorage |
| Graph Storage | BigQueryGraphStorage |
Before you begin, ensure you have the following tools installed:
- uv (for Python package management)
- Google Cloud SDK (gcloud)
First, authenticate with Google Cloud:
gcloud auth application-default login
Next, set up your project and enable the necessary APIs:
export PROJECT_ID=$(gcloud config get-value project)
gcloud services enable \
bigquery.googleapis.com \
aiplatform.googleapis.com
Create a BigQuery dataset using the bq command-line tool.
# Set environment variables
export BIGQUERY_PROJECT=$PROJECT_ID
export BIGQUERY_DATASET="pathrag"
export BIGQUERY_LOCATION="us-central1"
# Create the BigQuery dataset
bq --location=$BIGQUERY_LOCATION mk \
--dataset \
--description="PathRAG Dataset" \
${BIGQUERY_PROJECT}:${BIGQUERY_DATASET}
Copy the example file and edit it:
cp pathrag_with_bigquery/.env.example pathrag_with_bigquery/.env
Set the following values in pathrag_with_bigquery/.env:
export GOOGLE_CLOUD_PROJECT="your-project-id"
export GOOGLE_CLOUD_LOCATION="us-central1"
export GOOGLE_GENAI_USE_VERTEXAI="1"
export BIGQUERY_PROJECT="your-project-id"
export BIGQUERY_DATASET="pathrag"
This project uses uv to manage the Python virtual environment and package dependencies.
Create and activate the virtual environment:
# Create the virtual environment
uv venv
# Activate the virtual environment
source .venv/bin/activate
Install dependencies:
uv pip install -r requirements.txt
First, load the environment variables from the .env file:
source pathrag_with_bigquery/.env
Ingest documents into the PathRAG Knowledge Graph.
# Ingest sample documents (Apple, Steve Jobs, Google)
python data_ingestion/insert.py --sample
# Or ingest your own document
python data_ingestion/insert.py --file your_document.txt
You can run the agent using either the command-line interface or a web-based interface.
adk run pathrag_with_bigquery
adk web
Screenshot:
Figure 1. PathRAG with BigQuery - ADK Web UI
Figure 2. PathRAG with BigQuery - Storages
Figure 3. PathRAG with BigQuery - ADK Log
- PathRAG GitHub: Knowledge Graph-based RAG system that uses path-based retrieval through knowledge graphs for more accurate, explainable, and context-aware LLM responses.
- pathrag-bigquery GitHub: Google Cloud BigQuery storage backend for PathRAG.
- Intro to GraphRAG - A dive into GraphRAG pattern details
- Google ADK Documentation
- BigQuery Graph
- Vertex AI Gemini



