Agentic Graph RAG Project with BigQuery Graph

This project is a sample implementation of agentic Graph RAG, built with the Agent Development Kit (ADK) and the graph feature of Google Cloud BigQuery through the langchain-bigquery-graph library.

Project Structure

/graph-rag-with-bigquery
├── assets/                  # Images for README
├── data_ingestion/          # Data ingestion directory
│   ├── ingest.py            # Data ingestion script
│   └── requirements.txt     # Data ingestion script dependencies
├── graph_rag_with_bigquery/  # ADK Agent directory
│   ├── agent.py
│   ├── prompt.py
│   ├── requirements.txt     # Agent dependencies
│   └── tools.py
├── notebooks/               # Jupyter notebooks for exploration
│   ├── graph_rag_with_bigquery.ipynb
│   └── requirements.txt
└── README.md

Prerequisites

Before you begin, you need an active Google Cloud project with BigQuery enabled. The commands below use the gcloud and bq command-line tools.

1. Configure your Google Cloud project

First, authenticate with Google Cloud. Run the following command and follow the prompts to log in.

gcloud auth application-default login

Next, set up your project and enable the necessary APIs.

# Use the project currently configured in gcloud as PROJECT_ID
export PROJECT_ID=$(gcloud config get-value project)

# Enable the required APIs
gcloud services enable \
  bigquery.googleapis.com \
  aiplatform.googleapis.com \
  cloudresourcemanager.googleapis.com

2. Create a BigQuery Dataset

Create a BigQuery dataset. The property graph and tables will be created automatically during data ingestion.

# Set environment variables
export BIGQUERY_DATASET="graph_rag_demo"
export BIGQUERY_GRAPH_NAME="retail_graph"
export BIGQUERY_LOCATION="us-central1"

# Create the dataset using the bq CLI
bq --location=$BIGQUERY_LOCATION mk --dataset $PROJECT_ID:$BIGQUERY_DATASET
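
For readers who prefer Python to the bq CLI, the same step could be sketched with the google-cloud-bigquery client library. This is a minimal sketch and not part of the project: it assumes that library is installed and that application-default credentials are configured, and the helper names `dataset_reference` and `create_dataset` are hypothetical.

```python
def dataset_reference(project_id: str, dataset_id: str) -> str:
    """Build the fully qualified dataset ID in BigQuery's project.dataset form."""
    return f"{project_id}.{dataset_id}"


def create_dataset(project_id: str, dataset_id: str, location: str):
    """Create the dataset if it does not exist yet and return it.

    Assumes google-cloud-bigquery is installed. The import is kept inside the
    function so that importing this module does not require the library.
    """
    from google.cloud import bigquery

    client = bigquery.Client(project=project_id)
    dataset = bigquery.Dataset(dataset_reference(project_id, dataset_id))
    dataset.location = location
    # exists_ok=True makes the call safe to repeat
    return client.create_dataset(dataset, exists_ok=True)
```

Calling `create_dataset(os.environ["PROJECT_ID"], os.environ["BIGQUERY_DATASET"], os.environ["BIGQUERY_LOCATION"])` after `import os` would mirror the bq command above, using the same environment variables.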

3. Grant Agent Engine permissions to BigQuery

To allow the deployed Agent Engine to connect to your BigQuery dataset, you must grant the necessary IAM roles to the Agent Engine's service account.

export PROJECT_NUMBER=$(gcloud projects describe $PROJECT_ID --format="value(projectNumber)")

# Grant BigQuery Data Viewer role
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:service-${PROJECT_NUMBER}@gcp-sa-aiplatform-re.iam.gserviceaccount.com" \
    --role="roles/bigquery.dataViewer"

# Grant BigQuery Job User role
gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member="serviceAccount:service-${PROJECT_NUMBER}@gcp-sa-aiplatform-re.iam.gserviceaccount.com" \
    --role="roles/bigquery.jobUser"
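
Both commands use the same member string, which is built from the numeric project number rather than the project ID. The sketch below shows how that string and the two bindings fit together. The function names are hypothetical and the code only builds argument lists; it does not call gcloud.

```python
# Roles granted in the commands above
ROLES = ("roles/bigquery.dataViewer", "roles/bigquery.jobUser")


def agent_engine_service_agent(project_number: str) -> str:
    """Return the IAM member string for the Agent Engine service account."""
    number = str(project_number).strip()
    if not number.isdigit():
        # A common slip is passing the project ID instead of the project number
        raise ValueError(f"expected a numeric project number, got {number!r}")
    return (
        f"serviceAccount:service-{number}"
        "@gcp-sa-aiplatform-re.iam.gserviceaccount.com"
    )


def binding_commands(project_id: str, project_number: str) -> list[list[str]]:
    """Build one gcloud argument list per role, matching the commands above."""
    member = agent_engine_service_agent(project_number)
    return [
        [
            "gcloud", "projects", "add-iam-policy-binding", project_id,
            f"--member={member}",
            f"--role={role}",
        ]
        for role in ROLES
    ]
```

Each returned list could be passed to `subprocess.run(..., check=True)` if you want to script the grants.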

Setup

1. Install Dependencies

This project uses uv to manage the Python virtual environment and package dependencies.

Create and activate the virtual environment:

# Create the virtual environment
uv venv

# Activate the virtual environment (macOS/Linux)
source .venv/bin/activate
# Activate the virtual environment (Windows)
.venv\Scripts\activate

Install dependencies:

# Install agent dependencies
uv pip install -r graph_rag_with_bigquery/requirements.txt

# Install data ingestion script dependencies
uv pip install -r data_ingestion/requirements.txt

2. Data Ingestion

Run the data_ingestion/ingest.py script to load the documents into BigQuery Graph.

First, you need to create a .env file for the data ingestion script by copying the example file and filling in the required values.

cp .env.example .env
# Now, open .env in an editor and modify the values.
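
Before running the script, it can help to check that no value in .env was left empty. The sketch below is one way to do that with the standard library only. It assumes a plain KEY=VALUE file, and it treats values starting with "your-" as placeholders, following the placeholder style used elsewhere in this README; the actual keys in .env.example may differ.

```python
def parse_env(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def unfilled_keys(values: dict[str, str]) -> list[str]:
    """Return keys whose value is empty or still looks like a placeholder."""
    return sorted(
        key for key, value in values.items()
        if not value or value.startswith("your-")
    )
```

For example, `unfilled_keys(parse_env(open(".env").read()))` returns the keys that still need attention, or an empty list when every value has been filled in.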

Basic Usage:

python data_ingestion/ingest.py

Custom Configuration:

python data_ingestion/ingest.py \
  --project_id="your-gcp-project-id" \
  --dataset_id="graph_rag_demo" \
  --graph_name="retail_graph" \
  --location="us-central1"

Additional Options:

--cleanup: Delete existing graph data before ingestion.
--print-graph: Print the transformed graph documents before ingestion (useful for debugging).
--llm_model: Specify the LLM model for graph transformation (default: gemini-2.5-flash).
--embedding_model: Specify the embedding model for node properties (default: gemini-embedding-001).

Example with the additional options:

python data_ingestion/ingest.py \
  --cleanup \
  --print-graph \
  --llm_model="gemini-2.5-pro" \
  --embedding_model="gemini-embedding-001"
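
To make the option handling concrete, here is a sketch of how these flags could be declared with argparse. It is an illustration only, and the real ingest.py may be structured differently. The two model defaults come from the list above; leaving the other four options as None (presumably filled from .env) is an assumption.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the command-line options described in this section."""
    parser = argparse.ArgumentParser(description="Sketch of the ingest.py options")
    # Assumption: when omitted, these are read from .env by the real script
    parser.add_argument("--project_id", default=None)
    parser.add_argument("--dataset_id", default=None)
    parser.add_argument("--graph_name", default=None)
    parser.add_argument("--location", default=None)
    # Boolean switches are off unless the flag is present
    parser.add_argument("--cleanup", action="store_true")
    parser.add_argument("--print-graph", action="store_true")
    # Defaults as documented above
    parser.add_argument("--llm_model", default="gemini-2.5-flash")
    parser.add_argument("--embedding_model", default="gemini-embedding-001")
    return parser


args = build_parser().parse_args(
    ["--cleanup", "--print-graph", "--llm_model=gemini-2.5-pro"]
)
# argparse exposes --print-graph as the attribute print_graph
print(args.cleanup, args.print_graph, args.llm_model, args.embedding_model)
```

Note the mixed naming: --print-graph uses a hyphen while the other options use underscores, so the flags must be typed exactly as listed.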

3. Run the Agent Locally

Before running the agent, you need to create a .env file in the graph_rag_with_bigquery directory.

You can run the agent using either the command-line interface or a web-based interface.

Using the Command-Line Interface (CLI)

Run the agent in your terminal using the adk run command.

adk run graph_rag_with_bigquery

Using the Web Interface

You can also interact with the agent through a web interface using the adk web command.

adk web

Screenshots:

Figure 1: ADK Web Interface for Graph RAG with BigQuery

Figure 2: Retail Graph in BigQuery

Figure 3: Retail Tables in BigQuery

Deployment

The Graph RAG with BigQuery agent can be deployed to Vertex AI Agent Engine using the following commands.

1. Set Environment Variables

Before running the deployment script, you need to set the following environment variables.

export GOOGLE_CLOUD_PROJECT=$(gcloud config get-value project)
export GOOGLE_CLOUD_LOCATION="us-central1"
export GOOGLE_CLOUD_STORAGE_BUCKET="your-gcs-bucket-for-staging"

2. Run the Deployment Command

Deploy the agent using the ADK CLI. You will need to provide a GCS bucket for staging the deployment artifacts.

adk deploy agent_engine \
  --staging_bucket gs://$GOOGLE_CLOUD_STORAGE_BUCKET \
  --display_name "Graph RAG Agent with BigQuery" \
  graph_rag_with_bigquery

This command packages the agent located in the graph_rag_with_bigquery directory and deploys it to Vertex AI Agent Engine.

When the deployment finishes, it prints a line like this:

Successfully created remote agent: projects/<PROJECT_NUMBER>/locations/<LOCATION>/agentEngines/<AGENT_ENGINE_ID>

Make a note of the AGENT_ENGINE_ID, which is the final segment of that resource name.
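
If you capture the deployment output in a script, the ID can be pulled out of the printed line programmatically. The helper below is a hypothetical sketch: it takes either the whole output line or just the resource name and returns the last path segment, without depending on the name of the collection segment.

```python
def agent_engine_id(text: str) -> str:
    """Extract the trailing ID from an Agent Engine resource name.

    Accepts the bare resource name or a full output line ending with it.
    """
    tokens = text.split()
    if not tokens:
        raise ValueError("empty input")
    # The resource name is the last whitespace-separated token on the line
    parts = tokens[-1].strip("/").split("/")
    if len(parts) != 6 or parts[0] != "projects" or parts[2] != "locations":
        raise ValueError(f"not an Agent Engine resource name: {tokens[-1]!r}")
    return parts[5]


# Illustrative values, not real output
line = (
    "Successfully created remote agent: "
    "projects/123456/locations/us-central1/agentEngines/987654"
)
print(agent_engine_id(line))
```

The returned value is what goes into the AGENT_ENGINE_ID environment variable in the next step.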

3. Interact with the Deployed Agent

You can interact with your deployed agent using a simple Python script.

a. Set Environment Variables: Ensure the following environment variables are set in your terminal. You will need the AGENT_ENGINE_ID from the deployment step.

export GOOGLE_CLOUD_PROJECT="your-gcp-project-id"
export GOOGLE_CLOUD_LOCATION="us-central1"
export AGENT_ENGINE_ID="your-agent-engine-id"

b. Create the Python Script: Create a file named query_agent.py and add the following code.

import asyncio
import os
import vertexai

async def query_remote_agent(project_id, location, agent_id, user_query):
    """Initializes Vertex AI and sends a query to the deployed agent."""
    vertexai.init(project=project_id, location=location)

    # Initialize the client
    client = vertexai.Client(project=project_id, location=location)

    # Construct the full resource name
    agent_name = f"projects/{project_id}/locations/{location}/reasoningEngines/{agent_id}"

    # Get the deployed agent
    remote_agent = client.agent_engines.get(name=agent_name)

    # Create a session for this user
    remote_session = await remote_agent.async_create_session(user_id="u_123")

    print(f"Querying agent: '{user_query}'...")

    # Stream the query and print the response
    try:
        async for event in remote_agent.async_stream_query(
            user_id="u_123",
            session_id=remote_session["id"],
            message=user_query
        ):
            if "content" in event and event["content"] and "parts" in event["content"]:
                for part in event["content"]["parts"]:
                    if "text" in part:
                        print(part["text"], end="", flush=True)
        print("\n")
    except Exception as e:
        print(f"Error querying agent: {e}")

if __name__ == "__main__":
    project = os.getenv("GOOGLE_CLOUD_PROJECT")
    loc = os.getenv("GOOGLE_CLOUD_LOCATION")
    agent = os.getenv("AGENT_ENGINE_ID")

    if not all([project, loc, agent]):
        # Exit with a non-zero status so shells and CI jobs notice the failure
        raise SystemExit("Error: GOOGLE_CLOUD_PROJECT, GOOGLE_CLOUD_LOCATION, and AGENT_ENGINE_ID environment variables must be set.")

    query = "Give me recommendations for a beginner drone"
    asyncio.run(query_remote_agent(project, loc, agent, query))

c. Run the script:

python query_agent.py
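
The event filtering inside the streaming loop can be factored into a small function, which makes it testable without a deployed agent. The sketch below assumes events are dictionaries shaped the way the script above reads them (a "content" entry holding a list of "parts"); the sample events are hand-written for illustration and are not real agent output.

```python
def event_text(event: dict) -> str:
    """Concatenate the text parts of one streamed event, ignoring the rest."""
    content = event.get("content") or {}
    parts = content.get("parts") or []
    return "".join(
        part["text"]
        for part in parts
        if isinstance(part, dict) and part.get("text")
    )


# Hand-written sample events: text parts, a non-text part, and empty events
sample_events = [
    {"content": {"parts": [{"text": "Hello"}, {"function_call": {"name": "lookup"}}]}},
    {"content": None},
    {"content": {"parts": [{"text": " world"}]}},
    {},
]
print("".join(event_text(event) for event in sample_events))
```

Inside query_agent.py, the body of the `async for` loop could then be reduced to `print(event_text(event), end="", flush=True)`.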

References

Google Cloud & BigQuery Graph

⭐ langchain-bigquery-graph (GitHub) - LangChain integration for BigQuery Property Graph — GraphStore, vector context retriever, and text-to-GQL retriever for building Graph RAG applications
BigQuery Property Graph Overview
BigQuery Graph RAG Example (Jupyter Notebook)
Build GraphRAG applications using Spanner Graph and LangChain (2025-03-22)
- LangChain LLMGraphTransformer - System Prompt to extract nodes and edges from text
Gemini Enterprise Agent Platform - A fully managed environment for scaling AI agents in production, handling testing, release management, and reliability

GraphRAG Frameworks & Implementations

Intro to GraphRAG - A dive into GraphRAG pattern details
GraphRAG (Microsoft) - A structured RAG approach by Microsoft that builds knowledge graphs from private datasets to enhance LLM reasoning and holistic understanding of complex data collections
GraphRAG (Microsoft) GitHub - A modular graph-based Retrieval-Augmented Generation (RAG) system
LightRAG - Simple and Fast Retrieval-Augmented Generation that incorporates graph structures into text indexing and retrieval processes.
- LightRAG GitHub
PathRAG - PathRAG (Path-based Retrieval Augmented Generation) is an advanced approach to knowledge retrieval and generation that combines the power of knowledge graphs with large language models (LLMs)

Practical Guides & Case Studies

Building GraphRAG System Step by Step Approach (2025-12-09) - Step-by-Step Implementation of GraphRAG with LlamaIndex
- Hands-on Tutorial for Building a GraphRAG System (GitHub)
Enhancing RAG-based applications accuracy by constructing and leveraging knowledge graphs (2025-03-15) - A practical guide to constructing and retrieving information from knowledge graphs in RAG applications with Neo4j and LangChain
Building knowledge graphs with LLM Graph Transformer (2024-06-26) - A deep dive into LangChain's implementation of graph construction with LLMs
