diff --git a/README.md b/README.md index fa92132..635df76 100644 --- a/README.md +++ b/README.md @@ -69,6 +69,9 @@ Deploy Memgraph using methods that suit your environment, whether it's container ### NeoDash - [Connect Memgraph with NeoDash](./neodash/) +### Power BI +- [Power BI integration (Python script, REST API, ODBC)](./powerbi_integration/) + ### Python - [GQLAlchemy basic example of creating and reading nodes](./python/querying/creating_and_reading_nodes/) diff --git a/powerbi_integration/DESKTOP.md b/powerbi_integration/DESKTOP.md new file mode 100644 index 0000000..245bf0b --- /dev/null +++ b/powerbi_integration/DESKTOP.md @@ -0,0 +1,79 @@ +# Power BI Desktop (Windows) + +Three ways to connect Power BI Desktop to Memgraph. Make sure you've completed the [Quick Start](README.md#quick-start) first. + +--- + +## Approach 1: Python Script (simplest) + +Power BI Desktop can run Python scripts directly as a data source. No middleware needed. + +### Steps + +1. Configure Python in Power BI: **File > Options > Python scripting** — point it to the venv where `neo4j` and `pandas` are installed +2. Go to **Get Data > Python script** +3. Paste the contents of `direct_query.py` +4. Power BI will detect the DataFrames and offer them as tables: + - `nodes` — all nodes (persons, companies, products) + - `edges` — all relationships + - `person_purchases` — persons with their product purchases (flattened for charts) + - `company_employees` — companies with employee details +5. Select the tables you want and click **Load** +6. Build your visualizations (e.g. bar chart of total spent per person from `person_purchases`) + +**Limitations:** Scheduled refresh requires a Personal Gateway with the same Python environment on a Windows machine. + +--- + +## Approach 2: REST API + +Power BI connects to the FastAPI service via the Web connector. + +### Steps + +1. Start the REST API (see [README.md](README.md#rest-api)) +2. In Power BI Desktop, go to **Get Data > Web** +3. 
Enter the URL: `http://localhost:8000/person-purchases` +4. Power BI will parse the JSON array into a table — click **Load** +5. Repeat for other endpoints as needed (`/nodes`, `/edges`, `/company-employees`) + +**Tip:** For scheduled refresh, deploy the API to a server reachable by Power BI Gateway. + +--- + +## Approach 3: ODBC + +Use a third-party ODBC driver to connect Power BI directly to Memgraph's Bolt endpoint. No Python or middleware needed. + +### Steps + +1. Install one of the ODBC drivers: + + | Driver | Type | SQL-to-Cypher | + |--------|------|---------------| + | Simba Neo4j ODBC | Commercial | Yes | + | CData Neo4j ODBC | Commercial | Yes | + | Devart Neo4j ODBC | Commercial | Yes | + +2. Open **Windows ODBC Data Source Administrator** (64-bit) +3. Add a new System DSN with these settings: + - **Host:** `localhost` + - **Port:** `7687` + - **Auth:** No authentication (or empty username/password) +4. In Power BI Desktop, go to **Get Data > ODBC** +5. Select the DSN you created +6. Write SQL queries — the driver translates them to Cypher automatically + +**Note:** These drivers are built for Neo4j. Since Memgraph is Bolt-compatible, basic queries work, but some advanced SQL-to-Cypher translations or metadata queries may not be fully compatible. Test thoroughly before relying on this in production. 
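Whichever connector you use, the demo visuals rest on the same flattened rows, and the `total_spent` measure suggested for the bar chart is just `unit_price * quantity` summed per person. As a standalone sense-check of the numbers the seed data should produce (values copied from `seed.py`, no Memgraph needed):

```python
# Purchases seeded by seed.py: (person, unit_price, quantity).
purchases = [
    ("Alice", 49.99, 3),   # Widget
    ("Bob", 149.99, 1),    # Gadget
    ("Carol", 49.99, 5),   # Widget
    ("Carol", 29.99, 2),   # Gizmo
    ("Dave", 149.99, 1),   # Gadget
    ("Eve", 29.99, 4),     # Gizmo
]

# The same aggregation Power BI performs for a "total spent per person" chart.
totals: dict[str, float] = {}
for person, price, qty in purchases:
    totals[person] = round(totals.get(person, 0.0) + price * qty, 2)

print(totals)  # Carol tops the chart at 309.93
```

If your chart shows different totals, the connector is most likely mangling the numeric columns during load.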
+ +--- + +## Comparison + +| | Python Script | REST API | ODBC | +|---|---|---|---| +| Setup effort | Low | Medium | Medium | +| Cost | Free | Free | Paid (driver license) | +| Scheduled refresh | Gateway + Python env | Gateway + API server | Gateway + DSN | +| Flexibility | Full Cypher | Pre-built + custom Cypher | SQL (translated to Cypher) | +| Memgraph compatibility | Native | Native | Partial (Bolt-compatible) | diff --git a/powerbi_integration/Dockerfile b/powerbi_integration/Dockerfile new file mode 100644 index 0000000..70bf615 --- /dev/null +++ b/powerbi_integration/Dockerfile @@ -0,0 +1,11 @@ +FROM python:3.12-slim + +WORKDIR /app + +COPY requirements.txt . +RUN pip install --no-cache-dir -r requirements.txt + +COPY rest_api.py . +COPY seed.py . + +CMD ["uvicorn", "rest_api:app", "--host", "0.0.0.0", "--port", "8000"] diff --git a/powerbi_integration/README.md b/powerbi_integration/README.md new file mode 100644 index 0000000..dbae629 --- /dev/null +++ b/powerbi_integration/README.md @@ -0,0 +1,74 @@ +# Power BI <-> Memgraph Integration + +Three approaches to visualize Memgraph graph data in Microsoft Power BI. + +- **Power BI Desktop** (Windows) — see [DESKTOP.md](DESKTOP.md) +- **Power BI Service** (web, any OS) — see [SERVICE.md](SERVICE.md) + +## Prerequisites + +- Docker (for Memgraph) +- Python 3.10+ + +## Quick Start + +```bash +# 1. Start Memgraph +docker compose up -d memgraph + +# 2. Install Python dependencies +python -m venv .venv +source .venv/bin/activate # Windows: .venv\Scripts\activate +pip install -r requirements.txt + +# 3. 
Seed sample data (5 persons, 3 companies, 3 products, 16 relationships) +python seed.py +``` + +## Approaches + +| Approach | Description | Desktop | Service | Cost | +|----------|-------------|---------|---------|------| +| Python Script | Query Memgraph directly from Power BI's Python data source | Yes | No | Free | +| REST API | FastAPI middleware exposes Memgraph data as JSON endpoints | Yes | Yes (via Dataflow) | Free | +| ODBC | Third-party driver translates SQL to Cypher over Bolt | Yes | Yes (via Gateway) | Paid | + +## REST API + +The REST API (Approach 2) is shared across Desktop and Service. Start it before connecting from either. + +```bash +# Option A: Run locally +uvicorn rest_api:app --host 0.0.0.0 --port 8000 + +# Option B: Run everything with Docker Compose (Memgraph + API) +docker compose up -d + +# Verify +curl http://localhost:8000/person-purchases +``` + +### Available endpoints + +| Endpoint | Method | Description | +|----------|--------|-------------| +| `/nodes` | GET | All nodes as flat rows | +| `/edges` | GET | All edges as flat rows | +| `/person-purchases` | GET | Persons with purchases (good for charts) | +| `/company-employees` | GET | Companies with employee details | +| `/query` | POST | Run arbitrary Cypher, returns list of dicts | + +## Files + +``` +powerbi_integration/ +├── docker-compose.yml # Memgraph + FastAPI service +├── Dockerfile # FastAPI container +├── requirements.txt # Python deps +├── seed.py # Seed sample data into Memgraph +├── rest_api.py # FastAPI service +├── direct_query.py # Power BI Desktop Python script +├── push_to_powerbi.py # Push data to Power BI Service datasets +├── DESKTOP.md # Guide for Power BI Desktop (Windows) +└── SERVICE.md # Guide for Power BI Service (web, any OS) +``` diff --git a/powerbi_integration/SERVICE.md b/powerbi_integration/SERVICE.md new file mode 100644 index 0000000..28b8852 --- /dev/null +++ b/powerbi_integration/SERVICE.md @@ -0,0 +1,103 @@ +# Power BI Service (Web — any OS) + 
+Three ways to get Memgraph data into Power BI Service. The first two work from any OS, with no Windows machine needed. Make sure you've completed the [Quick Start](README.md#quick-start) first.
+
+---
+
+## Approach 1: Push Dataset (recommended for Linux)
+
+Push data directly from a Python script to a Power BI dataset via the Power BI REST API. No Gateway, no Windows machine needed. Works from any OS.
+
+### Azure AD setup (one-time)
+
+1. Go to [Azure Portal](https://portal.azure.com) > **App registrations** > **New registration**
+2. Name it (e.g. `memgraph-powerbi-push`)
+3. Under **API permissions**, add: **Power BI Service > Dataset.ReadWrite.All** (Application)
+4. Grant admin consent
+5. Under **Certificates & secrets**, create a new client secret
+6. Note down:
+   - **Tenant ID** (from Overview)
+   - **Client ID** (from Overview)
+   - **Client secret** (from Certificates & secrets)
+
+**Note:** For client-credential (service principal) access, a Power BI admin must also enable **Service principals can use Power BI APIs** in the Power BI admin portal, and the app (or a security group containing it) must be added as a member of the target workspace.
+
+### Steps
+
+```bash
+# 1. Set Azure credentials
+export AZURE_TENANT_ID="your-tenant-id"
+export AZURE_CLIENT_ID="your-client-id"
+export AZURE_CLIENT_SECRET="your-client-secret"
+
+# 2. (Optional) Target a specific workspace
+export POWERBI_WORKSPACE_ID="your-workspace-id"
+
+# 3. Push Memgraph data to Power BI
+python push_to_powerbi.py
+```
+
+The script will:
+- Authenticate with Azure AD
+- Create a Push Dataset called "Memgraph Data" (or reuse an existing one)
+- Query Memgraph for person purchases and company employees
+- Push the data to Power BI
+
+Once the push completes:
+
+1. Open [Power BI Service](https://app.powerbi.com)
+2. Find the **Memgraph Data** dataset in your workspace
+3. Click **Create report** and build your visualizations
+
+### Scheduled refresh
+
+Since this is a push model, schedule the script with cron:
+
+```bash
+# Push fresh data every hour
+0 * * * * cd /path/to/powerbi_integration && .venv/bin/python push_to_powerbi.py
+```
+
+No Power BI Gateway needed. Keep in mind that pushed rows accumulate: each run appends to the tables rather than replacing them, so clear the tables first (the Power BI REST API's `DeleteRows` operation) or handle duplicates in the report.
+
+---
+
+## Approach 2: REST API + Dataflow
+
+Use Power BI Dataflows to pull data from the FastAPI service. The API must be reachable from the internet. 
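The Web source in a Dataflow expects a flat JSON array of records, which is exactly what the API endpoints return. As a sketch, one `/person-purchases` record for the seed data looks like this (field values taken from `seed.py`; `total_spent` is `unit_price * quantity`, shown rounded):

```python
import json

# One record of the flat JSON array that /person-purchases returns for the
# seed data. Each top-level key becomes a column in the Dataflow table, with
# no extra transformation needed.
row = {
    "person": "Alice",
    "age": 34,
    "product": "Widget",
    "unit_price": 49.99,
    "quantity": 3,
    "total_spent": 149.97,  # 49.99 * 3, rounded
}
payload = json.dumps([row])
print(payload)
```

Nested objects or arrays would require a Transform step in the Dataflow editor; keeping the endpoints flat is what makes step 7 below mostly optional.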
+ +### Steps + +1. Start the REST API and make it publicly accessible (see [README.md](README.md#rest-api)) + - For testing: use a tunnel like ngrok (`ngrok http 8000`) + - For production: deploy behind a reverse proxy with HTTPS +2. In Power BI Service, go to your workspace +3. Click **New > Dataflow** +4. Choose **Define new tables** +5. Select **Web** as the data source +6. Enter the API URL: `https://your-api-host:8000/person-purchases` +7. Power BI will preview the JSON data — click **Transform data** if needed +8. Save the dataflow +9. Repeat for other endpoints (`/nodes`, `/edges`, `/company-employees`) +10. Set a refresh schedule for the dataflow (e.g. daily) +11. Create a report from the dataflow tables + +### Scheduled refresh + +Dataflows refresh on their own schedule (configured in Power BI Service). No Gateway needed as long as the API is publicly accessible. + +--- + +## Approach 3: ODBC via Gateway + +If you have access to a Windows machine, you can set up a Power BI Gateway with an ODBC driver. See [DESKTOP.md](DESKTOP.md#approach-3-odbc) for driver setup, then configure the Gateway to use that DSN. + +This is the least practical option for Linux-only environments. + +--- + +## Comparison + +| | Push Dataset | REST API + Dataflow | ODBC + Gateway | +|---|---|---|---| +| Works on Linux | Yes | Yes | No (Gateway is Windows) | +| Gateway needed | No | No (if API is public) | Yes | +| Cost | Free | Free | Paid (driver + Gateway) | +| Refresh model | Push (cron) | Pull (Dataflow schedule) | Pull (Gateway schedule) | +| Setup effort | Medium (Azure AD app) | Medium (public API) | High | diff --git a/powerbi_integration/direct_query.py b/powerbi_integration/direct_query.py new file mode 100644 index 0000000..a4bf35f --- /dev/null +++ b/powerbi_integration/direct_query.py @@ -0,0 +1,74 @@ +""" +Power BI Python Data Source script. + +Usage: + 1. In Power BI Desktop, go to Get Data > Python script + 2. Paste this script (or reference this file) + 3. 
Power BI will pick up all pandas DataFrames defined in the script + +Prerequisites: + - Python environment with neo4j and pandas installed + - Memgraph running on localhost:7687 +""" + +import pandas as pd +from neo4j import GraphDatabase + +MEMGRAPH_URI = "bolt://localhost:7687" + +driver = GraphDatabase.driver(MEMGRAPH_URI, auth=("", "")) + + +def query_to_df(cypher: str) -> pd.DataFrame: + with driver.session() as session: + result = session.run(cypher) + return pd.DataFrame([record.data() for record in result]) + + +# Power BI will detect each DataFrame variable and offer it as a table. + +# All nodes +nodes = query_to_df(""" + MATCH (n) + RETURN n.id AS id, + labels(n)[0] AS label, + n.name AS name, + n.age AS age, + n.revenue AS revenue, + n.price AS price +""") + +# All edges +edges = query_to_df(""" + MATCH (a)-[r]->(b) + RETURN a.id AS source_id, + a.name AS source_name, + type(r) AS relationship, + b.id AS target_id, + b.name AS target_name, + r.since AS since, + r.quantity AS quantity +""") + +# Person purchases (flattened for charts) +person_purchases = query_to_df(""" + MATCH (p:Person)-[b:BOUGHT]->(prod:Product) + RETURN p.name AS person, + p.age AS age, + prod.name AS product, + prod.price AS unit_price, + b.quantity AS quantity, + prod.price * b.quantity AS total_spent +""") + +# Company employees +company_employees = query_to_df(""" + MATCH (p:Person)-[w:WORKS_AT]->(c:Company) + RETURN c.name AS company, + c.revenue AS company_revenue, + p.name AS employee, + p.age AS employee_age, + w.since AS employed_since +""") + +driver.close() diff --git a/powerbi_integration/docker-compose.yml b/powerbi_integration/docker-compose.yml new file mode 100644 index 0000000..b055814 --- /dev/null +++ b/powerbi_integration/docker-compose.yml @@ -0,0 +1,24 @@ +services: + memgraph: + image: memgraph/memgraph-mage:3.9.0 + container_name: memgraph-powerbi + ports: + - "7687:7687" + - "3000:3000" + command: + [ + "--telemetry-enabled=false", + "--log-level=TRACE", + 
"--also-log-to-stderr=true", + ] + + api: + build: . + container_name: memgraph-powerbi-api + ports: + - "8000:8000" + environment: + - MEMGRAPH_HOST=memgraph + - MEMGRAPH_PORT=7687 + depends_on: + - memgraph diff --git a/powerbi_integration/push_to_powerbi.py b/powerbi_integration/push_to_powerbi.py new file mode 100644 index 0000000..9b1b1a2 --- /dev/null +++ b/powerbi_integration/push_to_powerbi.py @@ -0,0 +1,175 @@ +""" +Push Memgraph data to a Power BI Push Dataset via the Power BI REST API. + +This is the recommended approach for Power BI Service on Linux — no Gateway needed. + +Setup: + 1. Register an app in Azure AD (Microsoft Entra ID): + - Go to https://portal.azure.com > App registrations > New registration + - Add API permission: Power BI Service > Dataset.ReadWrite.All + - Create a client secret + 2. In Power BI Service, create a Push Dataset (or let this script create one) + 3. Set environment variables (see below) + +Usage: + python push_to_powerbi.py + +Environment variables: + MEMGRAPH_HOST (default: localhost) + MEMGRAPH_PORT (default: 7687) + AZURE_TENANT_ID — Azure AD tenant ID + AZURE_CLIENT_ID — App registration client ID + AZURE_CLIENT_SECRET — App registration client secret + POWERBI_WORKSPACE_ID — Power BI workspace (group) ID +""" + +import os +import sys + +import requests +from neo4j import GraphDatabase + +MG_HOST = os.getenv("MEMGRAPH_HOST", "localhost") +MG_PORT = int(os.getenv("MEMGRAPH_PORT", "7687")) +TENANT_ID = os.getenv("AZURE_TENANT_ID", "") +CLIENT_ID = os.getenv("AZURE_CLIENT_ID", "") +CLIENT_SECRET = os.getenv("AZURE_CLIENT_SECRET", "") +WORKSPACE_ID = os.getenv("POWERBI_WORKSPACE_ID", "") + +DATASET_NAME = "Memgraph Data" +POWERBI_API = "https://api.powerbi.com/v1.0/myorg" + + +def get_access_token() -> str: + """Get an OAuth2 token from Azure AD.""" + url = f"https://login.microsoftonline.com/{TENANT_ID}/oauth2/v2.0/token" + resp = requests.post(url, data={ + "grant_type": "client_credentials", + "client_id": CLIENT_ID, + 
"client_secret": CLIENT_SECRET, + "scope": "https://analysis.windows.net/powerbi/api/.default", + }) + resp.raise_for_status() + return resp.json()["access_token"] + + +def get_headers(token: str) -> dict: + return {"Authorization": f"Bearer {token}", "Content-Type": "application/json"} + + +def query_memgraph(cypher: str, params: dict | None = None) -> list[dict]: + driver = GraphDatabase.driver(f"bolt://{MG_HOST}:{MG_PORT}", auth=("", "")) + with driver.session() as session: + result = session.run(cypher, params or {}) + rows = [record.data() for record in result] + driver.close() + return rows + + +def find_or_create_dataset(token: str) -> tuple[str, dict[str, str]]: + """Find existing dataset or create a new Push Dataset. Returns (dataset_id, table_name_map).""" + headers = get_headers(token) + base = f"{POWERBI_API}/groups/{WORKSPACE_ID}" if WORKSPACE_ID else f"{POWERBI_API}" + + # Check if dataset already exists + resp = requests.get(f"{base}/datasets", headers=headers) + resp.raise_for_status() + for ds in resp.json().get("value", []): + if ds["name"] == DATASET_NAME: + print(f"Found existing dataset: {ds['id']}") + return ds["id"], {} + + # Create Push Dataset with two tables + dataset_def = { + "name": DATASET_NAME, + "defaultMode": "Push", + "tables": [ + { + "name": "PersonPurchases", + "columns": [ + {"name": "person", "dataType": "String"}, + {"name": "age", "dataType": "Int64"}, + {"name": "product", "dataType": "String"}, + {"name": "unit_price", "dataType": "Double"}, + {"name": "quantity", "dataType": "Int64"}, + {"name": "total_spent", "dataType": "Double"}, + ], + }, + { + "name": "CompanyEmployees", + "columns": [ + {"name": "company", "dataType": "String"}, + {"name": "company_revenue", "dataType": "Int64"}, + {"name": "employee", "dataType": "String"}, + {"name": "employee_age", "dataType": "Int64"}, + {"name": "employed_since", "dataType": "Int64"}, + ], + }, + ], + } + + resp = requests.post(f"{base}/datasets", headers=headers, 
json=dataset_def) + resp.raise_for_status() + dataset_id = resp.json()["id"] + print(f"Created dataset: {dataset_id}") + return dataset_id, {} + + +def push_rows(token: str, dataset_id: str, table_name: str, rows: list[dict]) -> None: + """Push rows to a Power BI Push Dataset table.""" + headers = get_headers(token) + base = f"{POWERBI_API}/groups/{WORKSPACE_ID}" if WORKSPACE_ID else f"{POWERBI_API}" + url = f"{base}/datasets/{dataset_id}/tables/{table_name}/rows" + + # Power BI Push API accepts max 10,000 rows per request + batch_size = 10_000 + for i in range(0, len(rows), batch_size): + batch = rows[i:i + batch_size] + resp = requests.post(url, headers=headers, json={"rows": batch}) + resp.raise_for_status() + + print(f" Pushed {len(rows)} rows to {table_name}") + + +def main() -> None: + for var in ["AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET"]: + if not os.getenv(var): + print(f"Error: {var} environment variable is required.") + print("See the docstring at the top of this file for setup instructions.") + sys.exit(1) + + print("Authenticating with Azure AD ...") + token = get_access_token() + + print("Setting up Power BI dataset ...") + dataset_id, _ = find_or_create_dataset(token) + + print("Querying Memgraph ...") + purchases = query_memgraph(""" + MATCH (p:Person)-[b:BOUGHT]->(prod:Product) + RETURN p.name AS person, + p.age AS age, + prod.name AS product, + prod.price AS unit_price, + b.quantity AS quantity, + prod.price * b.quantity AS total_spent + """) + + employees = query_memgraph(""" + MATCH (p:Person)-[w:WORKS_AT]->(c:Company) + RETURN c.name AS company, + c.revenue AS company_revenue, + p.name AS employee, + p.age AS employee_age, + w.since AS employed_since + """) + + print("Pushing data to Power BI ...") + push_rows(token, dataset_id, "PersonPurchases", purchases) + push_rows(token, dataset_id, "CompanyEmployees", employees) + + print("\nDone. 
Open Power BI Service to build reports on the dataset.") + + +if __name__ == "__main__": + main() diff --git a/powerbi_integration/requirements.txt b/powerbi_integration/requirements.txt new file mode 100644 index 0000000..79df02b --- /dev/null +++ b/powerbi_integration/requirements.txt @@ -0,0 +1,5 @@ +neo4j>=5.0.0 +pandas>=2.0.0 +fastapi>=0.110.0 +uvicorn>=0.29.0 +requests>=2.31.0 diff --git a/powerbi_integration/rest_api.py b/powerbi_integration/rest_api.py new file mode 100644 index 0000000..bb064b4 --- /dev/null +++ b/powerbi_integration/rest_api.py @@ -0,0 +1,118 @@ +""" +FastAPI service that exposes Memgraph query results as tabular JSON. + +Power BI connects to this via Get Data > Web connector. + +Endpoints: + GET /nodes — all nodes as flat rows + GET /edges — all edges as flat rows + GET /person-purchases — persons with their purchases (flattened) + GET /company-employees — companies with employee count and list + POST /query — run arbitrary Cypher, returns list of dicts +""" + +import os + +from fastapi import FastAPI, HTTPException +from neo4j import GraphDatabase +from pydantic import BaseModel + +MG_HOST = os.getenv("MEMGRAPH_HOST", "localhost") +MG_PORT = int(os.getenv("MEMGRAPH_PORT", "7687")) + +app = FastAPI(title="Memgraph Power BI API") + +_driver = None + + +def get_driver(): + global _driver + if _driver is None: + _driver = GraphDatabase.driver(f"bolt://{MG_HOST}:{MG_PORT}", auth=("", "")) + return _driver + + +def run_query(cypher: str, params: dict | None = None) -> list[dict]: + with get_driver().session() as session: + result = session.run(cypher, params or {}) + return [record.data() for record in result] + + +# --------------------------------------------------------------------------- +# Pre-built endpoints (recommended for Power BI) +# --------------------------------------------------------------------------- + + +@app.get("/nodes") +def get_nodes(): + """Return all nodes as flat rows with id, label, and properties.""" + return 
run_query(""" + MATCH (n) + RETURN n.id AS id, + labels(n)[0] AS label, + n.name AS name, + n.age AS age, + n.revenue AS revenue, + n.price AS price + """) + + +@app.get("/edges") +def get_edges(): + """Return all edges as flat rows.""" + return run_query(""" + MATCH (a)-[r]->(b) + RETURN a.id AS source_id, + a.name AS source_name, + type(r) AS relationship, + b.id AS target_id, + b.name AS target_name, + r.since AS since, + r.quantity AS quantity + """) + + +@app.get("/person-purchases") +def get_person_purchases(): + """Persons with their product purchases — good for Power BI tables/charts.""" + return run_query(""" + MATCH (p:Person)-[b:BOUGHT]->(prod:Product) + RETURN p.name AS person, + p.age AS age, + prod.name AS product, + prod.price AS unit_price, + b.quantity AS quantity, + prod.price * b.quantity AS total_spent + """) + + +@app.get("/company-employees") +def get_company_employees(): + """Companies with employee details.""" + return run_query(""" + MATCH (p:Person)-[w:WORKS_AT]->(c:Company) + RETURN c.name AS company, + c.revenue AS company_revenue, + p.name AS employee, + p.age AS employee_age, + w.since AS employed_since + """) + + +# --------------------------------------------------------------------------- +# Flexible Cypher endpoint +# --------------------------------------------------------------------------- + + +class CypherRequest(BaseModel): + query: str + params: dict | None = None + + +@app.post("/query") +def run_custom_query(req: CypherRequest): + """Run arbitrary Cypher and return results as a list of dicts.""" + try: + return run_query(req.query, req.params) + except Exception as e: + raise HTTPException(status_code=400, detail=str(e)) diff --git a/powerbi_integration/seed.py b/powerbi_integration/seed.py new file mode 100644 index 0000000..90116db --- /dev/null +++ b/powerbi_integration/seed.py @@ -0,0 +1,96 @@ +"""Seed Memgraph with sample data for the Power BI demo.""" + +import os + +from neo4j import GraphDatabase + +MG_HOST = 
os.getenv("MEMGRAPH_HOST", "localhost") +MG_PORT = int(os.getenv("MEMGRAPH_PORT", "7687")) + + +def seed(): + driver = GraphDatabase.driver(f"bolt://{MG_HOST}:{MG_PORT}", auth=("", "")) + + with driver.session() as session: + session.run("MATCH (n) DETACH DELETE n") + + session.run("CREATE INDEX ON :Person(id)") + session.run("CREATE INDEX ON :Company(id)") + session.run("CREATE INDEX ON :Product(id)") + + session.run(""" + UNWIND $rows AS r + CREATE (p:Person {id: r.id, name: r.name, age: r.age}) + """, rows=[ + {"id": 1, "name": "Alice", "age": 34}, + {"id": 2, "name": "Bob", "age": 28}, + {"id": 3, "name": "Carol", "age": 45}, + {"id": 4, "name": "Dave", "age": 31}, + {"id": 5, "name": "Eve", "age": 52}, + ]) + + session.run(""" + UNWIND $rows AS r + CREATE (c:Company {id: r.id, name: r.name, revenue: r.revenue}) + """, rows=[ + {"id": 100, "name": "Acme Corp", "revenue": 500000}, + {"id": 101, "name": "Globex Inc", "revenue": 1200000}, + {"id": 102, "name": "Initech", "revenue": 300000}, + ]) + + session.run(""" + UNWIND $rows AS r + CREATE (p:Product {id: r.id, name: r.name, price: r.price}) + """, rows=[ + {"id": 200, "name": "Widget", "price": 49.99}, + {"id": 201, "name": "Gadget", "price": 149.99}, + {"id": 202, "name": "Gizmo", "price": 29.99}, + ]) + + # Relationships + session.run(""" + UNWIND $rows AS r + MATCH (a:Person {id: r.src}), (b:Company {id: r.dst}) + CREATE (a)-[:WORKS_AT {since: r.since}]->(b) + """, rows=[ + {"src": 1, "dst": 100, "since": 2018}, + {"src": 2, "dst": 100, "since": 2020}, + {"src": 3, "dst": 101, "since": 2015}, + {"src": 4, "dst": 101, "since": 2021}, + {"src": 5, "dst": 102, "since": 2019}, + ]) + + session.run(""" + UNWIND $rows AS r + MATCH (a:Person {id: r.src}), (b:Product {id: r.dst}) + CREATE (a)-[:BOUGHT {quantity: r.qty}]->(b) + """, rows=[ + {"src": 1, "dst": 200, "qty": 3}, + {"src": 2, "dst": 201, "qty": 1}, + {"src": 3, "dst": 200, "qty": 5}, + {"src": 3, "dst": 202, "qty": 2}, + {"src": 4, "dst": 201, "qty": 
1},
+            {"src": 5, "dst": 202, "qty": 4},
+        ])
+
+        session.run("""
+            UNWIND $rows AS r
+            MATCH (a:Person {id: r.src}), (b:Person {id: r.dst})
+            CREATE (a)-[:KNOWS]->(b)
+        """, rows=[
+            {"src": 1, "dst": 2},
+            {"src": 1, "dst": 3},
+            {"src": 2, "dst": 4},
+            {"src": 3, "dst": 5},
+            {"src": 4, "dst": 5},
+        ])
+
+    # Use a managed session for the summary counts so it is closed properly.
+    with driver.session() as session:
+        node_count = session.run("MATCH (n) RETURN count(n) AS cnt").single()["cnt"]
+        edge_count = session.run("MATCH ()-[r]->() RETURN count(r) AS cnt").single()["cnt"]
+    print(f"Seeded {node_count} nodes and {edge_count} edges.")
+
+    driver.close()
+
+
+if __name__ == "__main__":
+    seed()
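The summary line `seed.py` prints is fully determined by the row lists above. A standalone cross-check of those counts, matching the "5 persons, 3 companies, 3 products, 16 relationships" noted in the README (no Memgraph needed):

```python
# Node and relationship counts implied by seed.py's UNWIND payloads.
persons, companies, products = 5, 3, 3
works_at, bought, knows = 5, 6, 5

node_count = persons + companies + products
edge_count = works_at + bought + knows
print(f"Seeded {node_count} nodes and {edge_count} edges.")  # Seeded 11 nodes and 16 edges.
```

If `python seed.py` reports anything else, a previous dataset was not fully cleared before seeding.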