3 changes: 3 additions & 0 deletions README.md
@@ -69,6 +69,9 @@ Deploy Memgraph using methods that suit your environment, whether it's container
### NeoDash
- [Connect Memgraph with NeoDash](./neodash/)

### Power BI
- [Power BI integration (Python script, REST API, ODBC)](./powerbi_integration/)

### Python
- [GQLAlchemy basic example of creating and reading nodes](./python/querying/creating_and_reading_nodes/)

79 changes: 79 additions & 0 deletions powerbi_integration/DESKTOP.md
@@ -0,0 +1,79 @@
# Power BI Desktop (Windows)

Three ways to connect Power BI Desktop to Memgraph. Make sure you've completed the [Quick Start](README.md#quick-start) first.

---

## Approach 1: Python Script (simplest)

Power BI Desktop can run Python scripts directly as a data source. No middleware needed.

### Steps

1. Configure Python in Power BI: **File > Options > Python scripting** — point it to the venv where `neo4j` and `pandas` are installed
2. Go to **Get Data > Python script**
3. Paste the contents of `direct_query.py`
4. Power BI will detect the DataFrames and offer them as tables:
- `nodes` — all nodes (persons, companies, products)
- `edges` — all relationships
- `person_purchases` — persons with their product purchases (flattened for charts)
- `company_employees` — companies with employee details
5. Select the tables you want and click **Load**
6. Build your visualizations (e.g. bar chart of total spent per person from `person_purchases`)

**Limitations:** Scheduled refresh requires an on-premises data gateway (personal mode) with the same Python environment installed on a Windows machine.

---

## Approach 2: REST API

Power BI connects to the FastAPI service via the Web connector.

### Steps

1. Start the REST API (see [README.md](README.md#rest-api))
2. In Power BI Desktop, go to **Get Data > Web**
3. Enter the URL: `http://localhost:8000/person-purchases`
4. Power BI will parse the JSON array into a table — click **Load**
5. Repeat for other endpoints as needed (`/nodes`, `/edges`, `/company-employees`)

**Tip:** For scheduled refresh, deploy the API to a server reachable by Power BI Gateway.
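
Before connecting, you can sanity-check the table Power BI will build from the endpoint's JSON. A minimal sketch with pandas; the column names mirror what `/person-purchases` returns per `direct_query.py`, but the sample values are made up for illustration:

```python
import pandas as pd

def rows_to_table(rows: list[dict]) -> pd.DataFrame:
    """Flatten a JSON array from the API into the table Power BI will load."""
    return pd.DataFrame(rows)

# Sample rows in the shape of /person-purchases (columns taken from
# direct_query.py; the values are illustrative only).
sample = [
    {"person": "Alice", "age": 30, "product": "Laptop",
     "unit_price": 1200.0, "quantity": 2, "total_spent": 2400.0},
    {"person": "Bob", "age": 25, "product": "Phone",
     "unit_price": 800.0, "quantity": 1, "total_spent": 800.0},
]
table = rows_to_table(sample)
print(table.columns.tolist())
```

Power BI's Web connector performs the same JSON-to-table flattening when you click **Load**.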

---

## Approach 3: ODBC

Use a third-party ODBC driver to connect Power BI directly to Memgraph's Bolt endpoint. No Python or middleware needed.

### Steps

1. Install one of the ODBC drivers:

| Driver | Type | SQL-to-Cypher |
|--------|------|---------------|
| Simba Neo4j ODBC | Commercial | Yes |
| CData Neo4j ODBC | Commercial | Yes |
| Devart Neo4j ODBC | Commercial | Yes |

2. Open **Windows ODBC Data Source Administrator** (64-bit)
3. Add a new System DSN with these settings:
- **Host:** `localhost`
- **Port:** `7687`
- **Auth:** No authentication (or empty username/password)
4. In Power BI Desktop, go to **Get Data > ODBC**
5. Select the DSN you created
6. Write SQL queries — the driver translates them to Cypher automatically

**Note:** These drivers are built for Neo4j. Since Memgraph is Bolt-compatible, basic queries work, but some advanced SQL-to-Cypher translations or metadata queries may not be fully compatible. Test thoroughly before relying on this in production.

---

## Comparison

| | Python Script | REST API | ODBC |
|---|---|---|---|
| Setup effort | Low | Medium | Medium |
| Cost | Free | Free | Paid (driver license) |
| Scheduled refresh | Gateway + Python env | Gateway + API server | Gateway + DSN |
| Flexibility | Full Cypher | Pre-built + custom Cypher | SQL (translated to Cypher) |
| Memgraph compatibility | Native | Native | Partial (Bolt-compatible) |
11 changes: 11 additions & 0 deletions powerbi_integration/Dockerfile
@@ -0,0 +1,11 @@
FROM python:3.12-slim

WORKDIR /app

COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY rest_api.py .
COPY seed.py .

CMD ["uvicorn", "rest_api:app", "--host", "0.0.0.0", "--port", "8000"]
74 changes: 74 additions & 0 deletions powerbi_integration/README.md
@@ -0,0 +1,74 @@
# Power BI <-> Memgraph Integration

Three approaches to visualizing Memgraph graph data in Microsoft Power BI.

- **Power BI Desktop** (Windows) — see [DESKTOP.md](DESKTOP.md)
- **Power BI Service** (web, any OS) — see [SERVICE.md](SERVICE.md)

## Prerequisites

- Docker (for Memgraph)
- Python 3.10+

## Quick Start

```bash
# 1. Start Memgraph
docker compose up -d memgraph

# 2. Install Python dependencies
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt

# 3. Seed sample data (5 persons, 3 companies, 3 products, 16 relationships)
python seed.py
```

## Approaches

| Approach | Description | Desktop | Service | Cost |
|----------|-------------|---------|---------|------|
| Python Script | Query Memgraph directly from Power BI's Python data source | Yes | No | Free |
| REST API | FastAPI middleware exposes Memgraph data as JSON endpoints | Yes | Yes (via Dataflow) | Free |
| ODBC | Third-party driver translates SQL to Cypher over Bolt | Yes | Yes (via Gateway) | Paid |

## REST API

The REST API (Approach 2) is shared across Desktop and Service. Start it before connecting from either.

```bash
# Option A: Run locally
uvicorn rest_api:app --host 0.0.0.0 --port 8000

# Option B: Run everything with Docker Compose (Memgraph + API)
docker compose up -d

# Verify
curl http://localhost:8000/person-purchases
```

### Available endpoints

| Endpoint | Method | Description |
|----------|--------|-------------|
| `/nodes` | GET | All nodes as flat rows |
| `/edges` | GET | All edges as flat rows |
| `/person-purchases` | GET | Persons with purchases (good for charts) |
| `/company-employees` | GET | Companies with employee details |
| `/query` | POST | Run arbitrary Cypher, returns list of dicts |
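
The `/query` endpoint can also be exercised from Python. A hedged sketch; the request-body field name `cypher` is an assumption, so check `rest_api.py` for the schema the endpoint actually expects:

```python
import requests

def run_cypher(base_url: str, cypher: str) -> list[dict]:
    """POST a Cypher statement to /query and return the rows as dicts.

    NOTE: the body field name "cypher" is an assumption, not taken
    from rest_api.py -- verify against the endpoint's actual schema.
    """
    resp = requests.post(f"{base_url}/query", json={"cypher": cypher}, timeout=10)
    resp.raise_for_status()
    return resp.json()

# With the API running (see Quick Start):
# rows = run_cypher("http://localhost:8000",
#                   "MATCH (p:Person) RETURN p.name AS name LIMIT 3")
```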

## Files

```
powerbi_integration/
├── docker-compose.yml # Memgraph + FastAPI service
├── Dockerfile # FastAPI container
├── requirements.txt # Python deps
├── seed.py # Seed sample data into Memgraph
├── rest_api.py # FastAPI service
├── direct_query.py # Power BI Desktop Python script
├── push_to_powerbi.py # Push data to Power BI Service datasets
├── DESKTOP.md # Guide for Power BI Desktop (Windows)
└── SERVICE.md # Guide for Power BI Service (web, any OS)
```
103 changes: 103 additions & 0 deletions powerbi_integration/SERVICE.md
@@ -0,0 +1,103 @@
# Power BI Service (Web — any OS)

Two ways to get Memgraph data into Power BI Service without needing Windows, plus a third (ODBC) that does require a Windows Gateway. Make sure you've completed the [Quick Start](README.md#quick-start) first.

---

## Approach 1: Push Dataset (recommended for Linux)

Push data directly from a Python script to a Power BI dataset via the REST API. No Gateway, no Windows machine needed. Works from any OS.

### Azure AD setup (one-time)

1. Go to [Azure Portal](https://portal.azure.com) > **App registrations** > **New registration**
2. Name it (e.g. `memgraph-powerbi-push`)
3. Under **API permissions**, add: **Power BI Service > Dataset.ReadWrite.All** (Application)
4. Grant admin consent
5. Under **Certificates & secrets**, create a new client secret
6. Note down:
- **Tenant ID** (from Overview)
- **Client ID** (from Overview)
- **Client secret** (from Certificates & secrets)
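
These three values feed a standard Azure AD client-credentials token request. A sketch of the request `push_to_powerbi.py` presumably makes; the endpoint and scope below are the standard Azure AD v2.0 values for the Power BI REST API, not taken from the script itself:

```python
def token_request(tenant_id: str, client_id: str, client_secret: str) -> tuple[str, dict]:
    """Build the URL and form body for a client-credentials token request."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    form = {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        # Standard scope for the Power BI REST API under the v2.0 endpoint.
        "scope": "https://analysis.windows.net/powerbi/api/.default",
    }
    return url, form

url, form = token_request("your-tenant-id", "your-client-id", "your-client-secret")
# POST `form` (form-encoded) to `url`; the JSON response carries the access_token.
```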

### Steps

```bash
# 1. Set Azure credentials
export AZURE_TENANT_ID="your-tenant-id"
export AZURE_CLIENT_ID="your-client-id"
export AZURE_CLIENT_SECRET="your-client-secret"

# 2. (Optional) Target a specific workspace
export POWERBI_WORKSPACE_ID="your-workspace-id"

# 3. Push Memgraph data to Power BI
python push_to_powerbi.py
```

The script will:
- Authenticate with Azure AD
- Create a Push Dataset called "Memgraph Data" (or reuse an existing one)
- Query Memgraph for person purchases and company employees
- Push the data to Power BI

4. Open [Power BI Service](https://app.powerbi.com)
5. Find the **Memgraph Data** dataset in your workspace
6. Click **Create report** and build your visualizations
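
Under the hood, each push is a single REST call per table. A sketch of the rows payload the Power BI Push Dataset API expects; the dataset id and table name in the comment are placeholders, and `push_to_powerbi.py` may structure its calls differently:

```python
import json

def rows_payload(records: list[dict]) -> str:
    # Body shape for the Push Dataset rows endpoint:
    #   POST https://api.powerbi.com/v1.0/myorg/datasets/{datasetId}/tables/{tableName}/rows
    return json.dumps({"rows": records})

body = rows_payload([{"person": "Alice", "total_spent": 2400.0}])
```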

### Scheduled refresh

Since this is a push model, schedule the script with cron:

```bash
# Push fresh data every hour
0 * * * * cd /path/to/powerbi_integration && .venv/bin/python push_to_powerbi.py
```

No Power BI Gateway needed.

---

## Approach 2: REST API + Dataflow

Use Power BI Dataflows to pull data from the FastAPI service. The API must be reachable from the internet.

### Steps

1. Start the REST API and make it publicly accessible (see [README.md](README.md#rest-api))
- For testing: use a tunnel like ngrok (`ngrok http 8000`)
- For production: deploy behind a reverse proxy with HTTPS
2. In Power BI Service, go to your workspace
3. Click **New > Dataflow**
4. Choose **Define new tables**
5. Select **Web** as the data source
6. Enter the API URL: `https://your-api-host:8000/person-purchases`
7. Power BI will preview the JSON data — click **Transform data** if needed
8. Save the dataflow
9. Repeat for other endpoints (`/nodes`, `/edges`, `/company-employees`)
10. Set a refresh schedule for the dataflow (e.g. daily)
11. Create a report from the dataflow tables

### Scheduled refresh

Dataflows refresh on their own schedule (configured in Power BI Service). No Gateway needed as long as the API is publicly accessible.

---

## Approach 3: ODBC via Gateway

If you have access to a Windows machine, you can set up a Power BI Gateway with an ODBC driver. See [DESKTOP.md](DESKTOP.md#approach-3-odbc) for driver setup, then configure the Gateway to use that DSN.

This is the least practical option for Linux-only environments.

---

## Comparison

| | Push Dataset | REST API + Dataflow | ODBC + Gateway |
|---|---|---|---|
| Works on Linux | Yes | Yes | No (Gateway is Windows) |
| Gateway needed | No | No (if API is public) | Yes |
| Cost | Free | Free | Paid (driver + Gateway) |
| Refresh model | Push (cron) | Pull (Dataflow schedule) | Pull (Gateway schedule) |
| Setup effort | Medium (Azure AD app) | Medium (public API) | High |
74 changes: 74 additions & 0 deletions powerbi_integration/direct_query.py
@@ -0,0 +1,74 @@
"""
Power BI Python Data Source script.

Usage:
1. In Power BI Desktop, go to Get Data > Python script
2. Paste this script (or reference this file)
3. Power BI will pick up all pandas DataFrames defined in the script

Prerequisites:
- Python environment with neo4j and pandas installed
- Memgraph running on localhost:7687
"""

import pandas as pd
from neo4j import GraphDatabase

MEMGRAPH_URI = "bolt://localhost:7687"

driver = GraphDatabase.driver(MEMGRAPH_URI, auth=("", ""))


def query_to_df(cypher: str) -> pd.DataFrame:
with driver.session() as session:
result = session.run(cypher)
return pd.DataFrame([record.data() for record in result])


# Power BI will detect each DataFrame variable and offer it as a table.

# All nodes
nodes = query_to_df("""
MATCH (n)
RETURN n.id AS id,
labels(n)[0] AS label,
n.name AS name,
n.age AS age,
n.revenue AS revenue,
n.price AS price
""")

# All edges
edges = query_to_df("""
MATCH (a)-[r]->(b)
RETURN a.id AS source_id,
a.name AS source_name,
type(r) AS relationship,
b.id AS target_id,
b.name AS target_name,
r.since AS since,
r.quantity AS quantity
""")

# Person purchases (flattened for charts)
person_purchases = query_to_df("""
MATCH (p:Person)-[b:BOUGHT]->(prod:Product)
RETURN p.name AS person,
p.age AS age,
prod.name AS product,
prod.price AS unit_price,
b.quantity AS quantity,
prod.price * b.quantity AS total_spent
""")

# Company employees
company_employees = query_to_df("""
MATCH (p:Person)-[w:WORKS_AT]->(c:Company)
RETURN c.name AS company,
c.revenue AS company_revenue,
p.name AS employee,
p.age AS employee_age,
w.since AS employed_since
""")

driver.close()
24 changes: 24 additions & 0 deletions powerbi_integration/docker-compose.yml
@@ -0,0 +1,24 @@
services:
memgraph:
image: memgraph/memgraph-mage:3.9.0
container_name: memgraph-powerbi
ports:
- "7687:7687"
- "3000:3000"
command:
[
"--telemetry-enabled=false",
"--log-level=TRACE",
"--also-log-to-stderr=true",
]

api:
build: .
container_name: memgraph-powerbi-api
ports:
- "8000:8000"
environment:
- MEMGRAPH_HOST=memgraph
- MEMGRAPH_PORT=7687
depends_on:
- memgraph