Merged
4 changes: 4 additions & 0 deletions .wordlist.txt
Original file line number Diff line number Diff line change
@@ -828,3 +828,7 @@ ObjectPool
QuickJS
ACLs
filesystem
erroring
namespaces
tracebacks
eval
9 changes: 8 additions & 1 deletion index.md
@@ -258,7 +258,14 @@ The full list and links can be found on the [Client Libraries](/getting-started/

## Data import

When loading large graphs from CSV files, we recommend using [falkordb-bulk-loader](https://github.com/falkordb/falkordb-bulk-loader)
When loading large graphs from CSV files, use the [falkordb-bulk-loader](https://github.com/falkordb/falkordb-bulk-loader):

```sh
pip install falkordb-bulk-loader
falkordb-bulk-insert GRAPHNAME -n nodes.csv -r edges.csv
```

See the [Bulk Loader documentation](/integration/bulk-loader) for the full reference.

## GitHub Discussions

122 changes: 122 additions & 0 deletions integration/bulk-loader.md
@@ -0,0 +1,122 @@
---
title: "Bulk Loader"
description: "Import large graphs from CSV files into FalkorDB using the falkordb-bulk-loader tool"
parent: "Integration"
nav_order: 8
---

# Bulk Loader

The [falkordb-bulk-loader](https://github.com/falkordb/falkordb-bulk-loader) is a Python utility for building FalkorDB graphs from CSV files. It uses the `GRAPH.BULK` endpoint to import nodes and relationships in binary batches, which is much faster than issuing individual `CREATE` queries.

## Requirements

- Python 3.10 or later
- A running FalkorDB instance (see [Get Started](/getting-started))

## Installation

```sh
pip install falkordb-bulk-loader
```

## Quick Start

Given two CSV files — `Person.csv` (nodes) and `KNOWS.csv` (relationships) — import them into a graph named `SocialGraph`:

```sh
falkordb-bulk-insert SocialGraph \
-n Person.csv \
-r KNOWS.csv
```

The label (for nodes) and relationship type (for relationships) are derived from the CSV filename. To load several node or relationship files, repeat the corresponding flag:

```sh
falkordb-bulk-insert SocialGraph \
-n Person.csv \
-n Country.csv \
-r KNOWS.csv \
-r VISITED.csv
```
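
The filename-to-label rule can be illustrated with a short Python sketch. It assumes the label or type is the file's base name with the extension removed, which is consistent with the examples above but is not confirmed here as the loader's exact logic. The helper names `label_from_path` and `build_insert_command` are hypothetical and are not part of the loader.

```python
from pathlib import Path


def label_from_path(csv_path: str) -> str:
    """Return the base name without its extension, e.g. 'data/Person.csv' -> 'Person'.

    Assumption: this mirrors how the loader derives labels and relationship types.
    """
    return Path(csv_path).stem


def build_insert_command(graph: str, nodes: list[str], relations: list[str]) -> list[str]:
    """Build an argv list for falkordb-bulk-insert, repeating -n / -r once per file."""
    cmd = ["falkordb-bulk-insert", graph]
    for path in nodes:
        cmd += ["-n", path]
    for path in relations:
        cmd += ["-r", path]
    return cmd


cmd = build_insert_command(
    "SocialGraph",
    nodes=["Person.csv", "Country.csv"],
    relations=["KNOWS.csv", "VISITED.csv"],
)
print(" ".join(cmd))
print([label_from_path(p) for p in ("Person.csv", "Country.csv")])
```

With the loader installed, the resulting list could be passed to `subprocess.run(cmd, check=True)` to run the import from a script.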

## Connecting to FalkorDB

By default the loader connects to `redis://127.0.0.1:6379`. Use `--server-url` to point it at a different instance:

```sh
falkordb-bulk-insert SocialGraph \
--server-url redis://myhost:6379 \
-n Person.csv
```

## Key Options

| Flag | Extended flag | Description |
|:----:|---------------|-------------|
| `-u` | `--server-url TEXT` | Server URL (default: `redis://127.0.0.1:6379`) |
| `-n` | `--nodes TEXT` | Node CSV file (filename → label) |
| `-N` | `--nodes-with-label TEXT` | Explicit label followed by node CSV file |
| `-r` | `--relations TEXT` | Relationship CSV file (filename → type) |
| `-R` | `--relations-with-type TEXT` | Explicit type followed by relationship CSV file |
| `-o` | `--separator CHAR` | Field delimiter (default: `,`) |
| `-d` | `--enforce-schema` | Require typed column headers (see below) |
| `-j` | `--id-type TEXT` | Type of node ID property: `STRING` or `INTEGER` |
| `-s` | `--skip-invalid-nodes` | Skip duplicate node IDs instead of erroring |
| `-e` | `--skip-invalid-edges` | Skip edges with unknown endpoints instead of erroring |
| `-i` | `--index Label:Property` | Create a range index after import |
| `-f` | `--full-text-index Label:Property` | Create a full-text index after import |

## Enforcing a Schema

By default the loader infers each property's type. Use `--enforce-schema` (`-d`) when you want explicit control. Column headers must follow the `name:TYPE` format:

**User.csv**
```csv
:ID(User),name:STRING,rank:INT
0,"Alice",5
1,"Bob",8
```

**FOLLOWS.csv**
```csv
:START_ID(User),:END_ID(User),weight:DOUBLE
0,1,0.9
1,0,0.4
```

```sh
falkordb-bulk-insert SocialGraph \
--enforce-schema \
-n User.csv \
-r FOLLOWS.csv
```

Accepted type strings: `ID`, `START_ID`, `END_ID`, `IGNORE`, `STRING`, `INT` / `INTEGER` / `LONG`, `DOUBLE` / `FLOAT`, `BOOL` / `BOOLEAN`, `ARRAY`.
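
Before running an import with `--enforce-schema`, it can help to check the headers locally. The following is a minimal sketch of such a check, not the loader's own validation: the function names are hypothetical, the type list is copied from the sentence above, and the matching is assumed to be case-sensitive.

```python
import csv
import io

# Type strings copied from the list of accepted types above.
ACCEPTED_TYPES = {
    "ID", "START_ID", "END_ID", "IGNORE", "STRING",
    "INT", "INTEGER", "LONG", "DOUBLE", "FLOAT",
    "BOOL", "BOOLEAN", "ARRAY",
}


def parse_header_field(field: str) -> tuple[str, str]:
    """Split 'name:TYPE' or ':ID(Namespace)' into (name, TYPE)."""
    name, sep, type_part = field.partition(":")
    if not sep:
        raise ValueError(f"missing ':TYPE' in header field {field!r}")
    # Drop an optional '(Namespace)' suffix such as the one in ':ID(User)'.
    type_name = type_part.split("(", 1)[0].strip()
    if type_name not in ACCEPTED_TYPES:
        raise ValueError(f"unknown type {type_name!r} in header field {field!r}")
    return name.strip(), type_name


def check_header(csv_text: str) -> list[tuple[str, str]]:
    """Parse the first CSV row and validate every column header."""
    header = next(csv.reader(io.StringIO(csv_text)))
    return [parse_header_field(field) for field in header]


user_csv = ':ID(User),name:STRING,rank:INT\n0,"Alice",5\n1,"Bob",8\n'
print(check_header(user_csv))  # [('', 'ID'), ('name', 'STRING'), ('rank', 'INT')]
```

A header such as `name:TEXT` raises `ValueError` in this sketch, so a typo in a type string is caught before any data is sent to the server.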

## Bulk Updates

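The companion command `falkordb-bulk-update` reads a CSV in batches and issues a parameterized Cypher query for each row. It is useful for incremental updates or when you want full control over the Cypher. In the query, `row[0]` refers to the first column of the current CSV row, `row[1]` to the second, and so on: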

```sh
falkordb-bulk-update SocialGraph \
--csv User.csv \
--query "MERGE (:User {id: row[0], name: row[1], rank: row[2]})"
```

> **Note:** `falkordb-bulk-update` commits changes incrementally, so an error partway through a run leaves the earlier changes applied. Sanitize your CSV inputs beforehand to avoid leaving the graph in a partially-updated state.
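
One simple form of sanitizing is to confirm that every row has the number of columns the query expects. The sketch below is a hypothetical pre-flight check written with the standard `csv` module. It is not part of the loader, and real inputs may need further checks (types, empty values, duplicates).

```python
import csv
import io


def find_bad_rows(csv_text: str, expected_columns: int) -> list[int]:
    """Return the 1-based numbers of rows whose column count differs from expected_columns."""
    bad = []
    for row_number, row in enumerate(csv.reader(io.StringIO(csv_text)), start=1):
        if len(row) != expected_columns:
            bad.append(row_number)
    return bad


# The second row is missing its third column.
sample = "0,Alice,5\n1,Bob\n2,Carol,7\n"
print(find_bad_rows(sample, expected_columns=3))  # prints [2]
```

Running the update only when the returned list is empty avoids starting a run that would fail halfway through on a malformed row.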

## Diagnostics

Both `falkordb-bulk-insert` and `falkordb-bulk-update` install a `SIGUSR1` handler at startup. Sending `SIGUSR1` to a running loader process writes the tracebacks of all Python threads to `stderr`, which is useful for diagnosing hangs or unexpectedly slow loads without attaching a debugger:

```sh
kill -SIGUSR1 <pid>
```

This relies on Python's `faulthandler` module and is only available on platforms that support `SIGUSR1` (i.e., not Windows). On unsupported platforms, registration is silently skipped.
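
The same mechanism can be reproduced in any Python script with the standard `faulthandler` module. The following is a minimal sketch of the registration pattern described above, not the loader's actual source. The function name `install_traceback_dump` is hypothetical.

```python
import faulthandler
import signal
import sys


def install_traceback_dump(file=sys.stderr) -> bool:
    """Register a SIGUSR1 handler that dumps all thread tracebacks to `file`.

    Returns False, without raising, when the platform has no SIGUSR1
    (for example Windows), so callers can skip registration silently.
    `file` must stay open and must expose a real file descriptor.
    """
    if not hasattr(signal, "SIGUSR1") or not hasattr(faulthandler, "register"):
        return False
    faulthandler.register(signal.SIGUSR1, file=file, all_threads=True)
    return True
```

Calling `install_traceback_dump()` once at startup is enough. After that, `kill -SIGUSR1 <pid>` prints the tracebacks without stopping the process.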

## Further Reading

- [GitHub repository](https://github.com/falkordb/falkordb-bulk-loader) — full CLI reference, input constraints, and ID namespaces
- [GRAPH.BULK specification](/design/bulk-spec) — technical wire-format specification for the underlying endpoint
1 change: 1 addition & 0 deletions integration/index.md
@@ -21,5 +21,6 @@ Learn how to leverage FalkorDB's flexible APIs and SDKs to build high-performanc
- [Spring Data FalkorDB](./spring-data-falkordb.md): Learn how to use FalkorDB with Spring Data for JPA-style object-graph mapping.
- [Snowflake Integration](./snowflake.md): Learn how to run graph database operations directly within your Snowflake environment using the FalkorDB Native App.
- [PyTorch Geometric](./pyg.md): Train Graph Neural Networks directly on graphs stored in FalkorDB using PyG's remote backend interface.
- [Bulk Loader](./bulk-loader.md): Import large graphs from CSV files using the falkordb-bulk-loader Python utility.

