docs(integration): add dedicated bulk-loader page and expand Data import section #442
Merged
+135
−1
7 commits
72272a9 docs(integration): add dedicated bulk-loader page and expand Data imp… (gkorland)
5ccb9fd docs(wordlist): add erroring and namespaces for bulk-loader page (gkorland)
9e64a26 Potential fix for pull request finding (gkorland)
1e7d128 fix(bulk-loader): correct --server-url flag name (was --redis-url) (gkorland)
dcaa373 docs(integration): address bulk-loader PR review feedback (gkorland)
749056d docs(spellcheck): add 'eval' to wordlist for cypher/known-limitations.md (gkorland)
e035798 Merge branch 'main' into docs/bulk-loader-page (gkorland)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -828,3 +828,7 @@ ObjectPool | |
| QuickJS | ||
| ACLs | ||
| filesystem | ||
| erroring | ||
| namespaces | ||
| tracebacks | ||
| eval | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,122 @@ | ||
| --- | ||
| title: "Bulk Loader" | ||
| description: "Import large graphs from CSV files into FalkorDB using the falkordb-bulk-loader tool" | ||
| parent: "Integration" | ||
| nav_order: 8 | ||
| --- | ||
|
|
||
| # Bulk Loader | ||
|
|
||
| The [falkordb-bulk-loader](https://github.com/falkordb/falkordb-bulk-loader) is a Python utility for building FalkorDB graphs from CSV files. It uses the `GRAPH.BULK` endpoint to import nodes and relationships efficiently in binary batches — much faster than issuing individual `CREATE` queries. | ||
|
|
||
| ## Requirements | ||
|
|
||
| - Python 3.10 or later | ||
| - A running FalkorDB instance (see [Get Started](/getting-started)) | ||
|
|
||
| ## Installation | ||
|
|
||
| ```sh | ||
| pip install falkordb-bulk-loader | ||
| ``` | ||
|
|
||
| ## Quick Start | ||
|
|
||
| Given two CSV files — `Person.csv` (nodes) and `KNOWS.csv` (relationships) — import them into a graph named `SocialGraph`: | ||
|
|
||
| ```sh | ||
| falkordb-bulk-insert SocialGraph \ | ||
| -n Person.csv \ | ||
| -r KNOWS.csv | ||
| ``` | ||
|
|
||
| The label (for nodes) and relationship type (for relationships) are derived from the CSV filename. Multiple node and relationship files can be supplied by repeating the flags: | ||
|
|
||
| ```sh | ||
| falkordb-bulk-insert SocialGraph \ | ||
| -n Person.csv \ | ||
| -n Country.csv \ | ||
| -r KNOWS.csv \ | ||
| -r VISITED.csv | ||
| ``` | ||
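
For scripted imports, the repeated-flag form above can be assembled programmatically. The following is a minimal sketch, not part of the loader: `label_from_filename` and `build_insert_command` are hypothetical helpers, and the rule that the label is the filename without its extension is an assumption inferred from the sentence above rather than taken from the loader's source. Only the `-n`, `-r`, and `--server-url` flags documented on this page are used.

```python
from pathlib import Path
import shlex


def label_from_filename(csv_path: str) -> str:
    """Assumed rule: the label or type is the filename without its extension."""
    return Path(csv_path).stem


def build_insert_command(graph, node_files, relation_files, server_url=None):
    """Build the argv list for falkordb-bulk-insert (hypothetical helper)."""
    argv = ["falkordb-bulk-insert", graph]
    if server_url is not None:
        argv += ["--server-url", server_url]
    # One -n flag per node file, one -r flag per relationship file.
    for path in node_files:
        argv += ["-n", path]
    for path in relation_files:
        argv += ["-r", path]
    return argv


if __name__ == "__main__":
    cmd = build_insert_command(
        "SocialGraph",
        ["Person.csv", "Country.csv"],
        ["KNOWS.csv", "VISITED.csv"],
    )
    # Print a shell-safe command line; pass `cmd` to subprocess.run to execute it.
    print(shlex.join(cmd))
```

Returning a list rather than a string keeps filenames with spaces intact if the command is later passed to `subprocess.run`.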
|
|
||
| ## Connecting to FalkorDB | ||
|
|
||
| By default the loader connects to `redis://127.0.0.1:6379`. Use `--server-url` to point it at a different instance: | ||
|
|
||
| ```sh | ||
| falkordb-bulk-insert SocialGraph \ | ||
| --server-url redis://myhost:6379 \ | ||
| -n Person.csv | ||
| ``` | ||
|
|
||
| ## Key Options | ||
|
|
||
| | Flag | Extended flag | Description | | ||
| |:----:|---------------|-------------| | ||
| | `-u` | `--server-url TEXT` | Server URL (default: `redis://127.0.0.1:6379`) | | ||
| | `-n` | `--nodes TEXT` | Node CSV file (filename → label) | | ||
| | `-N` | `--nodes-with-label TEXT` | Explicit label followed by node CSV file | | ||
| | `-r` | `--relations TEXT` | Relationship CSV file (filename → type) | | ||
| | `-R` | `--relations-with-type TEXT` | Explicit type followed by relationship CSV file | | ||
| | `-o` | `--separator CHAR` | Field delimiter (default: `,`) | | ||
| | `-d` | `--enforce-schema` | Require typed column headers (see below) | | ||
| | `-j` | `--id-type TEXT` | Type of node ID property: `STRING` or `INTEGER` | | ||
| | `-s` | `--skip-invalid-nodes` | Skip duplicate node IDs instead of erroring | | ||
| | `-e` | `--skip-invalid-edges` | Skip edges with unknown endpoints instead of erroring | | ||
| | `-i` | `--index Label:Property` | Create a range index after import | | ||
| | `-f` | `--full-text-index Label:Property` | Create a full-text index after import | | ||
|
|
||
| ## Enforcing a Schema | ||
|
|
||
| By default, the loader infers each property's type. Use `--enforce-schema` (`-d`) for explicit control; with this flag, every column header must follow the `name:TYPE` format (the name is empty for ID columns, as in `:ID(User)`): | ||
|
|
||
| **User.csv** | ||
| ```csv | ||
| :ID(User),name:STRING,rank:INT | ||
| 0,"Alice",5 | ||
| 1,"Bob",8 | ||
| ``` | ||
|
|
||
| **FOLLOWS.csv** | ||
| ```csv | ||
| :START_ID(User),:END_ID(User),weight:DOUBLE | ||
| 0,1,0.9 | ||
| 1,0,0.4 | ||
| ``` | ||
|
|
||
| ```sh | ||
| falkordb-bulk-insert SocialGraph \ | ||
| --enforce-schema \ | ||
| -n User.csv \ | ||
| -r FOLLOWS.csv | ||
| ``` | ||
|
|
||
| Accepted type strings: `ID`, `START_ID`, `END_ID`, `IGNORE`, `STRING`, `INT` / `INTEGER` / `LONG`, `DOUBLE` / `FLOAT`, `BOOL` / `BOOLEAN`, `ARRAY`. | ||
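
Checking headers before a long import can catch a mistyped type string early. The sketch below is an illustration only: `header_type` and `check_header` are hypothetical helpers, and the parsing rules (type after the last colon, optional `(Namespace)` suffix, case-sensitive match) are assumptions based on the examples above, not the loader's actual parser.

```python
import csv
import io

# Type strings listed above; how the loader parses them is an assumption here.
ACCEPTED_TYPES = {
    "ID", "START_ID", "END_ID", "IGNORE", "STRING",
    "INT", "INTEGER", "LONG", "DOUBLE", "FLOAT",
    "BOOL", "BOOLEAN", "ARRAY",
}


def header_type(column: str) -> str:
    """Extract TYPE from 'name:TYPE', ':ID(User)', or ':START_ID(User)'."""
    if ":" not in column:
        raise ValueError(f"missing type in header {column!r}")
    type_part = column.rsplit(":", 1)[1]
    # Drop an optional namespace such as '(User)'.
    return type_part.split("(", 1)[0].strip()


def check_header(csv_text: str) -> list[str]:
    """Return the type of each column, or raise if any type is not accepted."""
    header = next(csv.reader(io.StringIO(csv_text)))
    types = [header_type(col) for col in header]
    unknown = [t for t in types if t not in ACCEPTED_TYPES]
    if unknown:
        raise ValueError(f"unsupported type strings: {unknown}")
    return types


if __name__ == "__main__":
    print(check_header(':ID(User),name:STRING,rank:INT\n0,"Alice",5\n'))
    print(check_header(":START_ID(User),:END_ID(User),weight:DOUBLE\n0,1,0.9\n"))
```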
|
|
||
| ## Bulk Updates | ||
|
|
||
| The companion command `falkordb-bulk-update` reads a CSV in batches and issues a parameterized Cypher query for each row — useful for incremental updates or when you want full control over the Cypher: | ||
|
|
||
| ```sh | ||
| falkordb-bulk-update SocialGraph \ | ||
| --csv User.csv \ | ||
| --query "MERGE (:User {id: row[0], name: row[1], rank: row[2]})" | ||
| ``` | ||
|
|
||
| > **Note:** `falkordb-bulk-update` commits changes incrementally, so a failure partway through the file leaves earlier rows applied. Sanitize your CSV inputs beforehand to avoid leaving the graph in a partially updated state. | ||
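
One way to act on the note above is to scan the CSV for malformed rows before running the update. This is a minimal sketch under stated assumptions: `find_invalid_rows` is a hypothetical helper, the expected field count and integer columns are supplied by the caller, and whether the first line is a header is left as a parameter because this page does not say how `falkordb-bulk-update` treats headers.

```python
import csv
import io


def find_invalid_rows(csv_text, expected_fields, int_columns=(), has_header=False):
    """Return (line_number, reason) pairs for rows that would likely fail."""
    problems = []
    reader = csv.reader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=1):
        if has_header and row_number == 1:
            continue
        if len(row) != expected_fields:
            problems.append(
                (row_number, f"expected {expected_fields} fields, got {len(row)}")
            )
            continue
        for col in int_columns:
            try:
                int(row[col])
            except ValueError:
                problems.append(
                    (row_number, f"column {col} is not an integer: {row[col]!r}")
                )
    return problems


if __name__ == "__main__":
    sample = '0,"Alice",5\n1,"Bob",eight\n2,"Carol"\n'
    # The query above reads row[0], row[1], row[2]; rank (column 2) should be an integer.
    for line, reason in find_invalid_rows(sample, expected_fields=3, int_columns=(2,)):
        print(f"line {line}: {reason}")
```

Running the update only when the returned list is empty avoids starting a load that is known to fail partway through.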
|
|
||
| ## Diagnostics | ||
|
|
||
| Both `falkordb-bulk-insert` and `falkordb-bulk-update` install a `SIGUSR1` handler at startup. Sending `SIGUSR1` to a running loader process writes the tracebacks of all Python threads to `stderr`, which is useful for diagnosing hangs or unexpectedly slow loads without attaching a debugger: | ||
|
|
||
| ```sh | ||
| kill -SIGUSR1 <pid> | ||
| ``` | ||
|
|
||
| This relies on Python's `faulthandler` module and is only available on platforms that support `SIGUSR1` (i.e., not Windows). On unsupported platforms, registration is silently skipped. | ||
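
The mechanism described above can be reproduced with the standard library alone. The sketch below shows the general `faulthandler` pattern, not the loader's actual code: `install_traceback_dump` is a hypothetical name, and the `hasattr` guard is one assumed way to skip registration on platforms without `SIGUSR1`.

```python
import faulthandler
import signal
import sys


def install_traceback_dump(stream=None) -> bool:
    """Dump all thread tracebacks to `stream` on SIGUSR1, if supported.

    Returns True when the handler was registered, False when the platform
    has no SIGUSR1 (for example Windows) and registration was skipped.
    """
    if not hasattr(signal, "SIGUSR1") or not hasattr(faulthandler, "register"):
        return False
    target = stream if stream is not None else sys.stderr
    # all_threads=True writes a traceback for every Python thread.
    faulthandler.register(signal.SIGUSR1, file=target, all_threads=True)
    return True


if __name__ == "__main__":
    import os
    import time

    if install_traceback_dump():
        print(f"send SIGUSR1 to pid {os.getpid()} to dump tracebacks")
        time.sleep(60)  # stand-in for a long-running load
```

Because `faulthandler` writes from a low-level signal handler, the dump still appears when the Python code is blocked in a long call, which is the situation this feature targets.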
|
|
||
| ## Further Reading | ||
|
|
||
| - [GitHub repository](https://github.com/falkordb/falkordb-bulk-loader) — full CLI reference, input constraints, and ID namespaces | ||
| - [GRAPH.BULK specification](/design/bulk-spec) — technical wire-format specification for the underlying endpoint | ||