| 1 | +# Benchmark Artifacts for "Logica-TGD: Transforming Graph Databases Logically" |
| 2 | + |
| 3 | +This directory contains reproducible benchmark notebooks for the paper: |
| 4 | + |
| 5 | +> **Logica-TGD: Transforming Graph Databases Logically** |
| 6 | +> Evgeny Skvortsov, Yilin Xia, Bertram Ludäscher, Shawn Bowers |
| 7 | +> *TGDK, 2026* |
| 8 | + |
| 9 | +## Benchmarks |
| 10 | + |
| 11 | +We compare four systems on graph computation problems (transitive closure, |
| 12 | +pairwise distances, same generation): |
| 13 | + |
| 14 | +- **Logica** — compiling to DuckDB SQL |
| 15 | +- **Soufflé** — Datalog engine with parallel evaluation |
| 16 | +- **DuckPGQ** — DuckDB extension implementing SQL/PGQ (Cypher-style queries) |
| 17 | +- **Nemo** — single-threaded Rust rule engine (for the Nemo column only) |
| 18 | + |
| 19 | +All benchmarks were run on a Google Cloud **c2d-standard-32** instance |
| 20 | +(32 vCPUs, 128 GB RAM) using Logica 1.3.1415926535897, DuckDB 1.3.2, |
| 21 | +Soufflé 2.4, and Nemo 0.10.0. |
| 22 | + |
| 23 | +### Main notebooks |
| 24 | + |
| 25 | +| Notebook | Description | |
| 26 | +|----------|-------------| |
| 27 | +| `benchmark_logica.ipynb` | Logica benchmarks (all problems). **Run this first** — it generates input data (CSV files and `graphs.db`) used by the other notebooks. | |
| 28 | +| `benchmark_souffle.ipynb` | Soufflé benchmarks (compiled mode) | |
| 29 | +| `benchmark_cypher.ipynb` | DuckPGQ / Cypher benchmarks | |
| 30 | + |
| 31 | +### Auxiliary materials |
| 32 | + |
| 33 | +| File | Description | |
| 34 | +|------|-------------| |
| 35 | +| `auxiliary/benchmark_souffle_interpreted.ipynb` | Soufflé benchmarks in interpreted mode (used in the original submission) | |
| 36 | +| `auxiliary/benchmark_logica_with_output_sizes.ipynb` | Logica notebook computing output sizes for the table in the paper | |
| 37 | +| `auxiliary/souffle_compiled_vs_interpreted.md` | Comparison of Soufflé compiled vs. interpreted modes | |
| 38 | + |
| 39 | +## Reproducing the results |
| 40 | + |
| 41 | +1. Install Jupyter Notebook: |
| 42 | + ``` |
| 43 | + python3 -m pip install notebook |
| 44 | + ``` |
| 45 | + |
| 46 | +2. Install DuckDB: |
| 47 | + ``` |
| 48 | + python3 -m pip install duckdb |
| 49 | + ``` |
| 50 | + |
| 51 | +3. Install Soufflé (v2.4 was used) by following the instructions at |
| 52 | + [souffle-lang.github.io](https://souffle-lang.github.io/install). |
| 53 | + |
| 54 | +4. Clone this repository: |
| 55 | + ``` |
| 56 | + git clone https://github.com/EvgSkv/logica |
| 57 | + ``` |
| 58 | + |
| 59 | +5. Start the notebook server from the repository root, so that Logica |
| 60 | + is importable: |
| 61 | + ``` |
| 62 | + cd logica |
| 63 | + python3 -m notebook examples/graph/tgdk |
| 64 | + ``` |
| 65 | + Alternatively, install Logica with `python3 -m pip install logica` and start |
| 66 | + the notebook from anywhere. |
| 67 | + |
| 68 | +6. Run the notebooks starting with `benchmark_logica.ipynb` — it |
| 69 | + generates the input data (CSV files and `graphs.db`) used by the |
| 70 | + Soufflé and DuckPGQ notebooks. Then proceed to `benchmark_souffle.ipynb` |
| 71 | + and `benchmark_cypher.ipynb`. |
| 72 | + |
| 73 | +For the Nemo comparison, see the [Nemo section](#nemo-comparison) below. |
| 74 | + |
| 75 | +## Nemo comparison |
| 76 | + |
| 77 | +| File | Description | |
| 78 | +|------|-------------| |
| 79 | +| `benchmark_and_collect.py` | Runs all TC and SG benchmarks on both Logica and Nemo, collects wall/CPU times and fact counts into `benchmark_results.txt` (ASCII table) and `benchmark_results.csv`. Generates the `.l` and `.nemo` programs from templates. | |
| 80 | +| `tc_g1k.l`, `tc_g1k.nemo` | Example Logica and Nemo programs for transitive closure (shown for reference — the script regenerates all sizes). | |
| 81 | +| `sg_tree7.l`, `sg_tree7.nemo` | Example Logica and Nemo programs for same generation. | |
| 82 | +| `benchmark_results.txt` | Output of `benchmark_and_collect.py` from our run. | |
| 83 | + |
| 84 | +To run the Nemo comparison: |
| 85 | + |
| 86 | +1. Install Nemo 0.10.0 (see [nemo rule engine](https://github.com/knowsys/nemo)). |
| 87 | + The Nemo binary must be on `PATH`. The script invokes it as `nemo`; |
| 88 | + adjust the command there if your binary is named `nmo`. |
| 89 | +2. Make sure the CSV inputs (`g1k.csv`..`g5k.csv`, `tree7.csv`..`tree12.csv`) |
| 90 | + are present in the same directory. They are produced by running |
| 91 | + `benchmark_logica.ipynb`. |
| 92 | +3. Run the script from this directory: |
| 93 | + ``` |
| 94 | + python3 benchmark_and_collect.py |
| 95 | + ``` |
| 96 | + It writes `benchmark_results.txt` and `benchmark_results.csv`. |
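
Before launching the Nemo comparison, the two prerequisites from the steps above (a Nemo binary on `PATH` and the CSV inputs in the working directory) can be checked up front. The following is a minimal sketch and not part of the benchmark artifacts. The helper names `missing_inputs` and `find_nemo_binary` are hypothetical, and the intermediate file names (`g2k.csv`, `tree8.csv`, and so on) are assumed from the ranges `g1k.csv`..`g5k.csv` and `tree7.csv`..`tree12.csv`.

```python
import shutil
from pathlib import Path

# Assumption: the ranges g1k.csv..g5k.csv and tree7.csv..tree12.csv
# expand to consecutive integers (g2k.csv, tree8.csv, ...).
EXPECTED_CSVS = [f"g{n}k.csv" for n in range(1, 6)] + [
    f"tree{d}.csv" for d in range(7, 13)
]


def missing_inputs(directory="."):
    """Return the expected CSV file names that are absent from `directory`."""
    base = Path(directory)
    return [name for name in EXPECTED_CSVS if not (base / name).is_file()]


def find_nemo_binary(candidates=("nemo", "nmo")):
    """Return the first candidate command found on PATH, or None."""
    for name in candidates:
        if shutil.which(name):
            return name
    return None


if __name__ == "__main__":
    missing = missing_inputs()
    if missing:
        print("Missing CSV inputs (run benchmark_logica.ipynb first):",
              ", ".join(missing))

    binary = find_nemo_binary()
    if binary is None:
        print("No Nemo binary found on PATH (looked for 'nemo' and 'nmo').")
    elif binary != "nemo":
        print(f"Found '{binary}'; adjust the command in "
              "benchmark_and_collect.py accordingly.")
```

Run it from the directory that contains `benchmark_and_collect.py`; if it prints nothing, both prerequisites appear to be in place.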