| 1 | +# Kademlia DHT Interoperability Tests |
| 2 | + |
| 3 | +Bash-driven interoperability tests for libp2p Kademlia DHT implementations. The suite mirrors the overall design of [`transport/`](../transport/README.md): Docker Compose per test, a shared Redis for coordination, and a generated test matrix from `images.yaml`. |
| 4 | + |
| 5 | +## Overview |
| 6 | + |
| 7 | +Each test runs **three containers** in distinct roles: |
| 8 | + |
| 9 | +| Role | Purpose | |
| 10 | +|------|---------| |
| 11 | +| **bootstrap** | Seeds the DHT; publishes its multiaddr to Redis and stays up for the test | |
| 12 | +| **provider** | Connects to the bootstrap, announces a provider record, and stores a DHT value | |
| 13 | +| **querier** | Connects to the bootstrap, looks up the provider record, and reads the stored value | |
| 14 | + |
| 15 | +The goal is to verify that implementations can interoperate across the full bootstrap → provide → query path, not only within a single language. |
| 16 | + |
| 17 | +**Current implementations** (see [`images.yaml`](images.yaml)): |
| 18 | + |
| 19 | +- **py** — py-libp2p (`kad-dht-py`) |
| 20 | +- **dotnet** — NethermindEth/dotnet-libp2p (`kad-dht-dotnet`) |
| 21 | + |
| 22 | +## Test naming |
| 23 | + |
| 24 | +Test IDs follow: |
| 25 | + |
| 26 | +``` |
| 27 | +{bootstrap}_x_{provider}_x_{querier} |
| 28 | +``` |
| 29 | + |
| 30 | +The `_x_` separator is literal text (not multiplication). Each segment is an implementation id from `images.yaml`. |
| 31 | + |
| 32 | +**Example:** `py_x_py_x_dotnet` |
| 33 | + |
| 34 | +| Segment | Role | Implementation | |
| 35 | +|---------|------|----------------| |
| 36 | +| 1st | bootstrap | `py` | |
| 37 | +| 2nd | provider | `py` | |
| 38 | +| 3rd | querier | `dotnet` | |
| 39 | + |
| 40 | +So this test asks: *can a .NET querier find a record published by a Python provider on a Python-bootstrapped DHT?* |
| 41 | + |
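As an illustration of the naming scheme, a test id can be split back into its three roles with plain bash parameter expansion. This is a minimal sketch, not code from the suite: `parse_test_id` and the variables it sets are hypothetical names, and it assumes no implementation id itself contains the literal text `_x_`.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of the suite): split a test id of the form
# {bootstrap}_x_{provider}_x_{querier} into three shell variables.
# Assumes implementation ids never contain the separator "_x_".
parse_test_id() {
  local id="$1"
  local rest
  bootstrap="${id%%_x_*}"    # text before the first "_x_"
  rest="${id#*_x_}"          # text after the first "_x_"
  provider="${rest%%_x_*}"   # middle segment
  querier="${rest#*_x_}"     # last segment
}

parse_test_id "py_x_py_x_dotnet"
echo "bootstrap=$bootstrap provider=$provider querier=$querier"
# prints: bootstrap=py provider=py querier=dotnet
```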
| 42 | +## Test matrix size |
| 43 | + |
| 44 | +`lib/generate-tests.sh` builds the **full Cartesian product** of bootstrap × provider × querier over the implementations in `images.yaml`, so every ordered triple (repeats included) appears exactly once: |
| 45 | + |
| 46 | +``` |
| 47 | +number of tests = N³ |
| 48 | +``` |
| 49 | + |
| 50 | +where `N` is the number of implementations. |
| 51 | + |
| 52 | +| Implementations | Tests | |
| 53 | +|-----------------|-------| |
| 54 | +| 2 (`py`, `dotnet`) | 8 | |
| 55 | +| 3 (e.g. + `go`) | 27 | |
| 56 | +| 4 (e.g. + `rust`) | 64 | |
| 57 | +| 5 (e.g. + `js`) | 125 | |
| 58 | + |
| 59 | +Adding a new implementation does not change the naming scheme — only the ids used in each position. Every ordered triple is a distinct interoperability scenario because failures can be role-specific (e.g. .NET as bootstrap with Python as provider may behave differently from the reverse). |
| 60 | + |
| 61 | +For large `N`, run the full matrix in nightly or manual CI jobs and use the options under [Filtering](#filtering) on pull requests. |
| 62 | + |
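The `N³` growth follows directly from enumerating every ordered triple. The sketch below shows that enumeration as three nested loops. It is an illustration only and does not reproduce `lib/generate-tests.sh`: `generate_ids` is a hypothetical function, and the `impls` array is hard-coded here rather than read from `images.yaml`.

```bash
#!/usr/bin/env bash
# Hypothetical illustration: enumerate every ordered
# bootstrap/provider/querier triple for the ids passed as arguments.
generate_ids() {
  local b p q
  for b in "$@"; do
    for p in "$@"; do
      for q in "$@"; do
        echo "${b}_x_${p}_x_${q}"
      done
    done
  done
}

# Assumed to mirror the ids currently listed in images.yaml.
impls=(py dotnet)
generate_ids "${impls[@]}"   # prints 2^3 = 8 test ids, one per line
```

Calling `generate_ids` with three ids yields 27 lines, matching the table above.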
| 63 | +## What each test does |
| 64 | + |
| 65 | +1. **Generate matrix** — `lib/generate-tests.sh` writes `test-matrix.yaml` |
| 66 | +2. **Build images** — Docker images for implementations required by the selected tests |
| 67 | +3. **Start Redis** — global `transport-redis` on `transport-network` |
| 68 | +4. **Per test** — `lib/run-single-test.sh` creates an isolated Compose stack: |
| 69 | + - `ROLE=bootstrap|provider|querier` |
| 70 | + - `TEST_KEY` — short hash of the test name; namespaces Redis keys per test |
| 71 | +5. **Bootstrap** — starts libp2p, writes `{TEST_KEY}_bootstrap_addr` to Redis |
| 72 | +6. **Provider** — waits for bootstrap addr, connects, runs DHT provide/put, sets `{TEST_KEY}_provider_done` |
| 73 | +7. **Querier** — waits for bootstrap addr and provider done, connects, runs DHT find/get, prints result to stdout |
| 74 | +8. **Pass/fail** — harness treats a test as failed if the querier exits non-zero **or** prints `status: fail` in its logs |
| 75 | + |
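The pass/fail rule in step 8 can be expressed as a small predicate. This is a sketch under stated assumptions, not the harness code: `querier_passed` is a hypothetical name, and it assumes the querier's exit code and a log file containing its output are already available.

```bash
#!/usr/bin/env bash
# Hypothetical helper: apply the step 8 rule to a querier exit code and log.
# Returns 0 (pass) only if the exit code is zero AND the log does not
# contain the literal text "status: fail".
querier_passed() {
  local exit_code="$1"
  local log_file="$2"
  # A non-zero exit is a failure regardless of what was printed.
  [ "$exit_code" -eq 0 ] || return 1
  # A zero exit still fails if the querier reported failure in its output.
  if grep -q 'status: fail' "$log_file"; then
    return 1
  fi
  return 0
}
```

Checking both signals guards against a node that reports `status: fail` but still exits with code 0.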
| 76 | +### DHT operations exercised |
| 77 | + |
| 78 | +Both implementations currently run: |
| 79 | + |
| 80 | +- **Test 1** — provider announces key `interop-test-key` (`Provide` / `find_providers`) |
| 81 | +- **Test 3/4** — provider stores and querier reads `/example/data` (`put_value` / `get_value`) |
| 82 | + |
| 83 | +## How to run |
| 84 | + |
| 85 | +### Prerequisites |
| 86 | + |
| 87 | +```bash |
| 88 | +./run.sh --check-deps |
| 89 | +``` |
| 90 | + |
| 91 | +Required: bash 4.0+, docker 20.10+, docker compose, yq 4.0+, git. |
| 92 | + |
| 93 | +### Basic usage |
| 94 | + |
| 95 | +```bash |
| 96 | +# Full matrix (8 tests with py + dotnet) |
| 97 | +./run.sh |
| 98 | + |
| 99 | +# Help and discovery |
| 100 | +./run.sh --help |
| 101 | +./run.sh --list-images |
| 102 | +./run.sh --list-tests |
| 103 | + |
| 104 | +# Single test |
| 105 | +./run.sh --test-select "py_x_py_x_dotnet" |
| 106 | + |
| 107 | +# Rebuild images (e.g. after changing node source) |
| 108 | +./run.sh --force-image-rebuild |
| 109 | +``` |
| 110 | + |
| 111 | +Docker images are cached between runs. Vendored sources (e.g. `dotnet-libp2p`) are cloned into `kad-dht/.cache/git-repos/` and copied into the build context only when the pinned commit changes. Use `--force-image-rebuild` to force a Docker rebuild. |
| 112 | + |
| 113 | +### Filtering |
| 114 | + |
| 115 | +| Option | Description | |
| 116 | +|--------|-------------| |
| 117 | +| `--test-select` | Run only tests whose id matches any pattern in a pipe-separated list | |
| 118 | +| `--test-ignore` | Skip tests matching a pattern | |
| 119 | +| `--impl-select` | Limit which implementations are built | |
| 120 | +| `--impl-ignore` | Exclude implementations from the build set | |
| 121 | + |
| 122 | +Examples: |
| 123 | + |
| 124 | +```bash |
| 125 | +./run.sh --test-select "py_x_*" # bootstrap is always py |
| 126 | +./run.sh --test-ignore "*_x_dotnet_x_*" # skip tests with a dotnet provider (middle segment) |
| 127 | +./run.sh --impl-select "py" --test-select "py_x_py_x_py" |
| 128 | +``` |
| 129 | + |
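To reason about which ids a filter will hit, it can help to model the matching locally. The sketch below assumes each pipe-separated alternative is a shell glob matched against the whole test id. That is an assumption about `run.sh`, not confirmed behavior, and `matches_any` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical model of filter matching (ASSUMPTION: each alternative is a
# shell glob that must match the entire test id).
matches_any() {
  local id="$1"
  local patterns="$2"
  local pats pat
  # Split on "|" with read so the patterns are not expanded against files.
  IFS='|' read -r -a pats <<< "$patterns"
  for pat in "${pats[@]}"; do
    # Unquoted right-hand side of == inside [[ ]] is treated as a glob.
    if [[ "$id" == $pat ]]; then
      return 0
    fi
  done
  return 1
}

matches_any "py_x_dotnet_x_py" "*_x_dotnet_x_*" && echo "matched: dotnet is the provider"
matches_any "py_x_py_x_dotnet" "*_x_dotnet_x_*" || echo "not matched: dotnet is only the querier"
```

Under this model, `*_x_dotnet_x_*` selects the middle (provider) segment, while a pattern ending in `_x_dotnet` would select the querier.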
| 130 | +## Results |
| 131 | + |
| 132 | +Each run writes a timestamped directory: |
| 133 | + |
| 134 | +``` |
| 135 | +kad-dht/results/<HHMMSS>-<DD>-<MM>-<YYYY>/ |
| 136 | +├── results.yaml # summary + per-test status |
| 137 | +├── test-matrix.yaml # generated matrix |
| 138 | +├── logs/<test-id>.log # full Docker output per test |
| 139 | +├── docker-compose/ # generated compose files |
| 140 | +└── results/<test-id>.yaml |
| 141 | +``` |
| 142 | + |
| 143 | +`run.sh` prints the results path when finished. |
| 144 | + |
| 145 | +## CI |
| 146 | + |
| 147 | +Pull requests that touch `kad-dht/**` run [`.github/workflows/kad-dht-interop-pr.yml`](../.github/workflows/kad-dht-interop-pr.yml) on self-hosted runners. The composite action [`.github/actions/run-bash-kad-dht-test`](../.github/actions/run-bash-kad-dht-test/action.yml) executes `./run.sh` and uploads the results directory as an artifact. |
| 148 | + |
| 149 | +## Adding a new implementation |
| 150 | + |
| 151 | +1. Add a node under `images/<id>/` that implements all three roles, selected at runtime by the `ROLE` environment variable |
| 152 | +2. Add an entry to [`images.yaml`](images.yaml) with `id`, `imageName`, and `buildContext` |
| 153 | +3. For vendored upstream repos, use the `source` block (see `dotnet` for `repo`, `commit`, `patchPath`, `patchFile`, `vendorDir`) |
| 154 | +4. Re-run `./run.sh --list-tests` — the matrix grows to `N³` automatically |
| 155 | + |
| 156 | +Each new node must: |
| 157 | + |
| 158 | +- Publish bootstrap multiaddr to Redis key `{TEST_KEY}_bootstrap_addr` |
| 159 | +- Set `{TEST_KEY}_provider_done` after successful provide/put |
| 160 | +- Print `status: pass` or `status: fail` (and optional `error:` lines) on stdout for the querier role |
| 161 | + |
| 162 | +## Directory layout |
| 163 | + |
| 164 | +``` |
| 165 | +kad-dht/ |
| 166 | +├── README.md # this file |
| 167 | +├── run.sh # entry point |
| 168 | +├── images.yaml # implementation definitions |
| 169 | +├── images/ |
| 170 | +│ ├── py/ # py-libp2p node |
| 171 | +│ └── dotnet/ # dotnet-libp2p node (+ interop-fix.patch) |
| 172 | +├── lib/ |
| 173 | +│ ├── generate-tests.sh |
| 174 | +│ ├── run-single-test.sh |
| 175 | +│ └── build-images.sh |
| 176 | +└── results/ # test output (gitignored) |
| 177 | +``` |
| 178 | + |
| 179 | +## Relation to transport tests |
| 180 | + |
| 181 | +| | **transport/** | **kad-dht/** | |
| 182 | +|--|----------------|--------------| |
| 183 | +| Roles | dialer × listener | bootstrap × provider × querier | |
| 184 | +| Matrix axes | impl × transport × secure × muxer | impl³ (ordered role triples) | |
| 185 | +| Coordination | Redis multiaddr handoff | Redis bootstrap addr + provider done | |
| 186 | +| Success signal | dialer ping/pong | querier DHT lookup + `status: pass` | |
| 187 | + |
| 188 | +Transport verifies connection establishment; kad-dht verifies DHT record propagation across implementations. |