|
| 1 | +# Feature Composition Testing Plan: PR #846 + #852 |
| 2 | + |
| 3 | +## Status Summary |
| 4 | + |
| 5 | +| Item | Status | Notes | |
| 6 | +|------|--------|-------| |
| 7 | +| P0/P1 Tests for #846 | ✅ DONE | 8 tests added; 6 xfail (bugs found), 2 passing | |
| 8 | +| Multi-hop bugs filed | ✅ DONE | Issue #872 created | |
| 9 | +| Alloy README update | ✅ DONE | Scope/limitations documented | |
| 10 | +| Meta-issue roadmap | ✅ DONE | Issue #871 created | |
| 11 | + |
| 12 | +## Issues Created |
| 13 | + |
| 14 | +- **#871**: Meta: GFQL Testing & Verification Roadmap |
| 15 | +- **#872**: Fix multi-hop + WHERE backward prune bugs in cuDF executor |
| 16 | + |
| 17 | +## Branch Structure |
| 18 | + |
| 19 | +``` |
| 20 | +master (includes PR #851 hop ranges - MERGED) |
| 21 | + └── PR #846: feat/issue-837-cudf-hop-executor (same-path executor) |
| 22 | + └── PR #852: feat/issue-838-alloy-fbf-where (alloy proof) ← CURRENT |
| 23 | +``` |
| 24 | + |
| 25 | +## Execution Order |
| 26 | + |
| 27 | +### Phase 1: PR #846 Tests (on branch `feat/issue-837-cudf-hop-executor`) |
| 28 | + |
| 29 | +**Status: ✅ COMPLETE** |
| 30 | + |
| 31 | +Tests added to `tests/gfql/ref/test_cudf_executor_inputs.py`: |
| 32 | + |
| 33 | +| # | Test | Status | Notes | |
| 34 | +|---|------|--------|-------| |
| 35 | +| 1 | WHERE respected after min_hops backtracking | xfail | Bug #872 | |
| 36 | +| 2 | Reverse direction + hop range + WHERE | xfail | Bug #872 | |
| 37 | +| 3 | Non-adjacent alias WHERE | xfail | Bug #872 | |
| 38 | +| 4 | Oracle vs cuDF parity comprehensive | xfail | Bug #872 | |
| 39 | +| 5 | Multi-hop edge WHERE filtering | xfail | Bug #872 | |
| 40 | +| 6 | Output slicing + WHERE | ✅ PASS | Works correctly | |
| 41 | +| 7 | label_seeds + output_min_hops | ✅ PASS | Works correctly | |
| 42 | +| 8 | Multiple WHERE + mixed hop ranges | xfail | Bug #872 | |
| 43 | + |
| 44 | +**Key Finding**: The cuDF executor has architectural limitations with multi-hop edges + WHERE: |
| 45 | +- Backward prune doesn't trace through intermediate edges |
| 46 | +- `_is_single_hop()` gates WHERE filtering |
| 47 | +- Non-adjacent alias WHERE not applied |
| 48 | + |
| 49 | +These are documented in issue #872 for future fix. |
| 50 | + |
| 51 | +--- |
| 52 | + |
| 53 | +### Phase 2: Rebase PR #852 onto master |
| 54 | + |
| 55 | +```bash |
| 56 | +git checkout feat/issue-838-alloy-fbf-where |
| 57 | +git fetch origin |
| 58 | +git rebase origin/master |
| 59 | +# Resolve any conflicts |
| 60 | +git push origin feat/issue-838-alloy-fbf-where --force-with-lease |
| 61 | +``` |
| 62 | + |
| 63 | +--- |
| 64 | + |
| 65 | +### Phase 3: PR #852 Verification Updates (on branch `feat/issue-838-alloy-fbf-where`) |
| 66 | + |
| 67 | +**Status: ✅ COMPLETE** |
| 68 | + |
| 69 | +| # | Change | File | Status | |
| 70 | +|---|--------|------|--------| |
| 71 | +| 1 | Clarify hop ranges NOT formally verified | `alloy/README.md` | ✅ DONE | |
| 72 | +| 2 | Note reliance on Python parity tests | `alloy/README.md` | ✅ DONE | |
| 73 | +| 3 | State verified fragment precisely | `alloy/README.md` | ✅ DONE | |
| 74 | + |
| 75 | +**P1 - Add scenario checks (optional, strengthens claims)** - Deferred to future work. |
| 76 | + |
| 77 | +**Next steps:** |
| 78 | +```bash |
| 79 | +git checkout feat/issue-837-cudf-hop-executor |
| 80 | +git stash pop # Apply the test changes |
| 81 | +git add -A && git commit |
| 82 | +git push origin feat/issue-837-cudf-hop-executor |
| 83 | +# Wait for CI green, then merge PR #846 to master |
| 84 | +``` |
| 85 | + |
| 86 | +--- |
| 87 | + |
| 88 | +## Test Implementation Details |
| 89 | + |
| 90 | +### Test 1: WHERE after min_hops backtracking |
| 91 | + |
| 92 | +```python |
| 93 | +def test_where_respected_after_backtracking(): |
| 94 | + """ |
| 95 | + Graph: a -> b -> c -> d (3 hops) |
| 96 | + a -> x -> y (2 hops, dead end for min_hops=3) |
| 97 | +
|
| 98 | + WHERE: a.value < d.value |
| 99 | +
|
| 100 | + Backtracking for min_hops=3 should: |
| 101 | + 1. Prune x,y branch (doesn't reach 3 hops) |
| 102 | + 2. Keep a,b,c,d path |
| 103 | + 3. THEN apply WHERE to filter paths where a.value < d.value |
| 104 | +
|
| 105 | + If WHERE not re-applied after backtracking, invalid paths may remain. |
| 106 | + """ |
| 107 | +``` |
| 108 | + |
| 109 | +### Test 2: Reverse direction + WHERE |
| 110 | + |
| 111 | +```python |
| 112 | +def test_reverse_direction_where_semantics(): |
| 113 | + """ |
| 114 | + Graph: a -> b -> c -> d (forward edges) |
| 115 | +
|
| 116 | + Chain: [n(name='start'), e_reverse(min_hops=2), n(name='end')] |
| 117 | + WHERE: start.value > end.value |
| 118 | +
|
| 119 | + Starting at 'd', reverse traversal reaches: |
| 120 | + - c at hop 1, b at hop 2, a at hop 3 |
| 121 | +
|
| 122 | + With min_hops=2, valid endpoints are b (hop 2) and a (hop 3). |
| 123 | + WHERE compares start (d) vs end (b or a). |
| 124 | +
|
| 125 | + Verify WHERE semantics are consistent regardless of traversal direction. |
| 126 | + """ |
| 127 | +``` |
| 128 | + |
| 129 | +### Test 3: Non-adjacent alias WHERE |
| 130 | + |
| 131 | +```python |
| 132 | +def test_non_adjacent_alias_where(): |
| 133 | + """ |
| 134 | + Chain: [n(name='a'), e_forward(), n(name='b'), e_forward(), n(name='c')] |
| 135 | + WHERE: a.id == c.id (aliases 2 edges apart) |
| 136 | +
|
| 137 | + This WHERE clause should filter to paths where the first and last |
| 138 | + nodes have the same id (e.g., cycles back to start). |
| 139 | +
|
| 140 | + Risk: cuDF backward prune only applies WHERE to adjacent aliases. |
| 141 | + """ |
| 142 | +``` |
| 143 | + |
| 144 | +### Test 4: Oracle vs cuDF parity (parametrized) |
| 145 | + |
| 146 | +```python |
| 147 | +@pytest.mark.parametrize("scenario", COMPOSITION_SCENARIOS) |
| 148 | +def test_oracle_cudf_parity(scenario): |
| 149 | + """ |
| 150 | + Run same query with Oracle and cuDF executor. |
| 151 | + Verify identical results. |
| 152 | +
|
| 153 | + Scenarios cover all combinations of: |
| 154 | + - Directions: forward, reverse, undirected |
| 155 | + - Hop ranges: min_hops, max_hops, output slicing |
| 156 | + - WHERE operators: ==, !=, <, <=, >, >= |
| 157 | + - Topologies: linear, branch, cycle, disconnected |
| 158 | + """ |
| 159 | +``` |
| 160 | + |
| 161 | +--- |
| 162 | + |
| 163 | +## README Update for PR #852 |
| 164 | + |
| 165 | +```markdown |
| 166 | +## Scope and Limitations |
| 167 | + |
| 168 | +### What IS Formally Verified |
| 169 | + |
| 170 | +- WHERE clause lowering to per-alias value summaries |
| 171 | +- Equality (==, !=) via bitset filtering |
| 172 | +- Inequality (<, <=, >, >=) via min/max summaries |
| 173 | +- Multi-step chains with cross-alias comparisons |
| 174 | +- Graph topologies: fan-out, fan-in, cycles, parallel edges, disconnected |
| 175 | + |
| 176 | +### What is NOT Formally Verified |
| 177 | + |
| 178 | +- **Hop ranges** (`min_hops`, `max_hops`): Approximated by unrolling to fixed-length chains |
| 179 | +- **Output slicing** (`output_min_hops`, `output_max_hops`): Treated as post-filter |
| 180 | +- **Hop labeling** (`label_node_hops`, `label_edge_hops`, `label_seeds`): Not modeled |
| 181 | +- **Null/NaN semantics**: Verified in Python tests |
| 182 | + |
| 183 | +### Test Coverage for Unverified Features |
| 184 | + |
| 185 | +Hop ranges and output slicing are covered by Python parity tests: |
| 186 | +- `tests/gfql/ref/test_enumerator_parity.py`: 11+ hop range scenarios |
| 187 | +- `tests/gfql/ref/test_cudf_executor_inputs.py`: 8+ WHERE + hop range scenarios |
| 188 | + |
| 189 | +These tests verify the cuDF executor matches the reference oracle implementation. |
| 190 | +``` |
| 191 | + |
| 192 | +--- |
| 193 | + |
| 194 | +## Priority Summary |
| 195 | + |
| 196 | +| Priority | Branch | Items | Blocks | |
| 197 | +|----------|--------|-------|--------| |
| 198 | +| **P0** | #846 | 4 tests | Merge of #846 | |
| 199 | +| **P1** | #846 | 4 tests | - | |
| 200 | +| **P0** | #852 | README scope update | Merge of #852 | |
| 201 | +| **P1** | #852 | Alloy scenario checks | - | |
| 202 | + |
| 203 | +--- |
| 204 | + |
| 205 | +## Success Criteria |
| 206 | + |
| 207 | +### PR #846 Ready to Merge When: |
| 208 | +- [ ] All 8 new tests pass |
| 209 | +- [ ] Existing tests still pass |
| 210 | +- [ ] CI green |
| 211 | + |
| 212 | +### PR #852 Ready to Merge When: |
| 213 | +- [ ] README accurately describes verified scope |
| 214 | +- [ ] Alloy checks pass (existing + any new scenarios) |
| 215 | +- [ ] CI green |
| 216 | + |
| 217 | +--- |
| 218 | + |
| 219 | +## Resume Context |
| 220 | + |
| 221 | +### Current State (as of session end) |
| 222 | +- **Current branch**: `feat/issue-838-alloy-fbf-where` (PR #852) |
| 223 | +- **Stash**: Test changes stashed on `feat/issue-837-cudf-hop-executor` (stash@{0}) |
| 224 | +- **Uncommitted**: `alloy/README.md` changes (scope/limitations section added) |
| 225 | + |
| 226 | +### Git State Summary |
| 227 | +``` |
| 228 | +feat/issue-838-alloy-fbf-where: |
| 229 | + - Modified: alloy/README.md (scope/limitations section) |
| 230 | + - Untracked: PLAN-846-852-feature-composition.md (this file) |
| 231 | +
|
| 232 | +feat/issue-837-cudf-hop-executor (stash@{0}): |
| 233 | + - 8 new tests in tests/gfql/ref/test_cudf_executor_inputs.py |
| 234 | + - TestP0FeatureComposition class (4 tests, 3 xfail + 1 passing) |
| 235 | + - TestP1FeatureComposition class (4 tests, 3 xfail + 1 passing) |
| 236 | +``` |
| 237 | + |
| 238 | +### Key Files Modified |
| 239 | +1. `tests/gfql/ref/test_cudf_executor_inputs.py` - Added 8 feature composition tests |
| 240 | +2. `alloy/README.md` - Added scope/limitations section |
| 241 | +3. `PLAN-846-852-feature-composition.md` - This tracking document |
| 242 | + |
| 243 | +### Bug Details (Issue #872) |
| 244 | +Root cause in `graphistry/compute/gfql/cudf_executor.py`: |
| 245 | +- `_backward_prune()` lines 312-393: Assumes single-hop edges |
| 246 | +- `_is_single_hop()` gates WHERE filtering |
| 247 | +- Multi-hop edges break backward prune path tracing |
| 248 | + |
| 249 | +### To Resume Work |
| 250 | +```bash |
| 251 | +# 1. Commit alloy README changes on current branch |
| 252 | +git add alloy/README.md |
| 253 | +git commit -m "docs(alloy): add scope and limitations section" |
| 254 | +git push origin feat/issue-838-alloy-fbf-where |
| 255 | + |
| 256 | +# 2. Switch to #846 branch and apply stashed tests |
| 257 | +git checkout feat/issue-837-cudf-hop-executor |
| 258 | +git stash pop |
| 259 | + |
| 260 | +# 3. Commit and push test changes |
| 261 | +git add tests/gfql/ref/test_cudf_executor_inputs.py |
| 262 | +git commit -m "test(gfql): add 8 feature composition tests for hop ranges + WHERE |
| 263 | +
|
| 264 | +Adds P0/P1 tests for PR #846 same-path executor with hop ranges. |
| 265 | +6 tests xfail documenting known bugs (see issue #872). |
| 266 | +2 tests pass verifying output slicing and label_seeds work correctly." |
| 267 | +git push origin feat/issue-837-cudf-hop-executor |
| 268 | + |
| 269 | +# 4. Wait for CI, then merge PRs in order: #846 first, then rebase/merge #852 |
| 270 | +``` |
| 271 | + |
| 272 | +### Related Issues |
| 273 | +- **#871**: Meta: GFQL Testing & Verification Roadmap (future work) |
| 274 | +- **#872**: Fix multi-hop + WHERE backward prune bugs in cuDF executor |
0 commit comments