feat(cypher): ReentryPlan + multi-whole-row prefix WITH (#989 slices 4.1+4.3a)#1248
Merged
feat(cypher): ReentryPlan + multi-whole-row prefix WITH (#989 slices 4.1+4.3a)#1248
Conversation
…#989 slice 1) Slice 4.1 of the #989 row-carrier cleanup: add an explicit ReentryPlan + CarriedAlias contract for the bounded MATCH-after-WITH path, populated alongside the existing scalar_reentry_alias / scalar_reentry_columns fields on CompiledCypherExecutionExtras. This commit is additive and a no-behavior-change refactor: * New module graphistry/compute/gfql/cypher/reentry_plan.py * New field reentry_plan: Optional[ReentryPlan] on CompiledCypherExecutionExtras * New property accessor on CompiledCypherQuery * Population at the bounded-reentry construction site; propagation through _map_terminal_reentry_query and _attach_graph_context Read sites still consume scalar_reentry_alias / scalar_reentry_columns and _compiled_query_reentry_contract; later slices switch readers to the plan and lift the single-whole-row constraint. Tests: graphistry/tests/compute/gfql/cypher/ — 1091 passed, 66 skipped, 15 xfailed (no regressions). Refs #987 #989 #999. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…es unused (#989 slice 4.3a) Lift the single-whole-row gate at lowering.py:7207 (compile-time) and gfql_unified.py:1220 (runtime contract). Multi-whole-row prefix WITH (e.g. WITH a, x, y) now compiles when only the trailing-MATCH source alias is referenced downstream. For the case IC3 actually needs — non-source whole-row aliases referenced in trailing WHERE / RETURN / WITH — emit an actionable failfast pointing to #989. The full row-carrier rewrite (slice 4.3b) lands the carry of non-source alias properties as hidden columns; that is tracked under #989 as the architectural exit gate. Implementation: * `_bounded_reentry_carry_columns`: returns `(reentry_alias, carried_scalars, non_source_alias_names)` instead of raising on N>1 whole-rows. Picks `reentry_alias` from the trailing-MATCH first node alias when it is a carried whole-row, else falls back to the first. * `_collect_non_source_alias_references`: scans trailing WHERE (predicates + tree path), trailing WITH stages, RETURN items, ORDER BY items, and reentry UNWINDs for references to non-source aliases. * `_iter_property_refs`: walks structured `WhereClause.predicates` for `PropertyRef` nodes (the alias-keyed reference shape). * `_compiled_query_reentry_contract` (gfql_unified): consults the ReentryPlan field added in the prior slice and picks the recorded reentry_alias_name when present. * `ReentryPlan.aliases` now includes a `CarriedAlias` entry per non-source whole-row alias for future slices to populate `carried_properties`. Tests: graphistry/tests/compute/gfql/cypher/ — 1093 passed (+2 new focused cases for multi-whole-row admit + non-source-alias failfast). Refs #987 #989 #999. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…urce-alias scanner (#989 slice 4.3a follow-up) Review of PR #1248 surfaced a latent AttributeError in _collect_non_source_alias_references: stage.where on ProjectionStage is Optional[ExpressionText] (raw text node), not Optional[WhereClause]. The prior code passed it to _scan_where(...) which crashes on .expr_tree on any multi-whole-row prefix followed by a downstream WITH ... WHERE stage. Fix: scan stage.where.text directly (symmetric to the unwind / return / order_by branches in the same loop). Add regression test test_string_cypher_admits_multi_whole_row_prefix_with_downstream_stage_where exercising the previously-broken code path on a green-execution shape. Tests: 1094 passed (+1 new regression), no regressions. Refs #989 #999. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ields, add plan structure test, comment cleanup (#989) Wave 2 review of PR #1248 (parallel per-dimension subagents) surfaced two IMPORTANT findings that survived the adversarial pass: * Spec + Quality consensus: `CarriedAlias.id_column` (Optional[str] = None) and `carried_properties` (Tuple[str, ...] = ()) were declared but never set, violating CLAUDE.md "Don't design for hypothetical future requirements." Removed; will land in slice 4.3b at first set. * Tests: no test asserted `compiled_query.reentry_plan` structure, leaving silent-regression risk for the plan field in slice 4.3b. Added `test_compile_cypher_records_reentry_plan_for_multi_whole_row_prefix` covering reentry_alias_name, scalar_only, alias output_names, and is_reentry_alias flag distribution. Stacking related Quality SUGGESTIONs: * Replaced "Slice 4.1+4.3" inline planning context with #989-anchored comment per the skill's "ephemeral reference leakage" rule. * Hoisted `dataclasses.fields` to module-top import, dropped the misleading "to avoid top-level shadowing" comment. * Narrowed `_iter_property_refs(node: Any)` to `node: object` and rewrote the docstring to enumerate the actual leaf types it cannot recurse into. Adversarial pass REJECTED a wave-2 BLOCKER claim of infinite recursion in `_iter_property_refs` — frozen dataclasses with PropertyRef leaves are acyclic by construction; the trailing fall-through is correct. Tests: 1095 passed (+1 new) on graphistry/tests/compute/gfql/cypher/. Refs #987 #989 #999. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…lice 4.3b) When a prefix `WITH a, x, y` projects multiple whole-row aliases, rewrite the prefix at compile time to project the non-source aliases as scalar property carries: `WITH a, x.id AS x__id, ...`. Trailing `<non_source>.<prop>` references rewrite to a property access on the reentry-alias's hidden column (`<reentry_alias>.<__cypher_reentry_<alias>__<prop>__>`). The existing scalar carry plumbing forwards the values onto the reentry-alias's row table; no runtime changes needed. Closes the previously-locked #1026 regression (multi-alias WITH + OPTIONAL MATCH) which now executes correctly instead of raising. Slice 4.3a's failfast tightens to bare-reference cases only — bare `<non_source>` carries (and IC3's multi-stage `WITH a, x, y` chain that re-projects through stages) remain out of scope for this slice and now point users at the still-pending follow-up work under #989. Implementation: * `_reentry_property_carry_name(alias, prop)` helper for the unwrapped carry alias used in the prefix WITH; the existing `_reentry_hidden_column_name` wraps it at runtime to the final hidden column name. * `_rewrite_multi_whole_row_prefix(prefix_stage, ...)` rewrites the prefix items: drops bare non-source aliases that have property refs downstream, inserts scalar projection per (alias, property) pair. * `_collect_non_source_alias_property_refs` returns (props_by_alias, bare_referenced) — the dict drives the prefix rewrite, bare_referenced gates the failfast. * `_rewrite_reentry_expr_to_hidden_properties` extended with optional `non_source_carried_props` arg; AST-level rewrite of `PropertyAccessExpr(Identifier(non_source), prop)` runs BEFORE bare-identifier substitution to avoid double-mangling. * `ReentryPlan.aliases` now records `CarriedAlias.carried_properties` for the rewritten non-source aliases (field re-introduced from slice 4.1, now used). Tests: 1096 passed (+1 net new positive — `_admits_non_source_alias_property_carry_through_reentry`, `_failfast_..._bare_referenced` adapted from prior failfast). The `test_issue_1026_multi_alias_with_optional_match_rejects_cleanly` flipped to `_carries_non_source_property` asserting correct execution. Refs #987 #989 #999 #1026. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…terminism comment, ORDER BY failfast test (#989) Wave 3 review of slice 4.3b (commit c6a9852) flagged two SUGGESTION-level items resolvable in a small follow-up: * Carry name `<alias>__<prop>` could collide with a user-projected scalar in the same prefix WITH (hypothetical, low-likelihood). Tightened to `__carry_<alias>__<prop>__` so the namespace can't overlap with any unquoted user identifier. * Sort-determinism comment on the property-tuple ordering inside `_rewrite_multi_whole_row_prefix` so future readers know the order is cosmetic and that downstream AST rewrites match by name. Plus a missing failfast-scope test: * `test_string_cypher_failfast_rejects_multi_whole_row_prefix_when_non_source_alias_is_bare_referenced_in_order_by` exercises the `query.order_by` branch of `_collect_non_source_alias_property_refs`. Adversarial pass on wave 3 confirmed the remaining flagged items (double-rewrite safety, bare-vs-property filter logic, end-to-end carry correctness, flipped #1026 test) are all CORRECT. Tests: 1097 passed (+1 new failfast scope test), no regressions. Refs #989. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…am WITH stages (#989 slice 4.3c) Slice 4.3c fold-up: when a downstream WITH stage forwards carried whole-row aliases as bare items (`WITH a, x, y, collect(...)`), drop them at compile time. Their properties are already carried onto the reentry-alias's row table as hidden columns; the bare items are pure forwarding noise. Without this, the slice 4.3a/4.3b bare-reference failfast fires on IC3-shaped multi-stage chains where each `WITH` re-projects the carried aliases for downstream use. With this drop, those queries pass the failfast and proceed to the property-access rewrite layer. Note: this slice does NOT close the full IC3 lane. Multi-stage chains that span more than one reentry MATCH (`MATCH (a)-[:KNOWS]-(friend) ... MATCH (friend)<-[:HAS_CREATOR]-(message)`) still fail at a deeper runtime boundary — the hidden columns carrying `x.id`/`y.id` live on `a`'s row table and don't follow forward to `friend`'s row table after the first reentry's join. That cross-reentry carry-forward is the remaining slice 4.3d work tracked under #989. Implementation: * `_drop_bare_alias_items_from_stage(stage, aliases, *, identifier_re)`: pure helper that drops bare-identifier projection items whose name is in the carried set. Items with explicit aliases (`x AS xref`) are preserved — those are uses, not forwards. * `_rewrite_multi_whole_row_prefix(...)` extended to also return the rewritten downstream WITH stages tail. The bare-ref scanner now runs on the cleaned tail so forwarding patterns don't false-positive. * Caller threads the rewritten tail back into `query.with_stages` before the prefix compile + suffix construction. Tests: 1097 passed. The bare-ref failfast test updated to use `WHERE x = b` (a true USE) since `WITH b, x` is now correctly admitted as a forwarding pattern. Refs #989. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lmeyerov
commented
May 3, 2026
| if [[ -n "${tck_sha_override}" ]]; then | ||
| tck_ref="${tck_ref}+sha-override" | ||
| tck_sha="${tck_sha_override}" | ||
| fi |
…arry slices Adds an Internal entry under [Development] covering PR #1248's contributions beyond the #1071 lift already on master: - ReentryPlan + CarriedAlias dataclass IR (#987 step 1 / #989 slice 4.1) - Multi-whole-row admit when non-source aliases unused (slice 4.3a) - Compile-time property carry rewrite via __carry_<alias>__<prop>__ hidden columns, closing the multi-alias case of the #1026 regression-lock (slice 4.3b) - Bare-item drop in downstream forwarding WITH stages (slice 4.3c) Cross-reentry-boundary carry forwarding (IC3 / #999 slice 4.3d) is called out as remaining follow-up. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This was referenced May 3, 2026
Open
Cypher/GFQL: support multi-MATCH-source row projection in RETURN/WITH (IC5 residual from #999)
#1257
Open
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Foundation work toward
#989row-carrier IR. Two slices on top ofmaster@6891d39e4:Slice 4.1 — Introduce
ReentryPlan+CarriedAliasdataclasses (graphistry/compute/gfql/cypher/reentry_plan.py). Addreentry_plan: Optional[ReentryPlan]toCompiledCypherExecutionExtrasand propagate through_execution_extras_with/_map_terminal_reentry_query/_attach_graph_context. No behavior change — read sites still consumescalar_reentry_alias/scalar_reentry_columns.Slice 4.3a — Lift the
len(whole_row_columns) != 1gate atlowering.py:7207and the matching runtime gate atgfql_unified.py:1220. Multi-whole-row prefixWITH(WITH a, x, y) now compiles when only the trailing-MATCH source alias is referenced downstream. For the IC3 case (non-source whole-row alias referenced in trailingWHERE/RETURN), emit an actionable failfast pointing to#989for the slice 4.3b follow-up that lands the actual property carry.What this PR does and does not close
#987ReentryPlan handshake cleanup — partially addressed (Step 1 dataclass + scalar/whole-row migration to plan field). Steps 2–5 (replace hidden-column protocol, move runtime stitching, split lowering.py) remain follow-up.#989row-carrier IR — partial. The plan dataclass and the multi-whole-row compile path are in place; the property carry for non-source aliases (slice 4.3b) is the remaining architectural exit gate.#999multi-stage MATCH/WITH/MATCH — not yet closed. IC3 still hits the new failfast (correctly: it referencesx.id/y.idin trailing). IC5 still hits the multi-MATCH-source row projection gate atlowering.py:1914(Slice 2 follow-up).The win in this PR: actionable diagnostics and a reusable plan contract for the rest of the row-carrier work.
Scope rationale
See
plans/989-reentryplan-multi-alias-carry/{plan.md,design/current-shape.md,design/reentry-plan.md}(gitignored) for the full design. Short version: the parser-surface "Path A" referenced in#999was empty — both literal IC3 and IC5 fall through to row-lowering invariants. Slice 4.3a is the smallest meaningful step that establishes the plan contract without writing throwaway hidden-column code.Test plan
pytest -q graphistry/tests/compute/gfql/cypher/— 1093 passed (+2 new), 66 skipped, 15 xfailedtest_string_cypher_admits_multi_whole_row_prefix_when_non_source_aliases_are_unusedtest_string_cypher_failfast_rejects_multi_whole_row_prefix_when_non_source_alias_is_referencedFollow-up plan
_reentry_hidden_property_column(alias, prop), scanner that returnsDict[alias, Set[prop]], AST-level rewriter forPropertyAccessExpr(<non_source>, prop)→PropertyAccessExpr(<reentry_alias>, hidden_name), runtime extension of_compiled_query_reentry_stateto merge hidden columns from each non-source alias's meta entry. Closes IC3 lane of#999.lowering.py:1914). Larger scope; separate PR under#989.#987Steps 2–5 (extract reentry runtime toreentry_runtime.py, remove the legacyscalar_reentry_*fields after consumers migrate tocompiled_query.reentry_plan).Refs
#987#989#999.