Skip to content

fix(gfql/row): resolve bare relationship alias in row expressions (#1072)#1249

Merged
lmeyerov merged 7 commits intomasterfrom
fix/1072-cypher-edge-alias-row-expr
May 3, 2026
Merged

fix(gfql/row): resolve bare relationship alias in row expressions (#1072)#1249
lmeyerov merged 7 commits intomasterfrom
fix/1072-cypher-edge-alias-row-expr

Conversation

@lmeyerov
Copy link
Copy Markdown
Contributor

@lmeyerov lmeyerov commented May 2, 2026

Summary

Fixes #1072. Bare edge aliases in row expressions (e.g. select([("rel", "workAt")]), where_rows("workAt IS NOT NULL")) raised unsupported token in row expression: 'workAt' because _gfql_resolve_token's bare-alias fallback only knew about node aliases ({alias}.{node_id} lookup).

Before / after

Before — direct GFQL chain over a bindings row table:

ops = [
    n({"label__Person": True}, name="friend"),
    e_forward(edge_match={"type": "WORKS_AT"}, name="workAt"),
    n({"label__Company": True}, name="company"),
    rows(table="nodes", binding_ops=[...]),
    select(items=[("rel", "workAt")]),
]
g.gfql(ops)
# GFQLTypeError: Error executing 'select': unsupported token in row expression: 'workAt'

After: returns a Cypher-style rendering, matching what RETURN <relAlias> already produces via result_postprocess:

                            rel
0  [:WORKS_AT {workFrom: 2010}]

Change

graphistry/compute/gfql/row/pipeline.py:

  • Extend the bare-alias fallback in _gfql_resolve_token so an edge alias resolves to either the edge-id column ({alias}.{_edge}) when present, or to a rendered [:TYPE {props}] string via the new _gfql_render_relationship_alias helper.
  • The renderer skips _source / _destination / _edge / __* columns and the alias-indicator column ({alias}.{alias}); rows whose alias columns are all null become pd.NA.

Tests

graphistry/tests/compute/gfql/test_row_pipeline_ops.py::TestRelationshipAliasInRowExpression (3 new tests):

  • bare select(items=[("rel", "workAt")]) renders the Cypher string,
  • select(items=[("yr", "workAt.workFrom")]) keeps working (regression guard),
  • where_rows("workAt IS NOT NULL") keeps the row.

Full graphistry/tests/compute/gfql/ suite: 1576 passed, 87 skipped, 15 xfailed.

Test plan

  • Run python -m pytest graphistry/tests/compute/gfql/
  • ./bin/lint.sh and ./bin/mypy.sh on touched files

@lmeyerov lmeyerov force-pushed the fix/1072-cypher-edge-alias-row-expr branch from 0de4097 to cd955fe Compare May 3, 2026 05:22
lmeyerov and others added 7 commits May 3, 2026 13:54
)

Bare edge aliases in select / where_rows / order_by row expressions raised
"unsupported token in row expression" because _gfql_resolve_token only had a
node-id fallback for the bare-alias case. Add an edge fallback: prefer the
edge-id column, otherwise render the relationship as a Cypher-style
[:TYPE {props}] string, mirroring what cypher RETURN <relAlias> already
produces via result_postprocess. Direct GFQL chains and cypher-lowered chains
now see the same column shape.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…entry (#1071)

Pre-rewrite the prefix WITH stage in _compile_bounded_reentry_query so
secondary whole-row aliases are demoted to scalar property carries
(S.X AS __cypher_reentry_<S>_<X>__) and downstream S.X references compose
with the existing single-whole-row machinery (#1047/#1068). RETURN of a
secondary whole-row alias, and re-binding a secondary alias as a node
variable in the trailing MATCH, remain unsupported with precise errors.

Closes #1071.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sions

Records the user-facing fix and amplifications for #1072 under
## [Development] → ### Fixed: bare edge-alias resolution in row
expressions, shared vectorized renderer in row/entity_props.py,
edge-alias gating via _gfql_rows_edge_aliases, string-escape /
float / empty-type rendering normalization, and re-entry WHERE
secondary-alias predicate rewriting via synthesized text.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ring

Wave 4 finding T2 (IMPORTANT): TestRelationshipAliasInRowExpression
had 14 new tests in PR #1249 but zero cuDF coverage on the new
edge-alias rendering path, despite Wave 2 vectorization being
specifically motivated by cuDF support (Wave 1 finding #4).

Adds test_select_bare_relationship_alias_renders_on_cudf_when_available
following the sibling pattern at test_row_pipeline_vectorized_cudf_when_available
(pytest.importorskip("cudf")). Asserts: (a) result keeps cuDF backend,
(b) bare-alias rendering produces "[:WORKS_AT {workFrom: 2010}]",
(c) select/return cell-for-cell parity holds on cuDF.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Wave 4 finding D1 (SUGGESTION): the conditional at pipeline.py:2310 and
the raise at pipeline.py:2312 produce identical errors. After the
node-id branch (lines 2306-2309) returns, the only remaining outcome
is "unsupported token", regardless of edge_aliases membership — both
branches raised the same f-string. Collapsed to a single raise.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@lmeyerov lmeyerov force-pushed the fix/1072-cypher-edge-alias-row-expr branch from cd955fe to 5770312 Compare May 3, 2026 23:16
@lmeyerov lmeyerov merged commit ef42830 into master May 3, 2026
133 checks passed
@lmeyerov lmeyerov deleted the fix/1072-cypher-edge-alias-row-expr branch May 3, 2026 23:27
lmeyerov added a commit that referenced this pull request May 3, 2026
…ing failfast (post-#1249 merge)

After merging origin/master (which brought in #1249's row pipeline changes),
the chained-reentry-same-primary positive test now hits the pre-existing
"unique carried node rows" failfast at gfql_unified.py:~1091. This is not a
regression introduced by slice 4.3d.2 — the carry-uniqueness constraint is
fundamental to the current scalar-carry runtime model. Multiple rows sharing
the same `a` value (one per friend after the first reentry) cannot be carried
through to a second reentry that re-uses `a` as the source.

The rebinding shape (Q2-style `(a)-[:R]->(friend) ... (friend)-[:S]->(c)`)
remains validated and continues to pass — that's the actual cross-boundary
case the slice closes.

Retarget the test to assert the unique-rows failfast fires, locking the known
limitation so a future slice that lifts the constraint must update this test
alongside the runtime change.

Refs #1256.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
lmeyerov added a commit that referenced this pull request May 3, 2026
…ry (#1256) (#1258)

* feat(cypher): forward secondary whole-row carry through chained reentry (#1256)

Closes the chained-reentry portion of slice 4.3d (#1256) so multi-alias whole-row
carry survives more than one reentry boundary.

Two extensions to `_demote_secondary_whole_row_aliases` (the active rewrite path
on master after #1071 superseded slice 4.3b's `_rewrite_multi_whole_row_prefix`):

1. **Forwarding-item drop** — bare-identifier projection items in downstream
   `WITH` stages whose name is a secondary alias are dropped at compile time
   *before* `_collect_secondary_property_refs` walks them for bare references.
   Same intent as slice 4.3c, integrated into the active path. Without this,
   forwarding patterns like `WITH a, x, friend, ...` tripped the bare-ref
   failfast even though the property carry already lives as a hidden scalar.

2. **Hidden-column forwarding** — for every `(secondary_alias, prop)` collected
   from trailing clauses, the synthesized `__cypher_reentry_<S>_<X>__` hidden
   alias is appended as a bare passthrough item to every downstream `WITH`
   stage so each recursive `compile_cypher_query` call sees it as a scalar
   carry. Without this, the inner compile failed alias resolution on the
   rewritten hidden identifier once a chained stage narrowed projection scope.

New positive tests cover three previously-blocked shapes:
* `test_string_cypher_admits_multi_stage_secondary_alias_carry_through_chained_reentry`
  — chained reentry where the trailing MATCH continues to use the primary
* `test_string_cypher_admits_secondary_alias_carry_across_reentry_source_rebinding`
  — source rebound between boundaries (`(a)-[:R]->(friend) ... (friend)-[:S]->(c)`)
* `test_string_cypher_admits_multi_alias_distinct_forwarding_through_reentry`
  — multi-alias `DISTINCT` forwarding through a single boundary

New failfast test pins the remaining gap (free-form intermediate MATCH that
does not start from any carried alias — the LDBC SNB IC3 prefix shape) to the
existing scoped error so future closure is regression-locked.

Validation:
* 780 cypher lowering tests pass (was 776; +4 new).
* Touched-module mypy clean.
* Repro `plans/1256-cross-reentry-boundary-carry/repro/repro_ic3.py`: simple
  rebinding cases now compile + execute correctly; literal IC3 still fails at
  the free-form intermediate MATCH gate (separate follow-up).

Refs #1256 (closes the chained-reentry portion; free-form intermediate MATCH
remains open), #999 (IC3 partial), #989 (row-carrier IR umbrella).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test+docs(cypher): pin aggregate-failfast interaction with chained-reentry carry (#1256 W1-I1)

Wave-1 review flagged that hidden-column forwarding could in principle silently
mutate grouping when the appended `__cypher_reentry_<S>_<X>__` bare item lands
inside a `WITH a, friend, count(*) AS n` chain — the carry would join the
group key set.

Empirically those queries already fail at the pre-existing
"aggregate would need repeated MATCH rows from a relationship pattern"
failfast, so no silent wrong-result path exists today. Pin that behavior with
a regression-lock test so a future change that lifts the aggregate failfast
must reckon with carry-grouping interaction explicitly.

Also document the DISTINCT/aggregate interaction in `_demote_secondary_whole_row_aliases`
inline so the next reader sees the design tradeoff.

Refs #1256.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(cypher): scoped failfast for chained-reentry carry + aggregate WITH (#1256 W2-IMPORTANT-1)

Wave-2 review uncovered a silent wrong-result for the corner shape
`MATCH (a),(x) WITH a, x MATCH (a) WITH a, x, count(*) AS n RETURN x.id, n`
(node-only trailing MATCH + aggregating downstream WITH + carry forward).
PR HEAD returned `xid=None, n=1`; master rejects the same query at the
secondary-whole-row failfast.

Add a guard in `_demote_secondary_whole_row_aliases`: when carry forwarding is
about to append a hidden column to a downstream WITH stage, refuse if any
item in that stage contains an aggregate function call. Raise a scoped #1256
failfast pointing at the gap. The relationship-pattern aggregate path is also
now covered by this guard (the existing "aggregate would need repeated MATCH
rows" failfast still fires for non-carry queries; the scoped error is clearer
when a carry is actually involved).

Test changes:
- Rename `..._with_aggregate_hits_existing_aggregate_failfast` to
  `..._with_aggregate_relationship_match_failfast` and update the regex to
  match the new scoped error.
- New `..._with_aggregate_node_only_match_failfast` regression-locks the
  W2-IMPORTANT-1 case.

Validation:
- 782 cypher lowering tests pass (was 780 before slice 4.3d.2; +2 net).
- Touched-module mypy clean.
- Scoped failfast wording is symmetric with the slice 4.3a/b admit-gate
  failfast (#989) so error consumers can pattern-match.

Refs #1256.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(cypher): retarget chained-reentry-same-primary test to pre-existing failfast (post-#1249 merge)

After merging origin/master (which brought in #1249's row pipeline changes),
the chained-reentry-same-primary positive test now hits the pre-existing
"unique carried node rows" failfast at gfql_unified.py:~1091. This is not a
regression introduced by slice 4.3d.2 — the carry-uniqueness constraint is
fundamental to the current scalar-carry runtime model. Multiple rows sharing
the same `a` value (one per friend after the first reentry) cannot be carried
through to a second reentry that re-uses `a` as the source.

The rebinding shape (Q2-style `(a)-[:R]->(friend) ... (friend)-[:S]->(c)`)
remains validated and continues to pass — that's the actual cross-boundary
case the slice closes.

Retarget the test to assert the unique-rows failfast fires, locking the known
limitation so a future slice that lifts the constraint must update this test
alongside the runtime change.

Refs #1256.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cypher: relationship alias unsupported in row expression select (blocks IC11 job-referral)

1 participant