|
| 1 | +--- |
| 2 | +id: var-length-return-projection |
| 3 | +level: task |
| 4 | +title: "Var-length RETURN: projection treats CTE alias as edge row instead of emitting list of edges" |
| 5 | +short_code: "GQLITE-T-0309" |
| 6 | +created_at: 2026-05-21T16:05:19.794253+00:00 |
| 7 | +updated_at: 2026-05-21T16:05:19.794253+00:00 |
| 8 | +parent: |
| 9 | +blocked_by: [] |
| 10 | +archived: false |
| 11 | + |
| 12 | +tags: |
| 13 | + - "#task" |
| 14 | + - "#phase/backlog" |
| 15 | + - "#bug" |
| 16 | + |
| 17 | + |
| 18 | +exit_criteria_met: false |
| 19 | +initiative_id: NULL |
| 20 | +--- |
| 21 | + |
| 22 | +# Var-length RETURN projects single-edge JSON against varlen CTE alias |
| 23 | + |
| 24 | +## Reproducer |
| 25 | + |
| 26 | +```sql |
| 27 | +.load build/graphqlite.dylib |
| 28 | +SELECT cypher('CREATE ()-[:T]->()'); |
| 29 | +SELECT cypher('MATCH (a)-[r*1..1]->(b) RETURN r'); |
| 30 | +-- Error: "no such column: _gql_default_alias_2.id" |
| 31 | +``` |
| 32 | + |
| 33 | +Expected: `[[{type: "T", ...}]]` (list containing one path of edges). |
| 34 | + |
| 35 | +## Root cause |
| 36 | + |
| 37 | +For a variable-length pattern `[r*1..1]`, the FROM clause aliases the |
| 38 | +recursive CTE as `_varlen_path_1 AS _gql_default_alias_2`. The CTE has |
| 39 | +columns `start_id, end_id, depth, path_ids, visited` — but the RETURN |
| 40 | +projection emits a single-edge JSON template referencing |
| 41 | +`_gql_default_alias_2.id`, `_gql_default_alias_2.type`, etc. as if the |
| 42 | +alias were a row from the `edges` table. |
| 43 | + |
| 44 | +Concrete bad SQL (formatted): |
| 45 | + |
| 46 | +```sql |
| 47 | +WITH RECURSIVE _varlen_path_1(start_id, end_id, depth, path_ids, visited) AS (...) |
| 48 | +SELECT (CASE WHEN _gql_default_alias_2.id IS NULL THEN NULL |
| 49 | + ELSE json_object('id', _gql_default_alias_2.id, |
| 50 | + 'type', _gql_default_alias_2.type, ...) END) |
| 51 | +FROM nodes AS _gql_default_alias_0 |
| 52 | +CROSS JOIN _varlen_path_1 AS _gql_default_alias_2 |
| 53 | +CROSS JOIN nodes AS _gql_default_alias_1 |
| 54 | +WHERE _gql_default_alias_2.start_id = _gql_default_alias_0.id |
| 55 | + AND _gql_default_alias_2.end_id = _gql_default_alias_1.id |
| 56 | + AND _gql_default_alias_2.depth BETWEEN 1 AND 1 |
| 57 | +-- Errors: no such column: _gql_default_alias_2.id |
| 58 | +``` |
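
The failure can be reproduced without the extension. The sketch below is a minimal illustration using Python's stdlib `sqlite3`; the toy `edges` schema and the non-recursive stand-in CTE named `path` are assumptions made for the example, not the generated SQL itself. It shows that an alias over a CTE exposes only the CTE's declared columns, so an edge-style reference to `.id` cannot resolve.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy schema (assumed): only the columns needed for the illustration.
conn.execute(
    "CREATE TABLE edges (id INTEGER PRIMARY KEY, type TEXT, "
    "source_id INTEGER, target_id INTEGER)"
)
conn.execute("INSERT INTO edges VALUES (1, 'T', 10, 11)")

# Stand-in for the varlen CTE: it exposes path columns, but no `id` or `type`.
CTE = """
WITH path(start_id, end_id, depth, path_ids) AS (
    SELECT source_id, target_id, 1, CAST(id AS TEXT) FROM edges
)
"""

# Referencing a real CTE column through the alias works.
rows = conn.execute(CTE + "SELECT r.path_ids FROM path AS r").fetchall()

# Referencing an edge column through the same alias fails at prepare time.
try:
    conn.execute(CTE + "SELECT r.id FROM path AS r")
    error = None
except sqlite3.OperationalError as exc:
    error = str(exc)

print(rows, error)
```
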
| 59 | + |
| 60 | +## Fix sketch |
| 61 | + |
| 62 | +When `r` is bound to a variable-length pattern, its kind is still |
| 63 | +VAR_KIND_EDGE, but it represents a sequence of edges and its alias |
| 64 | +points at the path CTE. In that case the RETURN projection must emit a |
| 65 | +JSON array constructed from `path_ids` rather than a single edge JSON object: |
| 66 | + |
| 67 | +```sql |
| 68 | +SELECT (SELECT json_group_array(json_object( |
| 69 | + 'id', e.id, 'type', e.type, 'startNodeId', e.source_id, |
| 70 | + 'endNodeId', e.target_id, 'properties', ...)) |
| 71 | + FROM edges e |
| 72 | + WHERE e.id IN ( |
| 73 | + SELECT CAST(value AS INTEGER) |
| 74 | + FROM json_each('[' || _gql_default_alias_2.path_ids || ']') |
| 75 | + )) |
| 76 | +``` |
| 77 | + |
| 78 | +(or similar, for example a helper UDF `_gql_path_edges_json(path_ids)`). |
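
To sanity-check the list projection, here is a small sketch using Python's stdlib `sqlite3`. It assumes that `path_ids` is a comma-separated string of edge ids, that the linked SQLite has the JSON functions available, and that `edges` has the columns named in the sketch above; the `path_edges` helper and the `:path_ids` parameter are hypothetical names used only for this example. It joins against `json_each` rather than using `IN`, which is a swapped-in variation so that an edge repeated in a path is not collapsed.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Toy schema (assumed) matching the column names used in the fix sketch.
conn.executescript("""
    CREATE TABLE edges (id INTEGER PRIMARY KEY, type TEXT,
                        source_id INTEGER, target_id INTEGER);
    INSERT INTO edges VALUES (1, 'T', 10, 11), (2, 'T', 11, 12);
""")

# Expand path_ids (assumed format: '1,2,...') into rows, then aggregate the
# matching edges into one JSON array.
LIST_PROJECTION = """
SELECT json_group_array(json_object(
           'id', e.id, 'type', e.type,
           'startNodeId', e.source_id, 'endNodeId', e.target_id))
FROM json_each('[' || :path_ids || ']') AS step
JOIN edges AS e ON e.id = CAST(step.value AS INTEGER)
"""


def path_edges(connection, path_ids):
    """Return the edges named in path_ids as a list of dicts (hypothetical helper)."""
    (payload,) = connection.execute(LIST_PROJECTION, {"path_ids": path_ids}).fetchone()
    return json.loads(payload)


single = path_edges(conn, "1")
double = path_edges(conn, "1,2")
print(single)
print(len(double))
```

Element order inside `json_group_array` is not guaranteed by this query. If path order matters, one option is the aggregate `ORDER BY` clause available from SQLite 3.44 (ordering on `step.key`); whether the project can rely on that version is an open question.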
| 79 | + |
| 80 | +This needs the projection code (transform_return.c, or wherever edge |
| 81 | +JSON is emitted) to: |
| 82 | +1. Detect that the variable being projected is a varlen-bound edge |
| 83 | + (open question: does a `transform_var->is_varlen` flag or a similar marker already exist?). |
| 84 | +2. Branch to the list-of-edges projection instead of single-edge. |
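
The two steps above can be sketched as a single branch in the projector. The snippet below is a language-neutral illustration written in Python, not the C code in `transform_return.c`; `BoundVar`, its `is_varlen` field, and `edge_projection_sql` are hypothetical names, and the emitted SQL is abbreviated to the `id` and `type` fields.

```python
from dataclasses import dataclass


@dataclass
class BoundVar:
    """Hypothetical stand-in for the transformer's per-variable record."""
    alias: str       # SQL alias the variable resolves to
    is_varlen: bool  # assumed marker: True when bound to a pattern like [r*m..n]


def edge_projection_sql(var: BoundVar) -> str:
    """Return the SQL expression used to project an edge variable in RETURN."""
    if var.is_varlen:
        # The alias points at the path CTE: build a JSON array from path_ids.
        template = (
            "(SELECT json_group_array(json_object('id', e.id, 'type', e.type)) "
            "FROM json_each('[' || {a}.path_ids || ']') AS step "
            "JOIN edges AS e ON e.id = CAST(step.value AS INTEGER))"
        )
    else:
        # The alias points at a row of the edges table: single-edge object.
        template = (
            "(CASE WHEN {a}.id IS NULL THEN NULL "
            "ELSE json_object('id', {a}.id, 'type', {a}.type) END)"
        )
    return template.format(a=var.alias)


varlen_sql = edge_projection_sql(BoundVar("_gql_default_alias_2", True))
single_sql = edge_projection_sql(BoundVar("_gql_default_alias_2", False))
print(varlen_sql)
print(single_sql)
```

Keeping the decision in one place means the non-varlen path keeps emitting exactly the template it emits today, which is what the regression criterion below asks for.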
| 85 | + |
| 86 | +## Affected TCK (22 errors) |
| 87 | + |
| 88 | +Listed here (13 scenarios): Match4 [1] [5] [6] [7], Match7 [13] [20], |
| 89 | +Match9 [1] [2] [3] [4] [5] [6] [7]. Additional follow-on failures are |
| 90 | +likely to surface after this fix. ~22 error scenarios share this root cause. |
| 91 | + |
| 92 | +## Affected files |
| 93 | + |
| 94 | +- `src/backend/transform/transform_return.c` — the edge-projection |
| 95 | + template emission (json_object with edge fields). |
| 96 | +- `src/backend/transform/transform_match.c` — sets up the varlen CTE |
| 97 | + and alias; the var_kind / is_varlen info has to flow to the |
| 98 | + projector. |
| 99 | + |
| 100 | +## Acceptance Criteria |
| 101 | + |
| 102 | +- [ ] `MATCH (a)-[r*1..1]->(b) RETURN r` returns `[[{type: 'T'}]]` for |
| 103 | + a single edge of type T. |
| 104 | +- [ ] `MATCH (a)-[r*1..2]->(b) RETURN r` returns the list of edge |
| 105 | + lists, one per path of length 1 or 2. |
| 106 | +- [ ] No regression in non-varlen edge projection |
| 107 | + (`MATCH ()-[r]->() RETURN r`). |
| 108 | +- [ ] 22 `_gql_default_alias` errors in TCK flip to pass (or to a |
| 109 | + different failure). |
| 110 | + |
| 111 | +## Discovered |
| 112 | + |
| 113 | +2026-05-21, during iteration 38 of the open-work queue, by dumping the |
| 114 | +generated SQL to /tmp/badsql.log with a debug print. |