Fix JSON path type accessor parsing and projection ORDER BY#109

Merged
kyleconroy merged 22 commits into main from claude/fix-tests-loop-cxno8
Dec 31, 2025

@kyleconroy
Collaborator

- Add handling for `.:TypeName` syntax in parseArrayAccess after `[]`
  This allows parsing expressions like `json.c[].d.:Int64`

- Fix projection ORDER BY to use parseExpression instead of just reading
  a single identifier, allowing qualified identifiers like `json.a`, `t.a`,
  and `json.c[].d.:Int64`

Fixes 22 statements in 03464_projections_with_subcolumns and
additional statements in other tests.
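
To illustrate the `.:TypeName` accessor described above, here is a minimal sketch that separates the trailing type accessor from a dotted JSON path. The helper name `splitTypedPath` is hypothetical and is not the code in `parseArrayAccess`; it only shows the shape of the syntax on plain strings.

```go
package main

import "strings"

// splitTypedPath is a hypothetical helper, not the parser's real code.
// It splits a path such as json.c[].d.:Int64 into the path before the
// ".:" accessor and the type name after it. If there is no accessor,
// the type name is empty.
func splitTypedPath(path string) (base string, typeName string) {
	idx := strings.LastIndex(path, ".:")
	if idx < 0 {
		return path, ""
	}
	return path[:idx], path[idx+2:]
}
```

Under these assumptions, `splitTypedPath("json.c[].d.:Int64")` yields the base `json.c[].d` and the type name `Int64`.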

- Add AlterFetchPartition and AlterMovePartition types to AST
- Add FromPath field to AlterCommand for FETCH PARTITION FROM
- Implement FETCH PARTITION parsing in parseAlterCommand
- Fix children count for PARTITION ALL (now counts as 1 child)
- Add explain support for FETCH_PARTITION and MOVE_PARTITION

Fixes stmt42 in 00753_alter_attach and other FETCH PARTITION tests.

When parsing table expressions, allow keywords (like FIRST, SECOND_) to be
used as table aliases without requiring the AS keyword, as long as the
keyword is not a clause keyword.

Added more keywords to isKeywordForClause() to prevent them from being
incorrectly parsed as aliases:
- ARRAY (for ARRAY JOIN)
- WINDOW (for window functions)
- WITH (for WITH clause/CTEs)
- INTERSECT (for set operations)
- SELECT (for FROM...SELECT syntax)
- TOTALS (for WITH TOTALS)

Fixes 21 statements in 00674_join_on_syntax test.
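
The alias rule above can be sketched as a lookup against a set of clause keywords. This is only an illustration: the set below contains just the keywords named in this commit message, the real `isKeywordForClause()` presumably covers more, and `canBeImplicitAlias` is a hypothetical name.

```go
package main

import "strings"

// clauseKeywords holds only the keywords listed in the commit message.
// The real isKeywordForClause() is assumed to recognize additional ones.
var clauseKeywords = map[string]bool{
	"ARRAY":     true, // ARRAY JOIN
	"WINDOW":    true, // window definitions
	"WITH":      true, // WITH clause / CTEs
	"INTERSECT": true, // set operations
	"SELECT":    true, // FROM ... SELECT syntax
	"TOTALS":    true, // WITH TOTALS
}

// canBeImplicitAlias (hypothetical) reports whether a keyword token may be
// used as a table alias without AS, i.e. it does not start a clause.
func canBeImplicitAlias(word string) bool {
	return !clauseKeywords[strings.ToUpper(word)]
}
```

With this sketch, `FIRST` is accepted as an alias while `ARRAY` is left for the ARRAY JOIN clause.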

- Handle MySQL-compatible INT types (INT, TINYINT, SMALLINT, MEDIUMINT,
  BIGINT) by ignoring the display width parameter like INT(11)
- Handle UNSIGNED and SIGNED modifiers for INT types, appending them
  to the type name
- Fix SHOW CREATE TEMPORARY TABLE parsing to skip TEMPORARY keyword

Fixes 20 statements in 02271_int_sql_compatibility test.

- Map anyMatch to in (for expr == any subquery)
- Map allMatch to notIn (for expr != all subquery)

These SQL standard comparison operators with subqueries are normalized
to ClickHouse's IN/NOT IN functions in EXPLAIN AST output.

Fixes 5 statements in 02007_test_any_all_operators test.

Parse SQL standard typed literals like:
- DATE '2022-01-01' as toDate('2022-01-01')
- TIMESTAMP '...' as toDateTime('...')
- TIME '...' as toTime('...')

Fixes stmt14 in 02160_special_functions test.
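
The typed-literal mapping listed above amounts to a keyword-to-function table. A minimal sketch, assuming the literal keyword and the quoted value have already been tokenized; `rewriteTypedLiteral` is a hypothetical name and works on strings rather than AST nodes.

```go
package main

import "strings"

// typedLiteralFuncs mirrors the three mappings named in the commit message.
var typedLiteralFuncs = map[string]string{
	"DATE":      "toDate",
	"TIMESTAMP": "toDateTime",
	"TIME":      "toTime",
}

// rewriteTypedLiteral (hypothetical) turns a typed literal such as
// DATE '2022-01-01' into the equivalent function call text. The value is
// passed without its surrounding quotes. ok is false for unknown keywords.
func rewriteTypedLiteral(keyword, value string) (call string, ok bool) {
	fn, found := typedLiteralFuncs[strings.ToUpper(keyword)]
	if !found {
		return "", false
	}
	return fn + "('" + value + "')", true
}
```

For example, `rewriteTypedLiteral("DATE", "2022-01-01")` produces `toDate('2022-01-01')`.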

- Add GroupByAll field to SelectQuery AST struct
- Detect GROUP BY ALL in parser and set the flag
- Skip outputting GROUP BY expression list in EXPLAIN for GROUP BY ALL

In ClickHouse, GROUP BY ALL is a special syntax that doesn't produce
a GROUP BY expression list in the EXPLAIN AST output.

Fixes 20 statements in 02459_group_by_all test.

This adds proper parsing and formatting for SKIP clauses in JSON type
parameters, including:
- SKIP path (e.g., SKIP a.b for dotted paths)
- SKIP REGEXP 'pattern' (for regex-based path matching)

These are used in JSON type casts like:
  json::JSON(SKIP a.b, max_dynamic_paths=2)
  json::JSON(SKIP REGEXP '.*a.*', max_dynamic_paths=2)

- Add ColumnTransformer struct to AST to preserve transformer ordering
- Add Transformers field to Asterisk and ColumnsMatcher structs
- Add parseColumnsApply, parseColumnsExcept, parseColumnsReplace functions
- Update explain functions to output transformers in query order
- Remove inline EXCEPT handling from parseColumnsMatcher for proper infix parsing

Fixes explain tests for column transformers across multiple test suites.

…OVE COMMENT

Parser changes:
- Add parseCreateQuota function returning CreateQuotaQuery
- Add parseSetRole function for SET DEFAULT ROLE statements
- Handle MODIFY COLUMN ... REMOVE COMMENT syntax
- Add FORMAT handling for SHOW GRANTS, SHOW CREATE ROLE/USER/POLICY/QUOTA/SETTINGS PROFILE

AST changes:
- Add CreateQuotaQuery struct
- Add SetRoleQuery struct
- Add Format field to ShowGrantsQuery, ShowCreateRoleQuery, ShowCreateRowPolicyQuery,
  ShowCreateQuotaQuery, ShowCreateSettingsProfileQuery

Explain changes:
- Add handlers for new statement types
- Fix ColumnDeclaration to omit (children 0) when no children
- Fix ALTER command type mapping (CLEAR_COLUMN -> DROP_COLUMN, DELETE_WHERE -> DELETE)
- Fix TRUNCATE/CHECK queries to output db/table as separate identifiers
- Add FORMAT handling for DESCRIBE, SHOW queries

Fixes all 20 statements in 01702_system_query_log and many others.

Implement ClickHouse-compatible transformations for special functions in
the explain output:

- DATE_ADD/DATEADD/TIMESTAMP_ADD/TIMESTAMPADD -> plus()
  - 3-arg form: (unit, n, date) -> plus(date, toIntervalUnit(n))
  - 2-arg form: (interval, date) -> plus(interval, date)

- DATE_SUB/DATESUB/TIMESTAMP_SUB/TIMESTAMPSUB -> minus()
  - 3-arg form: (unit, n, date) -> minus(date, toIntervalUnit(n))
  - 2-arg form: (date, interval) -> minus(date, interval)

- DATE_DIFF/DATEDIFF -> dateDiff()
  - (unit, date1, date2) -> dateDiff('unit', date1, date2)

- POSITION with IN syntax: POSITION('ll' IN 'Hello') -> position('Hello', 'll')

Also normalize POSITION function name to lowercase in all cases.

Fixes 02160_special_functions and many other related tests.
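
The 3-arg rewrites and the POSITION swap above can be sketched on plain strings as follows. The function names are hypothetical, and the assumption that `toIntervalUnit` stands for constructors like `toIntervalDay` is mine, not stated in the commit.

```go
package main

import "strings"

// intervalFunc builds an interval constructor name such as toIntervalDay
// from a unit keyword like DAY (assumed expansion of "toIntervalUnit").
func intervalFunc(unit string) string {
	u := strings.ToLower(unit)
	if u == "" {
		return "toInterval"
	}
	return "toInterval" + strings.ToUpper(u[:1]) + u[1:]
}

// rewriteDateArith3 (hypothetical) handles the 3-arg form:
// (unit, n, date) -> op(date, toIntervalUnit(n)), with op "plus" or "minus".
func rewriteDateArith3(op, unit, n, date string) string {
	return op + "(" + date + ", " + intervalFunc(unit) + "(" + n + "))"
}

// rewritePositionIn (hypothetical) swaps the operands of
// POSITION(needle IN haystack) into position(haystack, needle).
func rewritePositionIn(needle, haystack string) string {
	return "position(" + haystack + ", " + needle + ")"
}
```

Here `rewriteDateArith3("plus", "DAY", "1", "d")` gives `plus(d, toIntervalDay(1))`, and `rewritePositionIn("'ll'", "'Hello'")` gives `position('Hello', 'll')`.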

- Add parsing for SHOW INDEX/INDEXES/INDICES/KEYS statements
  Maps to ShowColumns type as ClickHouse does internally
- Handle SHOW EXTENDED INDEX syntax
- Handle SHOW INDEX FROM table FROM database syntax
- Add EscapeIdentifier function to escape single quotes as \'
- Apply escaping to DropQuery and CreateQuery output

Fixes 02724_show_indexes and related tests.
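
The escaping rule mentioned above (single quotes become `\'`) can be sketched with one standard-library call. The name `escapeSingleQuotes` is hypothetical; the real `EscapeIdentifier` may handle more cases than this.

```go
package main

import "strings"

// escapeSingleQuotes (hypothetical) replaces every single quote with \'.
// It covers only the rule named in the commit message.
func escapeSingleQuotes(s string) string {
	return strings.ReplaceAll(s, "'", `\'`)
}
```

An identifier without quotes passes through unchanged, so applying this to all DropQuery and CreateQuery names is harmless for ordinary names.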

- Fix lexer to handle identifiers starting with numbers after a dot (e.g., db.03711_table)
- Add isIdentifierAfterDot() to distinguish 03711_table from decimal numbers
- Fix TO clause in CREATE MATERIALIZED VIEW to support qualified names (database.table)
- Add ToDatabase field to CreateQuery for materialized view targets
- Arrays with numeric primitives use string format ('[1, 2, 3]') in :: casts
- Arrays with booleans use Array_[Bool_0, Bool_1] format
- Arrays with NULLs use Array_[NULL, ...] format
- Arrays with strings use string format ('[\'foo\', \'bar\']')
- Aliases are always shown for :: cast syntax with arrays/tuples

This fixes 15+ statements in 02708_dotProduct and many other tests.
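
For the `isIdentifierAfterDot()` check mentioned above, one possible approach is to scan the identifier characters following the dot and see whether any of them is a letter or underscore. This is a sketch under that assumption, with a hypothetical name, and it deliberately ignores exponent forms such as `1e5`.

```go
package main

// looksLikeIdentifierAfterDot (hypothetical) inspects the text that follows
// a dot. It returns true when the leading run of [0-9A-Za-z_] characters
// contains at least one letter or underscore, as in 03711_table, and false
// when the run is digits only, as in the fractional part of 1.25.
// Exponent notation (e.g. 1e5) is not handled by this sketch.
func looksLikeIdentifierAfterDot(rest string) bool {
	sawNonDigit := false
	for i := 0; i < len(rest); i++ {
		c := rest[i]
		switch {
		case c >= '0' && c <= '9':
			// digits are valid in both numbers and identifiers
		case c == '_' || (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'):
			sawNonDigit = true
		default:
			// first character outside the identifier alphabet ends the run
			return sawNonDigit
		}
	}
	return sawNonDigit
}
```

With this check, the `03711_table` in `db.03711_table` is treated as an identifier, while the `25` in `1.25` stays part of a number.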

…erging

When parsing multi-param lambdas like `acc,x -> body` where parameters are
separated by commas without parentheses, the parser was incorrectly merging
preceding identifiers with explicitly parenthesized lambdas like `(x -> y)`.

This fix:
- Adds `Parenthesized bool` field to ast.Lambda
- Sets this flag when a lambda is wrapped in explicit parentheses
- Checks the flag in mergeMultiParamLambdas to skip parenthesized lambdas

This correctly parses:
- `arrayFold(acc,x -> body, arr, init)` - merges acc into lambda params
- `delay(time, (time -> 0.5), ...)` - keeps time as separate argument

Fixes all 17 statements in 02718_array_fold test and 3 in 02418_aggregate_combinators.
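
The role of the `Parenthesized` flag can be sketched with simplified stand-in types. Everything below is illustrative: `Expr`, `Ident`, this `Lambda`, and `mergeLambdaParamsSketch` are hypothetical and much smaller than the real `ast` types and `mergeMultiParamLambdas`, and the greedy merging of all preceding identifiers is an assumption.

```go
package main

// Expr, Ident and Lambda are simplified stand-ins for the real AST types.
type Expr interface{}

type Ident struct{ Name string }

type Lambda struct {
	Params        []string
	Body          string
	Parenthesized bool // set when the lambda was written as (x -> y)
}

// mergeLambdaParamsSketch folds identifiers that directly precede an
// unparenthesized lambda into that lambda's parameter list. Lambdas with
// Parenthesized set are left alone, so preceding arguments stay separate.
func mergeLambdaParamsSketch(args []Expr) []Expr {
	out := make([]Expr, 0, len(args))
	for _, arg := range args {
		if lam, ok := arg.(*Lambda); ok && !lam.Parenthesized {
			for len(out) > 0 {
				id, isIdent := out[len(out)-1].(*Ident)
				if !isIdent {
					break
				}
				lam.Params = append([]string{id.Name}, lam.Params...)
				out = out[:len(out)-1]
			}
		}
		out = append(out, arg)
	}
	return out
}
```

For `arrayFold(acc,x -> body, arr, init)` the sketch merges `acc` into the lambda, while for `delay(time, (time -> 0.5), ...)` the flag keeps `time` as its own argument.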

Adds support for parsing and explaining UNDROP TABLE statements:
- UNDROP TABLE name
- UNDROP TABLE db.name
- UNDROP TABLE name ON CLUSTER cluster
- UNDROP TABLE name UUID 'uuid'

Fixes 7 statements in 02681_undrop_query test (16→9 pending).

When formatting IN clauses like `i in (1, 3, NULL)`, the list can now be
combined into a compact tuple literal format when it contains a mix of:
- Numeric values + NULLs
- String values + NULLs

Previously, the presence of NULL would force the verbose Function tuple
format. Now `in (1, 3, NULL)` correctly outputs `Literal Tuple_(UInt64_1, UInt64_3, NULL)`.

Fixes all 16 statements in 01231_operator_null_in and many statements in:
- 00441_nulls_in
- 00939_test_null_in
- 01410_nullable_key_and_index
- 01507_transform_null_in
- 01558_transform_null_in
- 01756_optimize_skip_unused_shards_rewrite_in
- 02499_analyzer_aggregate_function_lambda_crash_fix
- 03234_evaluate_constant_analyzer
- 03393_non_constant_second_argument_for_in
- 03578_kv_in_type_casts
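
The compact tuple output described above can be sketched for the unsigned-integer case. The `item` type and `formatUIntTuple` are hypothetical; the sketch covers only unsigned integers and NULLs and does not attempt the string case.

```go
package main

import (
	"strconv"
	"strings"
)

// item is a hypothetical stand-in for a literal in an IN list:
// either NULL or an unsigned integer value.
type item struct {
	isNull bool
	value  uint64
}

// formatUIntTuple renders a list of unsigned integers and NULLs in the
// compact form quoted in the commit message.
func formatUIntTuple(items []item) string {
	parts := make([]string, len(items))
	for i, it := range items {
		if it.isNull {
			parts[i] = "NULL"
		} else {
			parts[i] = "UInt64_" + strconv.FormatUint(it.value, 10)
		}
	}
	return "Literal Tuple_(" + strings.Join(parts, ", ") + ")"
}
```

For the list `(1, 3, NULL)` this produces `Literal Tuple_(UInt64_1, UInt64_3, NULL)`, matching the output quoted above.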

Adds support for lambda expressions in APPLY transformers:
- `* APPLY x -> toString(x)`
- `COLUMNS(...) APPLY x -> expr`

Previously only simple function names were supported (e.g., `* APPLY toString`).

This adds:
- ApplyLambda field to ast.ColumnTransformer struct
- Lambda detection and parsing in parseAsteriskApply and parseColumnsApply

Fixes 14 statements in 02378_analyzer_projection_names (16→2 pending) and
improves several other tests.

Handle both plural interval forms like INTERVAL 2 years and
string literal intervals like INTERVAL '2 years' by normalizing
the unit name and extracting value/unit from strings.

Changes:
- Add normalizeIntervalUnit() to strip trailing 's' and title-case
- Update explainIntervalExpr() to parse string literal intervals
- Update explainDateAddSubResult() to use normalizeIntervalUnit()

Fixes 28 statements across 6 tests including 02884_interval_operator_support_plural_literal.
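
The normalization described above (strip a trailing 's', title-case the unit, and pull value and unit out of a string literal) might look roughly like this. Both function names are hypothetical and the real `normalizeIntervalUnit()` and `explainIntervalExpr()` may differ; the sketch assumes the string holds exactly one value and one unit.

```go
package main

import "strings"

// normalizeUnitSketch (hypothetical) lowercases the unit, strips one
// trailing "s", and upper-cases the first letter: "years" -> "Year".
func normalizeUnitSketch(unit string) string {
	u := strings.ToLower(strings.TrimSpace(unit))
	u = strings.TrimSuffix(u, "s")
	if u == "" {
		return ""
	}
	return strings.ToUpper(u[:1]) + u[1:]
}

// splitIntervalString (hypothetical) extracts the value and normalized unit
// from the contents of a string literal interval such as "2 years".
// ok is false when the string is not exactly "<value> <unit>".
func splitIntervalString(s string) (value, unit string, ok bool) {
	fields := strings.Fields(s)
	if len(fields) != 2 {
		return "", "", false
	}
	return fields[0], normalizeUnitSketch(fields[1]), true
}
```

Under these assumptions, both `INTERVAL 2 years` and `INTERVAL '2 years'` reduce to the value `2` with the unit `Year`.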

Support for aliases on the value expression in INTERVAL syntax,
like INTERVAL '2' AS n minute.

Fixes 01523_interval_operator_support_string_literal (now all 25 pass).

When casting NULL with :: operator syntax (e.g. NULL::Nullable(UInt8)),
output as Literal NULL rather than Literal \'NULL\'.

Fixes many statements across 27 tests.

@kyleconroy merged commit 3f729a2 into main on Dec 31, 2025
1 check passed