
feat(duckdb): Add transpilation support for COLLATION function#7443

Merged
georgesittas merged 2 commits into main from
RD-1147700-collation
Apr 6, 2026

@fivetran-amrutabhimsenayachit
Collaborator

DuckDB handles collation at the column-definition level via the COLLATE clause, not as a queryable runtime function. There is no DuckDB equivalent that returns the collation specification of an expression.

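Fix: when the argument of COLLATION is an explicit COLLATE expression, the collation spec is already present in the AST, e.g.: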

Collation(
  this = Collate(
    this = Literal('abc'),
    expression = Literal('en-ci')   ← the spec is RIGHT HERE
  )
)

If there's an explicit COLLATE 'spec' attached and the spec is non-empty, the answer is already known at transpile time — so just write that spec as a plain string literal.
Otherwise, there is no way to know the collation, so write NULL, which is exactly what Snowflake returns for uncollated expressions.
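
To make the rule concrete, here is a minimal sketch of the decision in isolation. It uses toy dataclasses as stand-ins for the AST nodes; `Literal`, `Collate`, `Collation`, and `collation_to_duckdb` below are hypothetical names for illustration and are not sqlglot's classes or API.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass
class Literal:
    """Toy stand-in for a literal node; value=None models SQL NULL."""
    value: Optional[str]


@dataclass
class Collate:
    """Toy stand-in for `<this> COLLATE '<spec>'`."""
    this: Literal
    spec: str


@dataclass
class Collation:
    """Toy stand-in for the COLLATION(<this>) function call."""
    this: Union[Literal, Collate]


def collation_to_duckdb(node: Collation, warnings: List[str]) -> str:
    """Return DuckDB SQL text for COLLATION(...) under the rule described above."""
    arg = node.this
    # Explicit, non-empty COLLATE spec: the answer is known at transpile time.
    if isinstance(arg, Collate) and arg.spec:
        return "'" + arg.spec.replace("'", "''") + "'"
    # Otherwise the collation cannot be determined statically: warn and emit NULL.
    warnings.append("COLLATION function is not supported by DuckDB")
    return "NULL"


warnings: List[str] = []
results = [
    collation_to_duckdb(Collation(Literal("abc")), warnings),
    collation_to_duckdb(Collation(Collate(Literal("abc"), "en-ci")), warnings),
    collation_to_duckdb(Collation(Collate(Literal("abc"), "")), warnings),
]
print(results)
print(len(warnings))
```

Only the uncollated and empty-spec cases record a warning, which mirrors the per-column behaviour in the transpile run that follows.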

 python3 << 'EOF'
import sqlglot
q = "SELECT COLLATION('abc') AS lit_no_coll, COLLATION(NULL) AS null_input, COLLATION('abc' COLLATE 'en-ci') AS lit_en_ci, COLLATION('abc' COLLATE 'de-ci-pi') AS lit_complex, COLLATION('abc' COLLATE 'utf8') AS lit_utf8, COLLATION('abc' COLLATE '') AS lit_empty"
print(sqlglot.transpile(q, read='snowflake', write='duckdb')[0])
EOF
-->
COLLATION function is not supported by DuckDB
COLLATION function is not supported by DuckDB
COLLATION function is not supported by DuckDB

"SELECT NULL AS lit_no_coll, NULL AS null_input, 'en-ci' AS lit_en_ci, 'de-ci-pi' AS lit_complex, 'utf8' AS lit_utf8, NULL AS lit_empty"
┌─────────────┬────────────┬───────────┬─────────────┬──────────┬───────────┐
│ lit_no_coll │ null_input │ lit_en_ci │ lit_complex │ lit_utf8 │ lit_empty │
│    int32    │   int32    │  varchar  │   varchar   │ varchar  │   int32   │
├─────────────┼────────────┼───────────┼─────────────┼──────────┼───────────┤
│        NULL │       NULL │ en-ci     │ de-ci-pi    │ utf8     │      NULL │
└─────────────┴────────────┴───────────┴─────────────┴──────────┴───────────┘

Comment on lines +2965 to +2971
    def collation_sql(self, expression: exp.Collation) -> str:
        this = expression.this
        if isinstance(this, exp.Collate) and this.expression.name:
            return self.sql(this.expression)
        self.unsupported("COLLATION function is not supported by DuckDB")
        return self.sql(exp.null())

Collaborator


@fivetran-amrutabhimsenayachit I don't think we should handle Collate as a special case. It's similar to why we don't want to handle NULL values explicitly. I wouldn't expect COLLATION(x COLLATE y) to show up in the wild, because that trivially evaluates to y.

I think we only want to call unsupported and fall back to function_fallback_sql, rather than emitting NULL.
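
For illustration, a rough sketch of what this suggestion amounts to: record the warning, then emit the call unchanged, with no branch for COLLATE or NULL. `ToyGenerator` and its `function_fallback` method are hypothetical stand-ins written for this example, not sqlglot's Generator, and this is not necessarily the code that was merged.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyGenerator:
    """Toy stand-in for a SQL generator (hypothetical, not sqlglot's class)."""
    unsupported_messages: List[str] = field(default_factory=list)

    def unsupported(self, message: str) -> None:
        # Record the warning instead of raising, so generation continues.
        self.unsupported_messages.append(message)

    def function_fallback(self, name: str, args: List[str]) -> str:
        # Emit the call as-is: NAME(arg1, arg2, ...).
        return f"{name}({', '.join(args)})"

    def collation_sql(self, arg_sql: str) -> str:
        # No special case for COLLATE or NULL: warn, then pass the call through.
        self.unsupported("COLLATION function is not supported by DuckDB")
        return self.function_fallback("COLLATION", [arg_sql])


gen = ToyGenerator()
plain = gen.collation_sql("'abc'")
collated = gen.collation_sql("'abc' COLLATE 'en-ci'")
print(plain)
print(collated)
print(gen.unsupported_messages)
```

The trade-off in this sketch is that the output keeps a call the target engine may reject, but the generator never guesses a value on the user's behalf.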

@georgesittas georgesittas force-pushed the RD-1147700-collation branch from 40a9845 to b8cfc52 Compare April 6, 2026 12:05
@georgesittas georgesittas merged commit 4c29711 into main Apr 6, 2026
8 checks passed
@georgesittas georgesittas deleted the RD-1147700-collation branch April 6, 2026 12:07
@github-actions
Contributor

github-actions bot commented Apr 6, 2026

SQLGlot Integration Test Results

Comparing:

  • this branch (sqlglot:RD-1147700-collation, sqlglot version: RD-1147700-collation)
  • baseline (main, sqlglot version: 0.0.1.dev1)

By Dialect

| dialect | main | sqlglot:RD-1147700-collation | transitions | links |
| --- | --- | --- | --- | --- |
| bigquery -> bigquery | 22921/22926 passed (100.0%) | 21255/21255 passed (100.0%) | No change | full result / delta |
| bigquery -> duckdb | 1300/1670 passed (77.8%) | 0/0 passed (0.0%) | No change | full result / delta |
| duckdb -> duckdb | 2425/2425 passed (100.0%) | 2425/2425 passed (100.0%) | No change | full result / delta |
| snowflake -> duckdb | 1515/2674 passed (56.7%) | 0/0 passed (0.0%) | No change | full result / delta |
| snowflake -> snowflake | 65923/65923 passed (100.0%) | 63028/63028 passed (100.0%) | No change | full result / delta |
| databricks -> databricks | 1370/1370 passed (100.0%) | 1370/1370 passed (100.0%) | No change | full result / delta |
| postgres -> postgres | 6042/6042 passed (100.0%) | 6042/6042 passed (100.0%) | No change | full result / delta |
| redshift -> redshift | 7101/7101 passed (100.0%) | 7101/7101 passed (100.0%) | No change | full result / delta |

Overall

main: 110131 total, 108597 passed (pass rate: 98.6%), sqlglot version: 0.0.1.dev1

sqlglot:RD-1147700-collation: 101221 total, 101221 passed (pass rate: 100.0%), sqlglot version: RD-1147700-collation

Transitions:
No change

✅ 10 test(s) passed
