Skip to content

feat(parser): ClickHouse CODEC(...) column option (#482)#497

Merged
ajitpratap0 merged 1 commit intomainfrom
feat/clickhouse-codec-482
Apr 8, 2026
Merged

feat(parser): ClickHouse CODEC(...) column option (#482)#497
ajitpratap0 merged 1 commit intomainfrom
feat/clickhouse-codec-482

Conversation

@ajitpratap0
Copy link
Copy Markdown
Owner

Summary

Adds support for ClickHouse `CODEC(...)` column option in `CREATE TABLE`.

What now parses

  • `payload String CODEC(ZSTD(3))`
  • `ts DateTime CODEC(Delta, LZ4)`
  • `id UInt64 CODEC(Delta(8), ZSTD)`

Implementation

  • New ClickHouse-gated loop in `parseColumnDef` between the type and the standard constraint list. Consumes `CODEC` followed by a balanced `(...)` block via the existing `parseTypeArgsString` helper.
  • The codec is appended to the column's `Type` string as ` CODEC(...)` so formatters round-trip correctly. Not yet modeled as a first-class AST field.

Test plan

  • `TestClickHouseCODEC`: 3 shapes — single codec, chained codecs, codec with parameterised inner call
  • `go test -race ./pkg/sql/parser/ ./pkg/gosqlx/` green

Part of #482.

🤖 Generated with Claude Code

@vercel
Copy link
Copy Markdown

vercel bot commented Apr 8, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
gosqlx-website Ready Ready Preview, Comment Apr 8, 2026 4:57pm

@github-actions github-actions bot added the parser label Apr 8, 2026
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 8, 2026

Claude finished @ajitpratap0's task —— View job


PR Review: ClickHouse CODEC Column Option

Todo List:

  • Read and analyze changed files
  • Review code quality and implementation approach
  • Check for potential bugs and issues
  • Evaluate performance considerations
  • Review security concerns
  • Assess test coverage adequacy
  • Provide comprehensive feedback

Code Quality & Implementation

✅ Positive Aspects:

  1. Clean Integration: The CODEC parsing is cleanly integrated into the existing parseColumnDef function with proper dialect gating (pkg/sql/parser/ddl_columns.go:101-115)

  2. Reuse of Existing Logic: Smart reuse of the existing parseTypeArgsString() helper for balanced parentheses parsing, maintaining consistency with the codebase

  3. Round-trip Compatibility: The approach of appending to the Type string ensures formatters can round-trip correctly without breaking existing functionality

  4. Proper Error Handling: Uses the established error patterns from the codebase (goerrors.ExpectedTokenError)

⚠️ Areas for Improvement:

Potential Bugs & Issues

  1. Case Sensitivity: The implementation uses strings.ToUpper() for comparison (ddl_columns.go:103) but should consider using the existing case-insensitive token matching patterns used elsewhere in the parser for consistency.

  2. Token Consumption Logic: The current loop structure could potentially consume unexpected tokens. Consider adding validation that the token after CODEC is actually a left parenthesis before calling parseTypeArgsString().

  3. Limited Error Context: If parseTypeArgsString() fails inside the CODEC parsing, the error context might not clearly indicate it's related to CODEC parsing specifically.

Performance Considerations

✅ Good Performance Choices:

  • Minimal overhead with dialect check first
  • Reuses existing parsing infrastructure
  • No unnecessary allocations in the happy path

⚠️ Minor Concerns:

  • String concatenation with += could be optimized with strings.Builder for very large type strings, though this is likely negligible for typical use cases

Security Concerns

✅ No Security Issues Found:

  • No user input is directly executed
  • Proper bounds checking via existing parseTypeArgsString() logic
  • Dialect gating prevents accidental parsing in non-ClickHouse contexts

Test Coverage

✅ Good Test Coverage:

  • Tests cover three important scenarios: single codec, chained codecs, and parameterized codecs (clickhouse_codec_test.go:17-31)
  • Uses the high-level gosqlx.ParseWithDialect() API for integration testing
  • Proper dialect specification with keywords.DialectClickHouse

💡 Suggestions for Enhanced Testing:

  1. Error Cases: Consider adding tests for malformed CODEC syntax
  2. Edge Cases: Test with empty CODEC CODEC() or nested parentheses
  3. Round-trip Testing: Verify that parsed and re-formatted SQL maintains the CODEC clause

Architectural Considerations

Design Decision Analysis:
The choice to append to the Type string rather than modeling as a first-class AST field is pragmatic for this initial implementation. This approach:

  • ✅ Maintains backward compatibility
  • ✅ Enables quick implementation
  • ⚠️ May require refactoring if more sophisticated CODEC handling is needed later

Minor Suggestions

  1. Documentation: Consider adding a comment explaining why CODEC is appended to the type string rather than modeled separately

  2. Consistency: The break condition could be more explicit about what terminates the CODEC parsing loop

  3. Future Extensibility: The comment mentions other ClickHouse column options (DEFAULT, MATERIALIZED, etc.) - ensure the loop structure can accommodate these in the future

Overall Assessment

This is a solid, well-implemented feature that follows GoSQLX patterns and conventions. The implementation is safe, performant, and maintains compatibility with existing code. The test coverage is appropriate for the feature scope.

Recommendation: ✅ Approve with minor suggestions

The implementation successfully addresses the requirements in issue #482 and provides a good foundation for ClickHouse CODEC support.


Job Run | Branch: feat/clickhouse-codec-482

@ajitpratap0 ajitpratap0 force-pushed the feat/clickhouse-codec-482 branch from 338261b to 907cada Compare April 8, 2026 16:56
@github-actions
Copy link
Copy Markdown

github-actions bot commented Apr 8, 2026

Claude finished @ajitpratap0's task —— View job


PR Review: ClickHouse CODEC Column Option

Todo List:

  • Read and analyze changed files
  • Review code quality and implementation approach
  • Check for potential bugs and issues
  • Evaluate performance considerations
  • Review security concerns
  • Assess test coverage adequacy
  • Provide comprehensive feedback

Code Quality & Implementation

✅ Positive Aspects:

  1. Clean Integration: The CODEC parsing is cleanly integrated into the existing parseColumnDef function with proper dialect gating (pkg/sql/parser/ddl_columns.go:101-115)

  2. Reuse of Existing Logic: Smart reuse of the existing parseTypeArgsString() helper for balanced parentheses parsing, maintaining consistency with the codebase

  3. Round-trip Compatibility: The approach of appending to the Type string ensures formatters can round-trip correctly without breaking existing functionality

  4. Proper Error Handling: Uses the established error patterns from the codebase (goerrors.ExpectedTokenError)

⚠️ Areas for Improvement:

Potential Bugs & Issues

  1. Case Sensitivity: The implementation uses strings.ToUpper() for comparison (ddl_columns.go:103) but should consider using the existing case-insensitive token matching patterns used elsewhere in the parser for consistency.

  2. Token Consumption Logic: The current loop structure could potentially consume unexpected tokens. Consider adding validation that the token after CODEC is actually a left parenthesis before calling parseTypeArgsString().

  3. Limited Error Context: If parseTypeArgsString() fails inside the CODEC parsing, the error context might not clearly indicate it's related to CODEC parsing specifically.

Performance Considerations

✅ Good Performance Choices:

  • Minimal overhead with dialect check first
  • Reuses existing parsing infrastructure
  • No unnecessary allocations in the happy path

⚠️ Minor Concerns:

  • String concatenation with += could be optimized with strings.Builder for very large type strings, though this is likely negligible for typical use cases

Security Concerns

✅ No Security Issues Found:

  • No user input is directly executed
  • Proper bounds checking via existing parseTypeArgsString() logic
  • Dialect gating prevents accidental parsing in non-ClickHouse contexts

Test Coverage

✅ Good Test Coverage:

  • Tests cover three important scenarios: single codec, chained codecs, and parameterized codecs (clickhouse_codec_test.go:17-31)
  • Uses the high-level gosqlx.ParseWithDialect() API for integration testing
  • Proper dialect specification with keywords.DialectClickHouse

💡 Suggestions for Enhanced Testing:

  1. Error Cases: Consider adding tests for malformed CODEC syntax
  2. Edge Cases: Test with empty CODEC CODEC() or nested parentheses
  3. Round-trip Testing: Verify that parsed and re-formatted SQL maintains the CODEC clause

Architectural Considerations

Design Decision Analysis:
The choice to append to the Type string rather than modeling as a first-class AST field is pragmatic for this initial implementation. This approach:

  • ✅ Maintains backward compatibility
  • ✅ Enables quick implementation
  • ⚠️ May require refactoring if more sophisticated CODEC handling is needed later

Minor Suggestions

  1. Documentation: Consider adding a comment explaining why CODEC is appended to the type string rather than modeled separately

  2. Consistency: The break condition could be more explicit about what terminates the CODEC parsing loop

  3. Future Extensibility: The comment mentions other ClickHouse column options (DEFAULT, MATERIALIZED, etc.) - ensure the loop structure can accommodate these in the future

Overall Assessment

This is a solid, well-implemented feature that follows GoSQLX patterns and conventions. The implementation is safe, performant, and maintains compatibility with existing code. The test coverage is appropriate for the feature scope.

Recommendation: ✅ Approve with minor suggestions

The implementation successfully addresses the requirements in issue #482 and provides a good foundation for ClickHouse CODEC support.


Job Run | Branch: feat/clickhouse-codec-482

@ajitpratap0 ajitpratap0 merged commit a37ed1c into main Apr 8, 2026
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant