feat: SQL transformations related to GROUP BY#209
Merged
Conversation
f6c0e22 to
2e23096
Compare
tobyhede
approved these changes
Apr 22, 2025
Contributor
|
Is it possible to make the tracing optional? Maybe use a crate feature? |
Contributor
Author
Disabling tracing of inference at compile time is definitely possible. It doesn't seem to significantly affect compilation times FWIW. As for runtime overhead, my understanding is that it's practically a no-op when there's no subscriber attached. |
5a46c38 to
0212074
Compare
Includes a version bump to the latest `sqltk` version which vendors `sqlparser` as `sqltk-parser` and includes an extra parameter in `Transform::transform` to allow contextual transformations (instead of solely by type). The following refactorings were done as part of this PR: 1. Break sole `Transform` impl into modular rules using the `TransformationRule` trait. 2. Removed `EqlFunctionTracker` - this existed to support `Expr` transformations in `ORDER BY` clauses. With contextual transformation it was no longer required. 3. Removed `TypeCell` - it did not carry its weight and made debugging really hard. The `TypeRegistry` now associates nodes to Arc<Type> using an intermediary `TypeVar`. An extra method was added to the post type checking step in the `EqlMapper` to resolve all `Value` nodes which have no concrete type after the inference step to `NativeValue`.
Debugging type checking code in VSCode is suboptimal - we really want a graphical representation of inference at every step (but also a debugger that doesn't lock up at inconvenient times would be great). To make instrumentation consistent I implemented a `#[trace_infer]` attribute macro which can be used on `InferType` impls. The `Unifier` has a single `#[tracing::instrument!(..)]` call and there is also code in `EqlMapper` to dump nodes types and type substitutions at the end of the type checking process which is useful when debugging failing inference tests. To get a trace from a test, uncomment the `init_tracing();` line. The output is *very* verbose but very useful.
New mise command: `mise rust:version` It is run during CI and can also be invoked locally. It will output something like this: ``` $ mise run rust:version [rust:version] $ echo "rustc --version = " $(rustc --version) rustc --version = rustc 1.86.0 (05f9846f8 2025-03-31) cargo --version = cargo 1.86.0 (adf9b6ad1 2025-02-28) cargo fmt --version = rustfmt 1.8.0-stable (05f9846f89 2025-03-31) cargo clippy --version = clippy 0.1.86 (05f9846f89 2025-03-31) ``` Very handy to verify that your local toolchain is the same as CI when rustfmt is failing the build but is fine locally.
Statement transformation should be performed when EQL columns are used, not solely on whethere there are EQL literals that require encryption. Because using EQL columns can reqire insertion of EQL helper functions.
The expr type could not be resolved because the transformation rule was using the wrong AST node to build a `NodeKey`.
0212074 to
5301649
Compare
The Proxy must be able to reliably detect whether a statement requires transformation because if a statement does not require transformation then the potentially expensive AST rebuilding step can be skipped. The result of type-checking is insufficient in general to tell whether a statement requires transformation unless the `TransformationRule` logic is duplicated in the Proxy - which we don't want of course. This commit extends the `TransformationRule` trait with a `would_edit` method which answers the question "would this rule change the AST if it was applied?". Additionally, a new `TranformationRule` impl `DryRunnable` wraps another rule in such a way that it can "pretend" to be performing a `Transform` (as far as `sqltk` is concerned) when really its doing a dry-run after which it will tell us if it *would* change the AST. `TypedCheckedStatement::requires_transform` is a new method that wraps up the dry-run logic and tells the called whether a statement must be transformed
5301649 to
9e4d145
Compare
tobyhede
approved these changes
Apr 28, 2025
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Includes a version bump to the latest
sqltkversion which vendorssqlparserassqltk-parserand includes an extra parameter inTransform::transformto allow contextual transformations (instead ofsolely by type).
The following refactorings were done as part of this PR:
Break sole
Transformimpl into modular rules using theTransformationRuletrait.Removed
EqlFunctionTracker- this existed to supportExprtransformations in
ORDER BYclauses. With contextual transformationit was no longer required.
Removed
TypeCell- it did not carry its weight and made debuggingreally hard. The
TypeRegistrynow associates nodes to Arcusing an intermediary
TypeVar.An extra method was added to the post type checking step in the
EqlMapperto resolve allValuenodes which have no concrete typeafter the inference step to
NativeValue.Additionally, I went through many levels of Hell trying to figure out how I
broke some existing tests so this PR also includes:
feat: tracing of
InferTypeimpls &Unifier::unifyDebugging type checking code in VSCode is suboptimal - we really want a
graphical representation of inference at every step (but also a debugger
that doesn't lock up at inconvenient times would be great).
To make instrumentation consistent I implemented a
#[trace_infer]attribute macro which can be used on
InferTypeimpls.The
Unifierhas a single#[tracing::instrument!(..)]call and thereis also code in
EqlMapperto dump nodes types and type substitutionsat the end of the type checking process which is useful when debugging
failing inference tests.
To get a trace from a test, uncomment the
init_tracing();line.The output is very verbose but very useful.
Acknowledgment
By submitting this pull request, I confirm that CipherStash can use, modify, copy, and redistribute this contribution, under the terms of CipherStash's choice.