[BugFix] fix information_schema.tables not escaping special characters in equality predicates #71273
Signed-off-by: dontknow9179 <clin56322@gmail.com>
Force-pushed from 1422338 to aa65d6f.
[Java-Extensions Incremental Coverage Report] ✅ pass: 0 / 0 (0%)
[FE Incremental Coverage Report] ✅ pass: 29 / 29 (100.00%)
[BE Incremental Coverage Report] ✅ pass: 0 / 0 (0%)
@Mergifyio backport branch-4.0
@Mergifyio backport branch-4.1
@Mergifyio backport branch-3.5
✅ Backports have been created
Cherry-pick of 1224bae has failed. To fix up this pull request, you can check it out locally. See documentation: https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/reviewing-changes-in-pull-requests/checking-out-pull-requests-locally
…s in equality predicates (#71273)

Signed-off-by: dontknow9179 <clin56322@gmail.com>
(cherry picked from commit 1224bae)

# Conflicts:
# fe/fe-core/src/main/java/com/starrocks/catalog/system/information/TablesSystemTable.java
# fe/fe-core/src/main/java/com/starrocks/service/InformationSchemaDataSource.java
# fe/fe-core/src/main/java/com/starrocks/sql/plan/PlanFragmentBuilder.java
…s in equality predicates (backport #71273) (#71407)

## Why I'm doing:

When querying `information_schema.tables` with an equality predicate like:

```sql
SELECT * FROM information_schema.tables WHERE table_schema = 'my_db' AND table_name = 'a_a';
```

The value `a_a` was passed directly to `PatternMatcher.createMysqlPattern()` as a LIKE pattern, where `_` is treated as a single-character wildcard. This caused the query to return unrelated tables like `aba`, `a1a`, etc., instead of only the exact table `a_a`.

Similarly, table names containing backslashes (e.g. `a\_a`) were not handled correctly: the backslash and underscore interacted with LIKE pattern escaping rules in unexpected ways.

Fixes #67447

## What I'm doing

Fix `information_schema.tables` queries with equality predicates (`=`) on `table_name` or `table_schema` incorrectly treating LIKE-special characters (`_`, `%`, `\`) as wildcards instead of literal characters.

### Changes

1. **`PatternMatcher.convertMysqlPattern`**: Rewrite the two-pass conversion to a single-pass approach that correctly escapes regex metacharacters (e.g. `(`, `)`, `+`, `[`, `]`) in table/database names, and properly handles a trailing backslash as a literal character.
2. **`PatternMatcher.escapeLikeValue`**: Add a new utility method that escapes a literal string so it can be safely used as a MySQL LIKE pattern for exact matching (prefixing `\`, `%`, and `_` with a backslash).
3. **`TablesSystemTable.evaluate`**: Call `escapeLikeValue()` on the value extracted from `=` predicates before passing it to the Thrift request. This is the **primary fix**: the `SchemaTableEvaluateRule` optimizer rule short-circuits simple equality queries on `information_schema.tables` and evaluates them directly in the FE without going through `PlanFragmentBuilder.visitPhysicalSchemaScan`.
4. **`PlanFragmentBuilder.visitPhysicalSchemaScan`**: Also call `escapeLikeValue()` for `=` predicates on `TABLE_NAME`/`TABLE_SCHEMA`/`DATABASE_NAME` in the SchemaScan path, which is used when `SchemaTableEvaluateRule` is disabled or when the predicate is not a simple equality.
5. **`InformationSchemaDataSource.generateTablesInfoResponse`**: Remove the fallback OR-logic in `matchPattern` and use `matcher.match()` directly, preventing false-positive matches when the pattern matcher rejects a name but the raw string comparison accidentally matches.

Signed-off-by: dontknow9179 <clin56322@gmail.com>
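
To make the interaction between changes 1 and 2 concrete, here is a minimal Java sketch. It is not the code from this PR. The class name `LikeEscapeSketch`, the helpers `likeToRegex` and `likeMatches`, and all method bodies are assumptions for illustration; only the name `escapeLikeValue` and its described behavior (prefixing `\`, `%`, and `_` with a backslash) come from the description above. Case-insensitive matching is left out for brevity.

```java
import java.util.regex.Pattern;

// Hypothetical sketch: names and bodies are assumptions, not the StarRocks implementation.
final class LikeEscapeSketch {

    // Escape a literal so that, when used as a MySQL LIKE pattern, it matches only itself.
    static String escapeLikeValue(String literal) {
        StringBuilder sb = new StringBuilder(literal.length() + 4);
        for (int i = 0; i < literal.length(); i++) {
            char c = literal.charAt(i);
            if (c == '\\' || c == '%' || c == '_') {
                sb.append('\\'); // prefix LIKE-special characters with a backslash
            }
            sb.append(c);
        }
        return sb.toString();
    }

    // Single-pass conversion of a MySQL LIKE pattern into a java.util.regex pattern string.
    static String likeToRegex(String like) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < like.length(); i++) {
            char c = like.charAt(i);
            if (c == '\\') {
                // An escaped character is taken literally; a trailing backslash
                // has nothing to escape and is treated as a literal backslash.
                char literal = (i + 1 < like.length()) ? like.charAt(++i) : '\\';
                sb.append(Pattern.quote(String.valueOf(literal)));
            } else if (c == '%') {
                sb.append(".*"); // any sequence of characters
            } else if (c == '_') {
                sb.append('.');  // exactly one character
            } else {
                // Quote everything else so regex metacharacters such as ( ) + [ ] stay literal.
                sb.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return sb.toString();
    }

    // Match a candidate name against a LIKE pattern (case-sensitive in this sketch).
    static boolean likeMatches(String like, String candidate) {
        return Pattern.compile(likeToRegex(like), Pattern.DOTALL).matcher(candidate).matches();
    }

    public static void main(String[] args) {
        // Raw value used as a pattern: '_' acts as a wildcard, so "aba" matches.
        System.out.println(likeMatches("a_a", "aba"));                  // true
        // Escaped value: only the exact name matches.
        System.out.println(likeMatches(escapeLikeValue("a_a"), "aba")); // false
        System.out.println(likeMatches(escapeLikeValue("a_a"), "a_a")); // true
    }
}
```

In this sketch the equality path would call `escapeLikeValue` first and then reuse the same LIKE matcher, so `=` and `LIKE` predicates share one code path without `=` inheriting wildcard semantics.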