[BugFix] insert files credential redaction#71245
Conversation
|
@codex review |
|
Codex Review: Didn't find any major issues. Keep them coming! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
@codex review |
|
Codex Review: Didn't find any major issues. Another round soon, please! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
@codex review |
|
Codex Review: Didn't find any major issues. 🎉 ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
|
@codex review |
|
Codex Review: Didn't find any major issues. Can't wait for the next one! ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
@codex review |
|
Codex Review: Didn't find any major issues. Bravo. ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
|
@codex review |
|
Codex Review: Didn't find any major issues. 🎉 ℹ️ About Codex in GitHubYour team has set up Codex to review pull requests in this repo. Reviews are triggered when you
If Codex has suggestions, it will comment; otherwise it will react with 👍. Codex can also answer questions or update the PR. Try commenting "@codex address that feedback". |
9ebc120 to
516a75c
Compare
There was a problem hiding this comment.
Pull request overview
This PR fixes credential leakage risks for FILES(...)-based SQL by ensuring sensitive properties (for example, S3 access/secret keys) are redacted before being written to logs and query/profile metadata in the FE.
Changes:
- Redact SQL strings via
SqlCredentialRedactorinStmtExecutorwhen storing SQL into profiles, query registration, and various error/retry log paths. - Add
StmtExecutor.getRedactedOriginStmtInString()(testing-visible helper) and expand unit tests to validate redaction behavior. - Add tests validating
AuditEncryptionChecker.needEncrypt()forINSERT ... FROM FILES(...)andINSERT INTO FILES(...), and expandSqlCredentialRedactorcoverage for those SQL shapes.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| fe/fe-core/src/main/java/com/starrocks/qe/StmtExecutor.java | Uses credential redaction for SQL stored in profiles, query metadata, and emitted in planner/retry/DDL/DML log paths. |
| fe/fe-core/src/test/java/com/starrocks/qe/StmtExecutorTest.java | Adds unit tests validating redaction behavior and some additional execution-path coverage. |
| fe/fe-core/src/test/java/com/starrocks/common/util/SqlCredentialRedactorTest.java | Adds tests ensuring credentials inside INSERT ... FILES(...) SQL text are redacted. |
| fe/fe-core/src/test/java/com/starrocks/sql/common/AuditEncryptionCheckerTest.java | Adds tests ensuring FILES-based inserts are flagged as needing encryption/redaction. |
Comments suppressed due to low confidence (1)
fe/fe-core/src/main/java/com/starrocks/qe/StmtExecutor.java:3875
- In addRunningQueryDetail(),
sqlis initially redacted via SqlCredentialRedactor, but whenneedEncrypt || Config.enable_sql_desensitize_in_logis true it gets overwritten with AstToSQLBuilder output. That output masks properties via PrintableMap.SENSITIVE_KEY, which is missing some keys that AuditEncryptionChecker treats as sensitive (e.g.azure.blob.oauth2_client_secret), so secrets can be reintroduced into QueryDetail/QueryDetailQueue. Consider applying SqlCredentialRedactor.redact() to the finalsqlstring (after AstToSQLBuilder) or aligning the AST formatter’s sensitive-key list with SqlCredentialRedactor/AuditEncryptionChecker.
String sql = SqlCredentialRedactor.redact(parsedStmt.getOrigStmt().originStmt);
boolean needEncrypt = AuditEncryptionChecker.needEncrypt(parsedStmt);
if (needEncrypt || Config.enable_sql_desensitize_in_log) {
sql = AstToSQLBuilder.toSQL(parsedStmt, FormatOptions.allEnable()
.setColumnSimplifyTableName(false)
.setHideCredential(needEncrypt)
.setEnableDigest(Config.enable_sql_desensitize_in_log))
.orElse("this is a desensitized sql");
4f06e47 to
70c4e3c
Compare
Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com> Made-with: Cursor
Signed-off-by: srihithg <78094568+srihithg@users.noreply.github.com> Made-with: Cursor
Signed-off-by: srihithg <78094568+srihithg@users.noreply.github.com> Made-with: Cursor
Signed-off-by: srihithg <78094568+srihithg@users.noreply.github.com> Made-with: Cursor
The earlier "stabilize" commit bumped libthrift from 0.20.0 to 0.22.0 in fe/pom.xml and thirdparty/vars.sh, but branch-3.5 (and the original PR StarRocks#71245 on main) use 0.20.0. That version mismatch breaks FE UT on the 3.5 CI since the container's thrift compiler is 0.20.0 while the runtime was pinned to 0.22.0. Restore 0.20.0 in both places to match branch-3.5. Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
Backport credential redaction for INSERT INTO FILES and related flows to branch-4.0. Keeps SqlCredentialRedactor, AuditEncryptionChecker tests, and extended redactor tests aligned with the original bugfix, and adapts StmtExecutor profile/query-detail SQL handling to use the redactor while staying source-compatible with branch-4.0 (reflection-guarded enable_sql_desensitize_in_log, no FormatOptions/digest API). Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com> Made-with: Cursor
Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com> Made-with: Cursor
Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com> Co-authored-by: kevincai <771299+kevincai@users.noreply.github.com>
Backport credential redaction for INSERT INTO FILES and related flows to branch-4.0. Keeps SqlCredentialRedactor, AuditEncryptionChecker tests, and extended redactor tests aligned with the original bugfix, and adapts StmtExecutor profile/query-detail SQL handling to use the redactor while staying source-compatible with branch-4.0 (reflection-guarded enable_sql_desensitize_in_log, no FormatOptions/digest API). Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com> Made-with: Cursor
Backport credential redaction for INSERT INTO FILES and related flows to branch-4.0. Brings the PR in line with the original StarRocks#71245 on main: - SqlCredentialRedactor + AuditEncryptionChecker tests and the extended SqlCredentialRedactorTest scenarios are identical to the upstream PR. - StmtExecutor adds the getRedactedOriginStmtInString() helper and applies redaction to all log/query-detail/profile SQL paths (planner error, explain exec plan failure, retry loop, LargeInPredicate retry, WarehouseIdleChecker.decreaseRunningSQL, registerQuery / QueryInfo for both the normal and load paths, DDL QueryStateException, DDL fail logs, DML fail log, DDL delete-handler log, crash-cancel SQL log, insert-txn failure log). - StmtExecutorTest ports all scenarios from the upstream PR except the `enable_sql_desensitize_in_log` digest-mode case, which depends on a Config field and FormatOptions.setEnableDigest API that don't exist on branch-4.0. A comment in the test documents the intentional skip. Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com> Made-with: Cursor
Backport credential redaction for INSERT INTO FILES and related flows to branch-3.5-cc, aligned with the branch-3.5 backport minus the user-authentication-specific patterns (IDENTIFIED BY / IDENTIFIED WITH / SET PASSWORD). Those patterns guard behavior introduced by StarRocks#70360, which is not present on branch-3.5-cc, so the corresponding regex constants, helpers (redactIdentifiedWithClause / redactSqlCredentialClause), and unit tests (testRedactCreateAndAlterUserCredentialClauses, testRedactSetPasswordClause) are intentionally omitted. Main source changes (StmtExecutor.java) now match branch-3.5 on every redaction call site: - getRedactedOriginStmtInString() helper. - redaction applied to planner-error, explain exec-plan failure, retry loop, LargeInPredicate retry, registerQuery/QueryInfo (both normal and load paths), DDL fail logs (delete handler + QueryStateException + generic throwable), DML fail log, crash-cancel log, and the insert throwable handler. Tests: - StmtExecutorTest ports the branch-3.5 redaction-focused scenarios (testGetRedactedOriginStmtInStringScenarios, testBuildTopLevelProfileSqlStatementScenarios). More invasive StatementPlanner/Coordinator mocking tests from StarRocks#71245 on main are skipped; the WarehouseManager.getWarehouseComputeResourceName and Config.enable_sql_desensitize_in_log APIs they depend on don't exist on branch-3.5-cc. - SqlCredentialRedactorTest mirrors branch-3.5's version minus the two user-auth test methods. - AuditEncryptionCheckerTest is the same as branch-3.5 / branch-4.1. Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com> Made-with: Cursor
Backport credential redaction for INSERT INTO FILES and related flows to branch-3.5. Compared with the upstream PR on main (StarRocks#71245), this backport: - Preserves the full set of credential regex patterns, including the user-authentication clauses (IDENTIFIED BY / IDENTIFIED WITH / SET PASSWORD) that were introduced by StarRocks#70360 and are also present on branch-3.5. - Adapts SqlCredentialRedactor to the re2j regex engine used on this branch (no backreferences), so the single-/double-quoted KEY = 'value' matching is expressed as two alternatives rather than a backref. - Applies redaction to the same StmtExecutor call sites as branch-4.1 (planner-error log, explain exec-plan failure log, retry loop, LargeInPredicate retry, registerQuery / QueryInfo for both the normal and load paths, DDL QueryStateException + delete-handler + generic-throwable logs, DML fail log, crash-cancel SQL log, insert throwable log) via a new getRedactedOriginStmtInString() helper. Tests: - SqlCredentialRedactorTest mirrors the upstream PR scenarios, including the user-auth clauses. - StmtExecutorTest ports the two redaction-focused scenarios (testGetRedactedOriginStmtInStringScenarios, testBuildTopLevelProfileSqlStatementScenarios). The remaining StatementPlanner / Coordinator / profile-digest tests from StarRocks#71245 depend on APIs (WarehouseManager.getWarehouseComputeResourceName, Config.enable_sql_desensitize_in_log, FormatOptions#setEnableDigest) that don't exist on branch-3.5 and are intentionally skipped. - AuditEncryptionCheckerTest is identical to the upstream PR. Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com> Made-with: Cursor
Backport credential redaction for INSERT INTO FILES and related flows to branch-4.0. Brings the PR in line with the original StarRocks#71245 on main: - SqlCredentialRedactor + AuditEncryptionChecker tests and the extended SqlCredentialRedactorTest scenarios are identical to the upstream PR. - StmtExecutor adds the getRedactedOriginStmtInString() helper and applies redaction to all log/query-detail/profile SQL paths (planner error, explain exec plan failure, retry loop, LargeInPredicate retry, WarehouseIdleChecker.decreaseRunningSQL, registerQuery / QueryInfo for both the normal and load paths, DDL QueryStateException, DDL fail logs, DML fail log, DDL delete-handler log, crash-cancel SQL log, insert-txn failure log). - StmtExecutorTest ports all scenarios from the upstream PR except the `enable_sql_desensitize_in_log` digest-mode case, which depends on a Config field and FormatOptions.setEnableDigest API that don't exist on branch-4.0. A comment in the test documents the intentional skip. Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com> Made-with: Cursor
Backport credential redaction for INSERT INTO FILES and related flows to branch-4.0. Brings the PR in line with the original StarRocks#71245 on main: - SqlCredentialRedactor + AuditEncryptionChecker tests and the extended SqlCredentialRedactorTest scenarios are identical to the upstream PR. - StmtExecutor adds the getRedactedOriginStmtInString() helper and applies redaction to all log/query-detail/profile SQL paths (planner error, explain exec plan failure, retry loop, LargeInPredicate retry, WarehouseIdleChecker.decreaseRunningSQL, registerQuery / QueryInfo for both the normal and load paths, DDL QueryStateException, DDL fail logs, DML fail log, DDL delete-handler log, crash-cancel SQL log, insert-txn failure log). - StmtExecutorTest ports all scenarios from the upstream PR except the `enable_sql_desensitize_in_log` digest-mode case, which depends on a Config field and FormatOptions.setEnableDigest API that don't exist on branch-4.0. A comment in the test documents the intentional skip. Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com> Made-with: Cursor
Backport credential redaction for INSERT INTO FILES and related flows to branch-3.5. Compared with the upstream PR on main (StarRocks#71245), this backport: - Preserves the full set of credential regex patterns, including the user-authentication clauses (IDENTIFIED BY / IDENTIFIED WITH / SET PASSWORD) that were introduced by StarRocks#70360 and are also present on branch-3.5. - Adapts SqlCredentialRedactor to the re2j regex engine used on this branch (no backreferences), so the single-/double-quoted KEY = 'value' matching is expressed as two alternatives rather than a backref. - Applies redaction to the same StmtExecutor call sites as branch-4.1 (planner-error log, explain exec-plan failure log, retry loop, LargeInPredicate retry, registerQuery / QueryInfo for both the normal and load paths, DDL QueryStateException + delete-handler + generic-throwable logs, DML fail log, crash-cancel SQL log, insert throwable log) via a new getRedactedOriginStmtInString() helper. Tests: - SqlCredentialRedactorTest mirrors the upstream PR scenarios, including the user-auth clauses. - StmtExecutorTest ports the two redaction-focused scenarios (testGetRedactedOriginStmtInStringScenarios, testBuildTopLevelProfileSqlStatementScenarios). The remaining StatementPlanner / Coordinator / profile-digest tests from StarRocks#71245 depend on APIs (WarehouseManager.getWarehouseComputeResourceName, Config.enable_sql_desensitize_in_log, FormatOptions#setEnableDigest) that don't exist on branch-3.5 and are intentionally skipped. - AuditEncryptionCheckerTest is identical to the upstream PR. Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com> Made-with: Cursor
Backport credential redaction for INSERT INTO FILES and related flows to branch-3.5-cc, aligned with the branch-3.5 backport minus the user-authentication-specific patterns (IDENTIFIED BY / IDENTIFIED WITH / SET PASSWORD). Those patterns guard behavior introduced by StarRocks#70360, which is not present on branch-3.5-cc, so the corresponding regex constants, helpers (redactIdentifiedWithClause / redactSqlCredentialClause), and unit tests (testRedactCreateAndAlterUserCredentialClauses, testRedactSetPasswordClause) are intentionally omitted. Main source changes (StmtExecutor.java) now match branch-3.5 on every redaction call site: - getRedactedOriginStmtInString() helper. - redaction applied to planner-error, explain exec-plan failure, retry loop, LargeInPredicate retry, registerQuery/QueryInfo (both normal and load paths), DDL fail logs (delete handler + QueryStateException + generic throwable), DML fail log, crash-cancel log, and the insert throwable handler. Tests: - StmtExecutorTest ports the branch-3.5 redaction-focused scenarios (testGetRedactedOriginStmtInStringScenarios, testBuildTopLevelProfileSqlStatementScenarios). More invasive StatementPlanner/Coordinator mocking tests from StarRocks#71245 on main are skipped; the WarehouseManager.getWarehouseComputeResourceName and Config.enable_sql_desensitize_in_log APIs they depend on don't exist on branch-3.5-cc. - SqlCredentialRedactorTest mirrors branch-3.5's version minus the two user-auth test methods. - AuditEncryptionCheckerTest is the same as branch-3.5 / branch-4.1. Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com> Made-with: Cursor



Why I'm doing:
What I'm doing:
Fixes #issue
What type of PR is this:
Does this PR entail a change in behavior?
If yes, please specify the type of change:
Checklist:
Bugfix cherry-pick branch check: