Skip to content

[BugFix] insert files credential redaction#71245

Merged
trueeyu merged 22 commits into
StarRocks:mainfrom
srihithg:bugfix/insert-files-credential-redaction
Apr 17, 2026
Merged

[BugFix] insert files credential redaction#71245
trueeyu merged 22 commits into
StarRocks:mainfrom
srihithg:bugfix/insert-files-credential-redaction

Conversation

@srihithg
Copy link
Copy Markdown
Contributor

@srihithg srihithg commented Apr 2, 2026

Why I'm doing:

What I'm doing:

Fixes #issue

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
    • This pr needs auto generate documentation
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 4.1
    • 4.0
    • 3.5
    • 3.4

@srihithg srihithg changed the title Bugfix/insert files credential redaction [Bugfix] insert files credential redaction Apr 2, 2026
@srihithg srihithg changed the title [Bugfix] insert files credential redaction [BugFix] insert files credential redaction Apr 2, 2026
@CelerData-Reviewer
Copy link
Copy Markdown

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Keep them coming!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@CelerData-Reviewer
Copy link
Copy Markdown

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Another round soon, please!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@CelerData-Reviewer
Copy link
Copy Markdown

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. 🎉

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented Apr 3, 2026

@CelerData-Reviewer
Copy link
Copy Markdown

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Can't wait for the next one!

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@CelerData-Reviewer
Copy link
Copy Markdown

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. Bravo.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@CelerData-Reviewer
Copy link
Copy Markdown

@codex review

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. 🎉

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@srihithg srihithg force-pushed the bugfix/insert-files-credential-redaction branch from 9ebc120 to 516a75c Compare April 6, 2026 21:43
@kevincai kevincai requested review from Copilot and kevincai April 7, 2026 06:59
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR fixes credential leakage risks for FILES(...)-based SQL by ensuring sensitive properties (for example, S3 access/secret keys) are redacted before being written to logs and query/profile metadata in the FE.

Changes:

  • Redact SQL strings via SqlCredentialRedactor in StmtExecutor when storing SQL into profiles, query registration, and various error/retry log paths.
  • Add StmtExecutor.getRedactedOriginStmtInString() (testing-visible helper) and expand unit tests to validate redaction behavior.
  • Add tests validating AuditEncryptionChecker.needEncrypt() for INSERT ... FROM FILES(...) and INSERT INTO FILES(...), and expand SqlCredentialRedactor coverage for those SQL shapes.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

File Description
fe/fe-core/src/main/java/com/starrocks/qe/StmtExecutor.java Uses credential redaction for SQL stored in profiles, query metadata, and emitted in planner/retry/DDL/DML log paths.
fe/fe-core/src/test/java/com/starrocks/qe/StmtExecutorTest.java Adds unit tests validating redaction behavior and some additional execution-path coverage.
fe/fe-core/src/test/java/com/starrocks/common/util/SqlCredentialRedactorTest.java Adds tests ensuring credentials inside INSERT ... FILES(...) SQL text are redacted.
fe/fe-core/src/test/java/com/starrocks/sql/common/AuditEncryptionCheckerTest.java Adds tests ensuring FILES-based inserts are flagged as needing encryption/redaction.
Comments suppressed due to low confidence (1)

fe/fe-core/src/main/java/com/starrocks/qe/StmtExecutor.java:3875

  • In addRunningQueryDetail(), sql is initially redacted via SqlCredentialRedactor, but when needEncrypt || Config.enable_sql_desensitize_in_log is true it gets overwritten with AstToSQLBuilder output. That output masks properties via PrintableMap.SENSITIVE_KEY, which is missing some keys that AuditEncryptionChecker treats as sensitive (e.g. azure.blob.oauth2_client_secret), so secrets can be reintroduced into QueryDetail/QueryDetailQueue. Consider applying SqlCredentialRedactor.redact() to the final sql string (after AstToSQLBuilder) or aligning the AST formatter’s sensitive-key list with SqlCredentialRedactor/AuditEncryptionChecker.
            String sql = SqlCredentialRedactor.redact(parsedStmt.getOrigStmt().originStmt);
            boolean needEncrypt = AuditEncryptionChecker.needEncrypt(parsedStmt);
            if (needEncrypt || Config.enable_sql_desensitize_in_log) {
                sql = AstToSQLBuilder.toSQL(parsedStmt, FormatOptions.allEnable()
                                .setColumnSimplifyTableName(false)
                                .setHideCredential(needEncrypt)
                                .setEnableDigest(Config.enable_sql_desensitize_in_log))
                        .orElse("this is a desensitized sql");

Comment thread fe/fe-core/src/main/java/com/starrocks/qe/StmtExecutor.java
Comment thread fe/fe-core/src/test/java/com/starrocks/common/util/SqlCredentialRedactorTest.java Outdated
Comment thread fe/fe-core/src/test/java/com/starrocks/qe/StmtExecutorTest.java
@srihithg srihithg force-pushed the bugfix/insert-files-credential-redaction branch 2 times, most recently from 4f06e47 to 70c4e3c Compare April 8, 2026 21:23
srihithg added a commit to srihithg/starrocks that referenced this pull request Apr 17, 2026
Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
srihithg added a commit to srihithg/starrocks that referenced this pull request Apr 17, 2026
Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
Made-with: Cursor
srihithg added a commit to srihithg/starrocks that referenced this pull request Apr 17, 2026
Signed-off-by: srihithg <78094568+srihithg@users.noreply.github.com>
Made-with: Cursor
srihithg added a commit to srihithg/starrocks that referenced this pull request Apr 17, 2026
Signed-off-by: srihithg <78094568+srihithg@users.noreply.github.com>
Made-with: Cursor
srihithg added a commit to srihithg/starrocks that referenced this pull request Apr 17, 2026
Signed-off-by: srihithg <78094568+srihithg@users.noreply.github.com>
Made-with: Cursor
srihithg added a commit to srihithg/starrocks that referenced this pull request Apr 17, 2026
The earlier "stabilize" commit bumped libthrift from 0.20.0 to 0.22.0 in
fe/pom.xml and thirdparty/vars.sh, but branch-3.5 (and the original PR
StarRocks#71245 on main) use 0.20.0. That version mismatch breaks FE UT on the
3.5 CI since the container's thrift compiler is 0.20.0 while the runtime
was pinned to 0.22.0. Restore 0.20.0 in both places to match branch-3.5.

Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
srihithg added a commit to srihithg/starrocks that referenced this pull request Apr 17, 2026
Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
srihithg added a commit to srihithg/starrocks that referenced this pull request Apr 20, 2026
Backport credential redaction for INSERT INTO FILES and related flows to branch-4.0. Keeps SqlCredentialRedactor, AuditEncryptionChecker tests, and extended redactor tests aligned with the original bugfix, and adapts StmtExecutor profile/query-detail SQL handling to use the redactor while staying source-compatible with branch-4.0 (reflection-guarded enable_sql_desensitize_in_log, no FormatOptions/digest API).

Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
Made-with: Cursor
srihithg added a commit to srihithg/starrocks that referenced this pull request Apr 20, 2026
Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
Made-with: Cursor
kevincai pushed a commit that referenced this pull request Apr 20, 2026
Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
Copilot AI pushed a commit that referenced this pull request Apr 21, 2026
Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
Co-authored-by: kevincai <771299+kevincai@users.noreply.github.com>
srihithg added a commit to srihithg/starrocks that referenced this pull request Apr 21, 2026
Backport credential redaction for INSERT INTO FILES and related flows to branch-4.0. Keeps SqlCredentialRedactor, AuditEncryptionChecker tests, and extended redactor tests aligned with the original bugfix, and adapts StmtExecutor profile/query-detail SQL handling to use the redactor while staying source-compatible with branch-4.0 (reflection-guarded enable_sql_desensitize_in_log, no FormatOptions/digest API).

Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
Made-with: Cursor
srihithg added a commit to srihithg/starrocks that referenced this pull request Apr 22, 2026
Backport credential redaction for INSERT INTO FILES and related flows
to branch-4.0. Brings the PR in line with the original StarRocks#71245 on main:

- SqlCredentialRedactor + AuditEncryptionChecker tests and the extended
  SqlCredentialRedactorTest scenarios are identical to the upstream PR.
- StmtExecutor adds the getRedactedOriginStmtInString() helper and
  applies redaction to all log/query-detail/profile SQL paths
  (planner error, explain exec plan failure, retry loop, LargeInPredicate
  retry, WarehouseIdleChecker.decreaseRunningSQL, registerQuery /
  QueryInfo for both the normal and load paths, DDL QueryStateException,
  DDL fail logs, DML fail log, DDL delete-handler log, crash-cancel SQL
  log, insert-txn failure log).
- StmtExecutorTest ports all scenarios from the upstream PR except the
  `enable_sql_desensitize_in_log` digest-mode case, which depends on a
  Config field and FormatOptions.setEnableDigest API that don't exist
  on branch-4.0. A comment in the test documents the intentional skip.

Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
Made-with: Cursor
srihithg added a commit to srihithg/starrocks that referenced this pull request Apr 22, 2026
Backport credential redaction for INSERT INTO FILES and related flows
to branch-3.5-cc, aligned with the branch-3.5 backport minus the
user-authentication-specific patterns (IDENTIFIED BY / IDENTIFIED WITH /
SET PASSWORD). Those patterns guard behavior introduced by StarRocks#70360, which
is not present on branch-3.5-cc, so the corresponding regex constants,
helpers (redactIdentifiedWithClause / redactSqlCredentialClause), and
unit tests (testRedactCreateAndAlterUserCredentialClauses,
testRedactSetPasswordClause) are intentionally omitted.

Main source changes (StmtExecutor.java) now match branch-3.5 on every
redaction call site:
- getRedactedOriginStmtInString() helper.
- redaction applied to planner-error, explain exec-plan failure, retry
  loop, LargeInPredicate retry, registerQuery/QueryInfo (both normal and
  load paths), DDL fail logs (delete handler + QueryStateException +
  generic throwable), DML fail log, crash-cancel log, and the insert
  throwable handler.

Tests:
- StmtExecutorTest ports the branch-3.5 redaction-focused scenarios
  (testGetRedactedOriginStmtInStringScenarios,
  testBuildTopLevelProfileSqlStatementScenarios). More invasive
  StatementPlanner/Coordinator mocking tests from StarRocks#71245 on main are
  skipped; the WarehouseManager.getWarehouseComputeResourceName and
  Config.enable_sql_desensitize_in_log APIs they depend on don't exist
  on branch-3.5-cc.
- SqlCredentialRedactorTest mirrors branch-3.5's version minus the two
  user-auth test methods.
- AuditEncryptionCheckerTest is the same as branch-3.5 / branch-4.1.

Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
Made-with: Cursor
srihithg added a commit to srihithg/starrocks that referenced this pull request Apr 22, 2026
Backport credential redaction for INSERT INTO FILES and related flows
to branch-3.5.

Compared with the upstream PR on main (StarRocks#71245), this backport:

- Preserves the full set of credential regex patterns, including the
  user-authentication clauses (IDENTIFIED BY / IDENTIFIED WITH /
  SET PASSWORD) that were introduced by StarRocks#70360 and are also present on
  branch-3.5.
- Adapts SqlCredentialRedactor to the re2j regex engine used on this
  branch (no backreferences), so the single-/double-quoted KEY = 'value'
  matching is expressed as two alternatives rather than a backref.
- Applies redaction to the same StmtExecutor call sites as branch-4.1
  (planner-error log, explain exec-plan failure log, retry loop,
  LargeInPredicate retry, registerQuery / QueryInfo for both the
  normal and load paths, DDL QueryStateException + delete-handler +
  generic-throwable logs, DML fail log, crash-cancel SQL log, insert
  throwable log) via a new getRedactedOriginStmtInString() helper.

Tests:
- SqlCredentialRedactorTest mirrors the upstream PR scenarios,
  including the user-auth clauses.
- StmtExecutorTest ports the two redaction-focused scenarios
  (testGetRedactedOriginStmtInStringScenarios,
  testBuildTopLevelProfileSqlStatementScenarios). The remaining
  StatementPlanner / Coordinator / profile-digest tests from StarRocks#71245
  depend on APIs (WarehouseManager.getWarehouseComputeResourceName,
  Config.enable_sql_desensitize_in_log, FormatOptions#setEnableDigest)
  that don't exist on branch-3.5 and are intentionally skipped.
- AuditEncryptionCheckerTest is identical to the upstream PR.

Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
Made-with: Cursor
srihithg added a commit to srihithg/starrocks that referenced this pull request Apr 22, 2026
Backport credential redaction for INSERT INTO FILES and related flows
to branch-4.0. Brings the PR in line with the original StarRocks#71245 on main:

- SqlCredentialRedactor + AuditEncryptionChecker tests and the extended
  SqlCredentialRedactorTest scenarios are identical to the upstream PR.
- StmtExecutor adds the getRedactedOriginStmtInString() helper and
  applies redaction to all log/query-detail/profile SQL paths
  (planner error, explain exec plan failure, retry loop, LargeInPredicate
  retry, WarehouseIdleChecker.decreaseRunningSQL, registerQuery /
  QueryInfo for both the normal and load paths, DDL QueryStateException,
  DDL fail logs, DML fail log, DDL delete-handler log, crash-cancel SQL
  log, insert-txn failure log).
- StmtExecutorTest ports all scenarios from the upstream PR except the
  `enable_sql_desensitize_in_log` digest-mode case, which depends on a
  Config field and FormatOptions.setEnableDigest API that don't exist
  on branch-4.0. A comment in the test documents the intentional skip.

Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
Made-with: Cursor
srihithg added a commit to srihithg/starrocks that referenced this pull request Apr 22, 2026
Backport credential redaction for INSERT INTO FILES and related flows
to branch-4.0. Brings the PR in line with the original StarRocks#71245 on main:

- SqlCredentialRedactor + AuditEncryptionChecker tests and the extended
  SqlCredentialRedactorTest scenarios are identical to the upstream PR.
- StmtExecutor adds the getRedactedOriginStmtInString() helper and
  applies redaction to all log/query-detail/profile SQL paths
  (planner error, explain exec plan failure, retry loop, LargeInPredicate
  retry, WarehouseIdleChecker.decreaseRunningSQL, registerQuery /
  QueryInfo for both the normal and load paths, DDL QueryStateException,
  DDL fail logs, DML fail log, DDL delete-handler log, crash-cancel SQL
  log, insert-txn failure log).
- StmtExecutorTest ports all scenarios from the upstream PR except the
  `enable_sql_desensitize_in_log` digest-mode case, which depends on a
  Config field and FormatOptions.setEnableDigest API that don't exist
  on branch-4.0. A comment in the test documents the intentional skip.

Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
Made-with: Cursor
srihithg added a commit to srihithg/starrocks that referenced this pull request Apr 22, 2026
Backport credential redaction for INSERT INTO FILES and related flows
to branch-3.5.

Compared with the upstream PR on main (StarRocks#71245), this backport:

- Preserves the full set of credential regex patterns, including the
  user-authentication clauses (IDENTIFIED BY / IDENTIFIED WITH /
  SET PASSWORD) that were introduced by StarRocks#70360 and are also present on
  branch-3.5.
- Adapts SqlCredentialRedactor to the re2j regex engine used on this
  branch (no backreferences), so the single-/double-quoted KEY = 'value'
  matching is expressed as two alternatives rather than a backref.
- Applies redaction to the same StmtExecutor call sites as branch-4.1
  (planner-error log, explain exec-plan failure log, retry loop,
  LargeInPredicate retry, registerQuery / QueryInfo for both the
  normal and load paths, DDL QueryStateException + delete-handler +
  generic-throwable logs, DML fail log, crash-cancel SQL log, insert
  throwable log) via a new getRedactedOriginStmtInString() helper.

Tests:
- SqlCredentialRedactorTest mirrors the upstream PR scenarios,
  including the user-auth clauses.
- StmtExecutorTest ports the two redaction-focused scenarios
  (testGetRedactedOriginStmtInStringScenarios,
  testBuildTopLevelProfileSqlStatementScenarios). The remaining
  StatementPlanner / Coordinator / profile-digest tests from StarRocks#71245
  depend on APIs (WarehouseManager.getWarehouseComputeResourceName,
  Config.enable_sql_desensitize_in_log, FormatOptions#setEnableDigest)
  that don't exist on branch-3.5 and are intentionally skipped.
- AuditEncryptionCheckerTest is identical to the upstream PR.

Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
Made-with: Cursor
srihithg added a commit to srihithg/starrocks that referenced this pull request Apr 22, 2026
Backport credential redaction for INSERT INTO FILES and related flows
to branch-3.5-cc, aligned with the branch-3.5 backport minus the
user-authentication-specific patterns (IDENTIFIED BY / IDENTIFIED WITH /
SET PASSWORD). Those patterns guard behavior introduced by StarRocks#70360, which
is not present on branch-3.5-cc, so the corresponding regex constants,
helpers (redactIdentifiedWithClause / redactSqlCredentialClause), and
unit tests (testRedactCreateAndAlterUserCredentialClauses,
testRedactSetPasswordClause) are intentionally omitted.

Main source changes (StmtExecutor.java) now match branch-3.5 on every
redaction call site:
- getRedactedOriginStmtInString() helper.
- redaction applied to planner-error, explain exec-plan failure, retry
  loop, LargeInPredicate retry, registerQuery/QueryInfo (both normal and
  load paths), DDL fail logs (delete handler + QueryStateException +
  generic throwable), DML fail log, crash-cancel log, and the insert
  throwable handler.

Tests:
- StmtExecutorTest ports the branch-3.5 redaction-focused scenarios
  (testGetRedactedOriginStmtInStringScenarios,
  testBuildTopLevelProfileSqlStatementScenarios). More invasive
  StatementPlanner/Coordinator mocking tests from StarRocks#71245 on main are
  skipped; the WarehouseManager.getWarehouseComputeResourceName and
  Config.enable_sql_desensitize_in_log APIs they depend on don't exist
  on branch-3.5-cc.
- SqlCredentialRedactorTest mirrors branch-3.5's version minus the two
  user-auth test methods.
- AuditEncryptionCheckerTest is the same as branch-3.5 / branch-4.1.

Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
Made-with: Cursor
kevincai pushed a commit that referenced this pull request Apr 23, 2026
)

Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
kevincai pushed a commit that referenced this pull request Apr 23, 2026
)

Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
kevincai pushed a commit that referenced this pull request Apr 23, 2026
…71865)

Signed-off-by: Srihith Garlapati <srihith.garlapati@gmail.com>
trueeyu added a commit that referenced this pull request May 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants