[Enhancement] Skip predicate column vacuuming when TTL is negative or usage is empty#71290

Merged: stephen-shelby merged 5 commits into StarRocks:main from seokyun-ha-toss:disable-predicate-column-vacuum on Apr 16, 2026
Conversation

@seokyun-ha-toss
Contributor

@seokyun-ha-toss seokyun-ha-toss commented Apr 4, 2026

Why I'm doing:

I'm using StarRocks with enable_predicate_columns_collection=false and have truncated the _statistics_.predicate_columns table. However, the vacuum process with TTL is still running and generating useless audit logs. I want to suppress predicate column vacuuming by disabling predicate column TTL.

What I'm doing:

When statistic_predicate_columns_ttl_hours is set to a negative value (e.g. -1), skip vacuuming entirely.

Additionally, skip the in-memory removal pass over id2columnUsage when the map is already empty:

    public void vacuum() {
        long ttlHour = Config.statistic_predicate_columns_ttl_hours;
        if (ttlHour < 0) { // HERE
            return;
        }
        LocalDateTime ttlTime = TimeUtils.getSystemNow().minusHours(ttlHour);
        Predicate<ColumnUsage> outdated = x -> x.getLastUsed().isBefore(ttlTime);

        long before = id2columnUsage.size();
        if (before > 0 && id2columnUsage.values().removeIf(outdated)) { // HERE
            long after = id2columnUsage.size();
            LOG.info("removed {} objects from predicate columns because of ttl {}", before - after,
                    Config.statistic_predicate_columns_ttl_hours);
        }

        // If the process crashed before vacuum the storage, the storage may be different from in-memory state,
        // but it doesn't matter. Because we will remove them finally.
        getStorage().vacuum(ttlTime);
    }

The documentation has also been updated to reflect these changes.
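The two guards in the method above can be tried in isolation with a minimal, self-contained sketch. Everything here is a hypothetical stand-in: `VacuumSketch`, `record`, and the plain `Map<Long, LocalDateTime>` are illustrative names and do not come from StarRocks, and the persistent-storage vacuum step is left out.

```java
import java.time.LocalDateTime;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the in-memory part of the vacuum logic; not StarRocks code. */
class VacuumSketch {
    // Maps a made-up column id to the time the column was last used in a predicate.
    private final Map<Long, LocalDateTime> lastUsedById = new HashMap<>();

    void record(long columnId, LocalDateTime lastUsed) {
        lastUsedById.put(columnId, lastUsed);
    }

    int size() {
        return lastUsedById.size();
    }

    /**
     * Removes entries last used before (now - ttlHours).
     * Returns the number of removed entries, or -1 when vacuuming is skipped.
     */
    long vacuum(long ttlHours, LocalDateTime now) {
        if (ttlHours < 0) {
            return -1; // negative TTL: vacuuming is disabled, nothing is touched
        }
        LocalDateTime cutoff = now.minusHours(ttlHours);
        long before = lastUsedById.size();
        // The size check short-circuits, so removeIf is never called on an empty map.
        if (before > 0 && lastUsedById.values().removeIf(t -> t.isBefore(cutoff))) {
            return before - lastUsedById.size();
        }
        return 0;
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.of(2026, 4, 4, 12, 0);
        VacuumSketch sketch = new VacuumSketch();
        sketch.record(1L, now.minusHours(48)); // older than a 24-hour TTL
        sketch.record(2L, now.minusHours(1));  // still fresh

        System.out.println(sketch.vacuum(-1, now)); // -1 (skipped)
        System.out.println(sketch.vacuum(24, now)); // 1  (one stale entry removed)
        System.out.println(sketch.size());          // 1
    }
}
```

Returning -1 for the skipped case is only a convenience for this sketch; the method in the PR returns `void`.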

Fixes #71291

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
    • This pr needs auto generate documentation
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 4.1
    • 4.0
    • 3.5
    • 3.4

@seokyun-ha-toss seokyun-ha-toss requested a review from a team as a code owner April 4, 2026 08:02
@github-actions github-actions Bot added the behavior_changed, title needs [type], and documentation (Improvements or additions to documentation) labels Apr 4, 2026
@seokyun-ha-toss seokyun-ha-toss changed the title from "Disable predicate column vacuum" to "[improve] Skip predicate column vacuuming when TTL is negative or usage is empty" Apr 4, 2026
@seokyun-ha-toss seokyun-ha-toss changed the title from "[improve] Skip predicate column vacuuming when TTL is negative or usage is empty" to "[Enhancement] Skip predicate column vacuuming when TTL is negative or usage is empty" Apr 4, 2026
@CelerData-Reviewer

@codex review

@chatgpt-codex-connector

Codex Review: Didn't find any major issues. Nice work!

stephen-shelby previously approved these changes Apr 11, 2026
@stephen-shelby stephen-shelby enabled auto-merge (squash) April 11, 2026 13:26
@seokyun-ha-toss
Contributor Author

Thanks for reviewing, @stephen-shelby. However, CI reports that the added line is not covered by any tests.

Should I add a unit test for that line? Please let me know what you think. Thanks!

Contributor

Copilot AI left a comment

Pull request overview

This PR adds a configuration-driven way to suppress predicate column TTL vacuuming (to avoid unnecessary cleanup work/logs when predicate column collection is disabled or the system table is empty), and updates documentation accordingly.

Changes:

  • Skip predicate column vacuum entirely when statistic_predicate_columns_ttl_hours is negative.
  • Avoid running in-memory removeIf when the usage map is already empty.
  • Document the new “negative TTL disables vacuum” behavior in Config comments and EN/JA/ZH optimizer docs.
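One reason an explicit guard is useful: `LocalDateTime.minusHours` accepts negative values and then moves the time forward. The sketch below assumes the cutoff is computed as "now minus TTL hours", as in the `vacuum()` snippet in the PR description. `NegativeTtlCutoff` and `cutoff` are hypothetical names used only for illustration.

```java
import java.time.LocalDateTime;

/** Hypothetical illustration of how a negative TTL shifts the cutoff; not StarRocks code. */
class NegativeTtlCutoff {
    /** Computes the cutoff as "now minus the TTL in hours". */
    static LocalDateTime cutoff(LocalDateTime now, long ttlHours) {
        return now.minusHours(ttlHours);
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.of(2026, 4, 4, 12, 0);
        LocalDateTime lastUsed = now.minusMinutes(5); // a column used five minutes ago

        System.out.println(cutoff(now, 24)); // 2026-04-03T12:00, one day in the past
        System.out.println(cutoff(now, -1)); // 2026-04-04T13:00, one hour in the future

        // With a future cutoff, even a freshly used entry counts as outdated.
        System.out.println(lastUsed.isBefore(cutoff(now, -1))); // true
    }
}
```

Under that assumption, a negative TTL without the early return would put the cutoff in the future and make every entry eligible for removal, so returning early keeps "negative" meaning "disabled".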

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| fe/fe-core/src/main/java/com/starrocks/statistic/columns/PredicateColumnsMgr.java | Adds early return for negative TTL and avoids removal pass when the map is empty. |
| fe/fe-core/src/main/java/com/starrocks/common/Config.java | Updates config comment to document that negative TTL disables vacuum. |
| docs/zh/using_starrocks/Cost_based_optimizer.md | Documents negative TTL disabling vacuum (ZH). |
| docs/ja/using_starrocks/Cost_based_optimizer.md | Documents negative TTL disabling vacuum (JA). |
| docs/en/using_starrocks/Cost_based_optimizer.md | Documents negative TTL disabling vacuum (EN). |

Comment thread fe/fe-core/src/main/java/com/starrocks/common/Config.java
auto-merge was automatically disabled April 15, 2026 02:38

Head branch was pushed to by a user without write access

@stdpain stdpain self-assigned this Apr 15, 2026
@seokyun-ha-toss
Contributor Author

Hello @stephen-shelby, @kevincai, and @stdpain, I added a unit test in 8152cc1. Thanks!

Signed-off-by: seokyun.ha <seokyun.ha@toss.im>
Signed-off-by: seokyun.ha <seokyun.ha@toss.im>
…columns vacuumed

Signed-off-by: seokyun.ha <seokyun.ha@toss.im>
@seokyun-ha-toss seokyun-ha-toss force-pushed the disable-predicate-column-vacuum branch from 8152cc1 to 04e5a9a April 15, 2026 06:25
@seokyun-ha-toss
Contributor Author

I rebased and force-pushed to pass the DCO check, using:

git rebase HEAD~3 --signoff
git push --force-with-lease origin disable-predicate-column-vacuum

Thanks!

Signed-off-by: 絵空事スピリット <richard.wang@celerdata.com>
Signed-off-by: 絵空事スピリット <richard.wang@celerdata.com>
@github-actions github-actions Bot added the 4.1 label Apr 15, 2026
@github-actions
Contributor

github-actions Bot commented Apr 15, 2026

🌎 Translation Required?

All translation files are up to date.
Great job! No translation actions are required for this PR.

🕒 Last updated: Wed, 15 Apr 2026 09:21:17 GMT

@github-actions
Contributor

[Java-Extensions Incremental Coverage Report]

pass : 0 / 0 (0%)

@github-actions
Contributor

[FE Incremental Coverage Report]

pass : 3 / 3 (100.00%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
| --- | --- | --- | --- | --- |
| 🔵 com/starrocks/statistic/columns/PredicateColumnsMgr.java | 3 | 3 | 100.00% | [] |

@github-actions
Contributor

[BE Incremental Coverage Report]

pass : 0 / 0 (0%)

@stephen-shelby stephen-shelby merged commit e3510e9 into StarRocks:main Apr 16, 2026
69 checks passed
@github-actions
Contributor

@Mergifyio backport branch-4.1

@github-actions github-actions Bot removed the 4.1 label Apr 16, 2026
@mergify
Contributor

mergify Bot commented Apr 16, 2026

backport branch-4.1

✅ Backports have been created

mergify Bot pushed a commit that referenced this pull request Apr 16, 2026
… usage is empty (#71290)

Signed-off-by: seokyun.ha <seokyun.ha@toss.im>
Signed-off-by: 絵空事スピリット <richard.wang@celerdata.com>
Co-authored-by: 絵空事スピリット <richard.wang@celerdata.com>
(cherry picked from commit e3510e9)
Copilot AI pushed a commit that referenced this pull request Apr 21, 2026
… usage is empty (#71290)

Signed-off-by: seokyun.ha <seokyun.ha@toss.im>
Signed-off-by: 絵空事スピリット <richard.wang@celerdata.com>
Co-authored-by: 絵空事スピリット <richard.wang@celerdata.com>
Co-authored-by: kevincai <771299+kevincai@users.noreply.github.com>
robd003 pushed a commit to robd003/starrocks that referenced this pull request May 6, 2026
… usage is empty (StarRocks#71290)

Signed-off-by: seokyun.ha <seokyun.ha@toss.im>
Signed-off-by: 絵空事スピリット <richard.wang@celerdata.com>
Co-authored-by: 絵空事スピリット <richard.wang@celerdata.com>
wanpengfei-git pushed a commit that referenced this pull request May 9, 2026
… usage is empty (backport #71290) (#71777)

Signed-off-by: seokyun.ha <seokyun.ha@toss.im>
Signed-off-by: 絵空事スピリット <richard.wang@celerdata.com>
Co-authored-by: seokyun.ha <seokyun.ha@toss.im>
Co-authored-by: 絵空事スピリット <richard.wang@celerdata.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
Labels

4.1-merged, documentation (Improvements or additions to documentation)

Development

Successfully merging this pull request may close these issues.

Skip predicate column vacuuming when TTL is negative or usage is empty

7 participants