Skip to content

repository: add max_size byte limit to PackWriter, OR-ed with max_count#9821

Merged
ThomasWaldmann merged 3 commits into
borgbackup:masterfrom
mr-raj12:pack-files-max-size
Jun 27, 2026
Merged

repository: add max_size byte limit to PackWriter, OR-ed with max_count#9821
ThomasWaldmann merged 3 commits into
borgbackup:masterfrom
mr-raj12:pack-files-max-size

Conversation

@mr-raj12

@mr-raj12 mr-raj12 commented Jun 27, 2026

Copy link
Copy Markdown
Contributor

What

PackWriter previously decided a pack was full by object count alone (max_count). This adds an optional max_size limit so a pack can also be closed once it reaches a target byte size. A flush happens as soon as either the count or the size limit is reached, whichever comes first.

max_size defaults to None, which leaves the size check off and keeps the current count-only behaviour. A caller opts in by passing a byte limit.

Why

Object count says nothing about the resulting file size: the same number of chunks gives a tiny pack for small chunks and a very large one for big chunks. Limiting by byte size gives direct control over how large packs grow on disk, which is the next step toward size-controlled packs.

Also in this PR

  • The default max_count is raised to 3, and the explicit max_count=2 at the Repository.open() call site is removed so the repository relies on PackWriter's own default. Running with a higher default exposed tests that only passed because packs happened to tile evenly at a count of 2.
  • test_list now flushes its last partial pack before listing, instead of depending on the object count dividing evenly into whole packs.

Tests

  • test_pack_writer_flushes_on_max_size: confirms the size limit triggers a flush, with the count kept high so only size can fire.
  • test_pack_writer_max_size_none_is_count_only: confirms max_size=None leaves the size check off and count remains the only trigger.
  • Full suite passes at the new default (1750 passed, 97 skipped).

max_size=None (default) keeps count-only behaviour. Also bump default
max_count 1->3 and flush test_list's last partial pack explicitly.
@mr-raj12

mr-raj12 commented Jun 27, 2026

Copy link
Copy Markdown
Contributor Author

On the default max_count bump to 3: test (test_list) is now fixed with an explicit flush. The suite passes at either value.

can keep it at 3 as a temporary diagnostic, or revert to 2 before merge, whichever you prefer.

@codecov

codecov Bot commented Jun 27, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.95%. Comparing base (106dfba) to head (0e20258).
⚠️ Report is 10 commits behind head on master.
✅ All tests successful. No failed tests found.

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #9821   +/-   ##
=======================================
  Coverage   84.94%   84.95%           
=======================================
  Files          92       92           
  Lines       15291    15297    +6     
  Branches     2296     2297    +1     
=======================================
+ Hits        12989    12995    +6     
- Misses       1611     1612    +1     
+ Partials      691      690    -1     

☔ View full report in Codecov by Harness.
📢 Have feedback on the report? Share it here.

@mr-raj12 mr-raj12 marked this pull request as ready for review June 27, 2026 05:02

@ThomasWaldmann ThomasWaldmann left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guess the new feature works, good.

But, as a general comment (not limited to this PR):

I am not happy with the quality of code comments and doc strings.

If the AI is doing a bad job, it is YOUR job to find that and improve that before submitting.

Comments are talking about irrelevant stuff (e.g. things we do NOT do), hacks of the past (things that were temporary hacks and we stopped doing), they are too repetitive, talking about what a function CALLER does instead of clearly saying what the function does, stating the obvious (like that a pack starts empty), ...

Comment thread src/borg/repository.py Outdated
Comment thread src/borg/repository.py Outdated
Comment thread src/borg/repository.py Outdated
Comment thread src/borg/repository.py Outdated
Comment thread src/borg/repository.py Outdated
@ThomasWaldmann

Copy link
Copy Markdown
Member

max_count: keep it at 3 for now.

Comment thread src/borg/repository.py Outdated
Comment thread src/borg/repository.py
Comment thread src/borg/repository.py Outdated
@mr-raj12 mr-raj12 force-pushed the pack-files-max-size branch from 113cb38 to 0e20258 Compare June 27, 2026 22:04

@ThomasWaldmann ThomasWaldmann left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm.

@ThomasWaldmann ThomasWaldmann merged commit 8e6d178 into borgbackup:master Jun 27, 2026
19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants