fix(server): warn and skip non-numeric filenames in consumer offset directories instead of panicking#3135
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #3135 +/- ##
=============================================
- Coverage 73.77% 51.07% -22.71%
Complexity 943 943
=============================================
Files 1200 1198 -2
Lines 109106 94848 -14258
Branches 85996 71755 -14241
=============================================
- Hits 80496 48439 -32057
- Misses 25873 43857 +17984
+ Partials 2737 2552 -185
🚀 New features to boost your workflow:
|
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need a review, please ensure CI is green and the PR is rebased on the latest master. Don't hesitate to ping the maintainers - either @core on Discord or by mentioning them directly here on the PR. Thank you for your contribution! |
|
can you make it a integration test instead of unit? code in core/server will be replaced when we introduce clustering, thus test will be lost. |
4fc2476 to
39db2e4
Compare
Good call, moved the tests from |
|
This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days if no further activity occurs. If you need a review, please ensure CI is green and the PR is rebased on the latest master. Don't hesitate to ping the maintainers - either @core on Discord or by mentioning them directly here on the PR. Thank you for your contribution! |
|
/ready |
hubcio
left a comment
There was a problem hiding this comment.
core idea is right - replacing the panic! on non-numeric filenames with warn + skip is the correct direction. but as posted this is a partial fix: into_string().unwrap() and dir_entry.unwrap() in both functions keep the exact startup-panic-on-stray-file behavior #3132 targets, and the regression tests only feed inputs the patch already handles (.DS_Store, backup.bak - all valid utf-8), so they confirm the patch runs but don't prove the bug class is gone. inline comments below.
/author
|
/author |
…irectories instead of panicking
5ef2977 to
3a6493b
Compare
|
/ready |
Which issue does this PR close?
Closes #3132
Rationale
An operator accidentally placing a file (e.g.
.DS_Store, editor swap file, backup) in the consumer offsets directory crashes the server on startup due to apanic!on non-numeric filenames.What changed?
Both
load_consumer_offsetsandload_consumer_group_offsetspanicked on stray files in the consumer offset directories. Several panic and abort paths existed:into_string().unwrap()on non-UTF-8 filenames,dir_entry.unwrap()on per-entry IO errors,metadata.is_err() -> breaksilently dropping remaining valid offsets, the non-numeric filename panic, andread_exact/File::openvia?aborting the entire load on a single truncated or unreadable file.All panic and abort paths in both functions are now replaced with
warn!+continue, so one bad file never takes down the rest of the load. Note: a genuinely corrupt or renamed offset file that should be numeric is now silently skipped (with a warning), meaning that consumer loses its committed offset and will re-consume on restart. Thewarn!gives observability so operators can detect this.Local Execution
AI Usage
cargo test,cargo clippy,cargo fmt --check, and CI scripts