zipfile: add a structural validation feature#136891
Closed
gpshead wants to merge 1 commit into python:main
[DRAFT] I had Claude Sonnet 4 examine the zipfile module with an eye for what validation we were not doing and implement structural validation.
Status / Trustworthiness: This needs further review, both to see whether it is sufficient and to understand what is missing. It should also be run over a corpus of actual zip-ish format files to see what surprises pop up.
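As a rough illustration of the corpus run mentioned above, here is a minimal sketch that uses only what the zipfile module already ships. `survey_corpus` is a hypothetical helper name, not something from this PR, and it exercises only the existing open and CRC checks, not the proposed structural validation.

```python
import pathlib
import zipfile


def survey_corpus(root):
    """Map each file under root to "ok" or a short error description.

    Hypothetical helper for illustration; not part of zipfile or this PR.
    """
    results = {}
    for path in sorted(pathlib.Path(root).rglob("*")):
        if not path.is_file():
            continue
        try:
            with zipfile.ZipFile(path) as zf:
                # testzip() reads every member and returns the name of the
                # first one with a bad CRC or header, or None if all pass.
                bad = zf.testzip()
            results[str(path)] = "ok" if bad is None else f"bad member: {bad}"
        except Exception as exc:
            # A survey wants to record every surprise rather than stop at it.
            results[str(path)] = f"{type(exc).__name__}: {exc}"
    return results
```

Grouping the resulting error strings by exception type is one cheap way to see which kinds of malformed files show up in practice.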
Is this the right API? Unknown / undecided. It is the type of thing I was thinking of, though: opt-in via an API flag. The "strict" CRC validation option is probably the least important; the "structural" validation is more interesting.
I would remove the Enum in favor of module-level constants. If the CRC validation is kept, I might suggest these be combinable flags, rather than CRC checking existing only as a stricter mode on top of structural validation. People who want CRCs checked should be able to do that regardless. CRCs are rather poor in this day and age; most people validate data using a secure hash outside of old file formats.
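To make the combinable-flags idea concrete, here is a minimal sketch. The constant names and `check_archive` are hypothetical and come from neither the zipfile module nor this PR. The duplicate-name check is only a stand-in for whatever structural checks would actually be wanted.

```python
import zipfile

# Hypothetical module-level constants; plain ints so they combine with "|".
VALIDATE_NONE = 0
VALIDATE_STRUCTURE = 1 << 0
VALIDATE_CRC = 1 << 1


def check_archive(file, flags=VALIDATE_NONE):
    """Return a list of problem descriptions for the checks selected in flags."""
    problems = []
    with zipfile.ZipFile(file) as zf:
        if flags & VALIDATE_STRUCTURE:
            # Stand-in structural check: flag repeated entry names.
            seen = set()
            for info in zf.infolist():
                if info.filename in seen:
                    problems.append(f"duplicate entry name: {info.filename}")
                seen.add(info.filename)
        if flags & VALIDATE_CRC:
            # Existing API: returns the first bad member name, or None.
            bad = zf.testzip()
            if bad is not None:
                problems.append(f"CRC mismatch: {bad}")
    return problems
```

With independent bits, a caller can ask for `VALIDATE_CRC` alone, `VALIDATE_STRUCTURE` alone, or `VALIDATE_STRUCTURE | VALIDATE_CRC`, which a strictly ordered "strict is structural plus CRC" mode cannot express.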
Quick first-pass review thoughts: Some of this is the right theme. Some bits are silly or not enough. Good start. Wouldn't ship this as is. Likely to just close this PR; I put it up for purposes of sharing.
The kinds of issues Claude wanted to fix are common zip format footguns. It saved an analysis and plan first; I could share those as GH comments.
Author: me prompting Claude Sonnet 4
📚 Documentation preview 📚: https://cpython-previews--136891.org.readthedocs.build/