More Robust and Faster S3 Deletion#1462
Conversation
…S3 deletion more robust
There was a problem hiding this comment.
Code Review
This pull request refactors the rmtree utility to use the native filesystem delete method and expands the S3 error handler to include closed connection errors. Feedback suggests wrapping the delete call in a try-except block to maintain the previous idempotent behavior and ensuring the path is passed as a string for compatibility.
☂️ Python Coverage
Overall Coverage
New FilesNo new covered files... Modified Files
|
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request optimizes S3 directory deletion by replacing a manual directory walk with the native recursive delete method from the s3fs file system. It also enhances the S3 retry logic by adding TransferEncodingError and ClientPayloadError to the retryable errors list and implementing custom handling for specific transient network error messages. A typo was identified in the error string comparison for 'not enough data to satisfy', which should be corrected to ensure the retry logic triggers as expected.
| return True | ||
| if ( | ||
| "connection was closed" in str(exception).lower() | ||
| or "not enough data for satisfy" in str(exception).lower() |
There was a problem hiding this comment.
There is likely a typo in the error string comparison. The common error message from underlying libraries like urllib3 or http.client is "not enough data to satisfy", not "for satisfy". Using the correct string will ensure the retry logic triggers as intended for these transient network errors.
| or "not enough data for satisfy" in str(exception).lower() | |
| or "not enough data to satisfy" in str(exception).lower() |
There was a problem hiding this comment.
Nope, the error message that we have seen reads "Not enough data for satisfy transfer length header"
Description:
Todos: