Skip to content

Partially revert lsjson#571

Merged
JoeZiminski merged 10 commits into
mainfrom
partially_revert_lsjson
Aug 26, 2025
Merged

Partially revert lsjson#571
JoeZiminski merged 10 commits into
mainfrom
partially_revert_lsjson

Conversation

@JoeZiminski
Copy link
Copy Markdown
Member

@JoeZiminski JoeZiminski commented Aug 5, 2025

This PR partially reverts #551 by using local filesystem searches when possible for speed improvement (avoiding process creation which is slow on Windows).

It essentially creates a duplicate function, search_local_filesystem that should have the same output as search_central_via_connection but does not call rclone. This is faster as a new process does not need to be made with subprocess.run(). This slowness is particularly noticeable during live-validation on the create-folders screen, it is so bad that is makes the user experience very poor. On Linux, it is less of an issue.

This solution is not at all ideal, we now have two functions doing the same thing. When all searches were done through search_central_via_connection, it meant that all local filesystem tests implicitly tested the behaviour ssh, aws, google drive tests. There are many more local filesystem transfer tests because these are much quicker to run. Therefore, this duplication is reducing test coverage. The solution is just to add a few more explicit tests to ssh, aws and google drive #577 and in this PR, perform a set of tests to check that the outputs of these functions are the same under many conditions.

I thought another potential solution in #551 is to create a process on datashuttle with POpen and hold it for the lifetime of the class, and call it when needed. However, as far as I can tell this is not how POpen works. It will also be a high burden and possible source of bugs to manage the lifetime of the process on the datashuttle class. So I think this is a better solution, although not ideal.

@JoeZiminski JoeZiminski marked this pull request as draft August 5, 2025 11:53
@JoeZiminski JoeZiminski force-pushed the partially_revert_lsjson branch from 9a39f9f to d7472a9 Compare August 6, 2025 15:24
@JoeZiminski JoeZiminski marked this pull request as ready for review August 21, 2025 09:30
@JoeZiminski JoeZiminski requested a review from cs7-shrey August 21, 2025 09:49
Copy link
Copy Markdown
Collaborator

@cs7-shrey cs7-shrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good @JoeZiminski. Just a few minor suggestions.

Comment thread tests/tests_integration/test_search_methods.py Outdated
Comment thread datashuttle/utils/folders.py Outdated
@JoeZiminski JoeZiminski requested a review from cs7-shrey August 25, 2025 20:46
@JoeZiminski
Copy link
Copy Markdown
Member Author

thanks @cs7-shrey !

@JoeZiminski JoeZiminski merged commit 102a691 into main Aug 26, 2025
16 checks passed
@JoeZiminski JoeZiminski deleted the partially_revert_lsjson branch August 26, 2025 16:24
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants