Add --retry-failed: re-download only the failed entries from a prior report by pchvykov · Pull Request #23 · roshanlam/iFetch

pchvykov · 2026-06-07T07:24:41Z

Motivation

After a large download, a handful of files often fail for transient reasons (network blips, throttling). Re-running the whole download re-walks the entire tree and re-verifies every already-good file, which is slow on large folders. --retry-failed targets only the files that failed.

What it does

Reads a prior download_report.json, picks out the entries with status == "failed", maps each failed local path back to its iCloud remote path (icloud_path is the remote root that local_path mirrors), and downloads just those - skipping the full-tree walk entirely. A fresh retry_<report>.json is written so a second pass can target anything still failing.

# default report path (<local_path>/download_report.json)
ifetch <icloud_path> <local_path> --retry-failed

# explicit report path
ifetch <icloud_path> <local_path> --retry-failed /path/to/download_report.json
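
For reviewers, the selection and path-mapping step described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the report layout (a top-level "files" list whose entries carry "path" and "status" keys) and the helper name failed_remote_paths are assumptions made for the example.

```python
import json
from pathlib import Path, PurePosixPath


def failed_remote_paths(report_file, icloud_root, local_root):
    """Return (remote_paths, unmapped) for the failed entries in a report.

    Assumed (hypothetical) report layout:
        {"files": [{"path": "<absolute local path>", "status": "failed"}, ...]}
    """
    report = json.loads(Path(report_file).read_text())
    # Resolve the local root so a symlinked root still matches stored paths.
    root = Path(local_root).resolve()

    remote, unmapped = [], []
    for entry in report.get("files", []):
        if entry.get("status") != "failed":
            continue  # already-good files are never touched
        local = Path(entry["path"]).resolve()
        try:
            rel = local.relative_to(root)
        except ValueError:
            # Path lies outside the local root: report it, do not crash.
            unmapped.append(entry["path"])
            continue
        # Remote paths are treated as POSIX-style, mirroring the local layout.
        remote.append(str(PurePosixPath(icloud_root, *rel.parts)))
    return remote, unmapped
```

With icloud_root set to Documents/Photos, a failed entry at <local_path>/a/b.jpg maps to Documents/Photos/a/b.jpg, while an entry outside <local_path> lands in the unmapped list so the caller can record it as still-failed.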

Notes

Local and stored paths are both .resolve()d before matching, so a symlinked/normalised local root (e.g. /tmp -> /private/tmp) still lines up with the absolute paths recorded in the report.
Paths that fall outside the local root, or remote paths that no longer resolve, are logged and recorded as still-failed rather than crashing the run.
_submit is refactored into a small reusable _submit_task(fn, *args) so the retry path reuses the exact same traversal-completion counting as the main download.
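
To illustrate the completion counting mentioned in the last note, here is a rough sketch of a shared bounded pool where tasks may enqueue further tasks and nobody blocks on children. The names _submit_task and _executor come from this PR; the BoundedRunner class, the _pending counter, the _idle event, and wait() are hypothetical stand-ins, and the real implementation may differ.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BoundedRunner:
    """Hypothetical stand-in for the downloader's single shared pool."""

    def __init__(self, max_workers=4):
        self._executor = ThreadPoolExecutor(max_workers=max_workers)
        self._lock = threading.Lock()
        self._pending = 0                # tasks submitted but not yet finished
        self._idle = threading.Event()   # set whenever _pending == 0
        self._idle.set()

    def _submit_task(self, fn, *args):
        # Count the task before submitting so the counter cannot reach zero
        # while a running parent is still enqueueing children.
        with self._lock:
            self._pending += 1
            self._idle.clear()
        future = self._executor.submit(fn, *args)
        future.add_done_callback(self._task_done)
        return future

    def _task_done(self, _future):
        # Runs once per finished task, whether it succeeded or raised.
        with self._lock:
            self._pending -= 1
            if self._pending == 0:
                self._idle.set()

    def wait(self):
        # Block until every submitted task (and its descendants) is done.
        self._idle.wait()
        self._executor.shutdown(wait=True)
```

Because a parent increments the counter for each child before its own done-callback fires, the count only reaches zero once the whole tree of work has drained. A retry path can push its individual files through the same _submit_task and rely on the same wait().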

Testing

Full test suite passes (114 passed).
Verified end to end: only failed entries are retried, already-completed files are skipped, and unresolvable paths are recorded as still-failed.

Dependency

⚠️ This is stacked on #22 - it reuses the shared bounded pool introduced there (_submit_task / _executor / completion counting). Until #22 is merged, the diff here includes that commit too. Please merge #22 first; I'll rebase this onto main afterward for a clean diff.

Co-Authored-By: Claude <noreply@anthropic.com>

… + O(n^2) metadata write

Two issues compounded to make large downloads get progressively slower the deeper/larger the tree:

1. Unbounded concurrency. process_item_parallel spawned a fresh ThreadPoolExecutor per directory and parent threads blocked on their children, so live concurrency multiplied with tree depth (observed ~24 concurrent transfers with max_workers=4). That saturated the shared HTTP pool (connection churn -> CLOSE_WAIT pileup) and tripped iCloud server-side throttling.

Now there is a single global ThreadPoolExecutor for the whole traversal. Directory workers submit their children onto the same pool via _submit and return immediately instead of blocking (completion is counted via future done-callbacks, so there is no pool-starvation deadlock). The HTTPAdapter is sized to max_workers with pool_block=True to stop connection churn.

2. O(n^2) version-metadata write. download_drive_item rewrote the entire .ifetch_versions.json under a global lock after every file, serializing all download threads. The in-memory map is now updated per-file and flushed once at the end of download().

Verified: full test suite (114 tests) passes; smoke test over a 484-file / 5-level tree drains with peak concurrency == max_workers.

Co-Authored-By: Claude <noreply@anthropic.com>

…rior report

After a large download, a handful of files often fail (transient network / throttling). Re-running the whole download re-walks the entire tree and re-verifies every already-good file, which is slow on large folders.

--retry-failed reads a prior download_report.json, maps each "failed" local path back to its iCloud remote path (icloud_path is the remote root that local_path mirrors), and downloads only those, reusing the shared bounded pool. Paths are resolved on both sides so a symlinked local root (e.g. /tmp -> /private/tmp) still matches. A retry_<report>.json is written for a follow-up pass.

Usage:
ifetch <icloud_path> <local_path> --retry-failed
ifetch <icloud_path> <local_path> --retry-failed /path/to/report.json

_submit is refactored into a reusable _submit_task so the retry path reuses the same traversal-completion counting as the main download.

Verified: full test suite (114 tests) passes; --retry-failed retries only failed entries, skips completed ones, and records unresolvable paths as still-failed.

Co-Authored-By: Claude <noreply@anthropic.com>

pchvykov and others added 2 commits June 7, 2026 14:20

pchvykov wants to merge 2 commits into roshanlam:main from pchvykov:feat-retry-failed
