Commit 0941304

Author: Gereon Elvers
Radboud: per-file completion log so small files are visible

Previously the prefetch_files banner said "Downloading 12 files" but only files >= 10 MB got a per-byte progress bar (everything else downloaded silently to avoid bar spam). The result looked like only one file was actually transferring even though all 12 were.

Now every completed download prints a one-line entry:

    Radboud WebDAV: downloading 12 file(s)…
     [1/12] BadChannels (0 B)
     [2/12] sub-A2002_task-rest_meg.meg4 (530 MB)
     ...
    Radboud WebDAV: done (12/12 files).

The per-byte tqdm bar still shows for files >= 10 MB (the .meg4 chunk is what the user actually wants to watch).

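
The sample output above shows sizes such as "0 B" and "530 MB", which the diff produces through a `_format_bytes` helper that this commit does not show. As a rough sketch of what such a helper might look like, the `format_bytes` function below is hypothetical: the decimal (1000-based) units, the whole-number rounding, and the "unknown size" text for a missing size are all assumptions, not the repository's actual behaviour.

```python
from typing import Optional


def format_bytes(size: Optional[int]) -> str:
    """Render a byte count as a short human-readable string.

    Hypothetical stand-in for the `_format_bytes` helper called in the diff;
    the real helper's units and rounding are not visible in this commit.
    """
    if size is None:
        # Assumption: how a missing size is rendered is not shown in the source.
        return "unknown size"
    units = ["B", "KB", "MB", "GB", "TB"]
    value = float(size)
    for unit in units:
        # Stop at the first unit that keeps the number below 1000,
        # or at the largest unit we know about.
        if value < 1000 or unit == units[-1]:
            return f"{value:.0f} {unit}"
        value /= 1000
    return f"{value:.0f} {units[-1]}"  # not reached; keeps type checkers happy
```

With these assumptions, `format_bytes(0)` and `format_bytes(530_000_000)` give the two size strings shown in the sample log.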
1 parent 743ff24 commit 0941304

1 file changed

Lines changed: 38 additions & 11 deletions

pnpl/datasets/mixins/radboud_download.py

@@ -159,17 +159,44 @@ def _join_url(cls, rel_path: str) -> str:

     def prefetch_files(self, file_paths: list[str]) -> None:
         """Prefetch multiple files in parallel (skips already-present)."""
-        futures = []
-        for fpath in {p for p in file_paths if not os.path.exists(p)}:
-            futures.append(self._schedule_download(fpath))
-        if futures:
-            print(f"Downloading {len(futures)} files from Radboud WebDAV...")
-            for future in futures:
-                try:
-                    future.result()
-                except Exception as e:
-                    print(f"Error downloading a file: {e}")
-            print("Done!")
+        # De-dupe + drop already-present files, but keep a stable order
+        # so the per-file completion log reads naturally.
+        seen: set[str] = set()
+        targets: list[str] = []
+        for fpath in file_paths:
+            if fpath in seen or os.path.exists(fpath):
+                continue
+            seen.add(fpath)
+            targets.append(fpath)
+        if not targets:
+            return
+
+        n = len(targets)
+        type(self)._log(f"Radboud WebDAV: downloading {n} file(s)…")
+
+        # Schedule the whole batch up front so the executor starts
+        # parallel downloads before we begin waiting on results.
+        scheduled = [(fpath, self._schedule_download(fpath)) for fpath in targets]
+
+        completed = 0
+        for fpath, future in scheduled:
+            try:
+                future.result()
+                completed += 1
+                # Always emit a per-file completion line — small files
+                # don't get a per-byte progress bar (we suppress it for
+                # anything < 10 MB to avoid bar spam) but users still
+                # need to see the batch advancing.
+                size = os.path.getsize(fpath) if os.path.exists(fpath) else None
+                size_str = type(self)._format_bytes(size)
+                type(self)._log(
+                    f" [{completed}/{n}] {os.path.basename(fpath)} ({size_str})"
+                )
+            except Exception as exc:
+                type(self)._log(
+                    f" [error] {os.path.basename(fpath)}: {exc}"
+                )
+        type(self)._log(f"Radboud WebDAV: done ({completed}/{n} files).")

     def ensure_file(self, fpath: str) -> str:
         """Ensure a file exists locally, downloading via WebDAV if needed."""

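The pattern in the new `prefetch_files` body (order-preserving de-duplication, scheduling the whole batch, then waiting on each future in turn) can be tried outside the mixin. The sketch below is a standalone approximation, not the project's code: `fake_download` and `prefetch` are hypothetical names, a `ThreadPoolExecutor` is assumed in place of whatever `_schedule_download` uses internally, and log lines are collected in a list rather than passed to `_log`.

```python
import os
import tempfile
import time
from concurrent.futures import Future, ThreadPoolExecutor


def fake_download(path: str) -> None:
    """Hypothetical stand-in for a WebDAV transfer: wait, then write 10 bytes."""
    time.sleep(0.01)
    with open(path, "wb") as fh:
        fh.write(b"x" * 10)


def prefetch(paths: list[str], executor: ThreadPoolExecutor) -> list[str]:
    """Fetch missing files in parallel and return the log lines produced."""
    log: list[str] = []
    # De-duplicate and skip files already on disk, keeping input order.
    seen: set[str] = set()
    targets: list[str] = []
    for p in paths:
        if p in seen or os.path.exists(p):
            continue
        seen.add(p)
        targets.append(p)
    if not targets:
        return log

    n = len(targets)
    log.append(f"downloading {n} file(s)")
    # Submit everything first so transfers overlap while we wait below.
    scheduled: list[tuple[str, Future]] = [
        (p, executor.submit(fake_download, p)) for p in targets
    ]
    completed = 0
    for p, fut in scheduled:
        try:
            fut.result()  # blocks until this particular file is finished
            completed += 1
            log.append(f"[{completed}/{n}] {os.path.basename(p)} ({os.path.getsize(p)} B)")
        except Exception as exc:
            log.append(f"[error] {os.path.basename(p)}: {exc}")
    log.append(f"done ({completed}/{n} files)")
    return log


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp, ThreadPoolExecutor(max_workers=2) as ex:
        wanted = [os.path.join(tmp, "a.bin"), os.path.join(tmp, "b.bin")]
        print("\n".join(prefetch(wanted + wanted[:1], ex)))
```

Because the loop waits on futures in submission order, completion lines appear in list order rather than finish order, so a slow first file delays the lines for faster files behind it. Iterating with `concurrent.futures.as_completed` would report files as they finish, at the cost of a less predictable log order.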