Hopefully solving several points: #2223
1. Containers not removed
EDIT: still happening with MOT20 challenge:
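One way to make container cleanup reliable is to combine `--rm` with an unconditional force-remove. A minimal sketch (hypothetical helper, not the current compute_worker code; `runner` is injectable so the cleanup logic can be exercised without a Docker daemon):

```python
import subprocess

def build_run_command(image, name, cmd):
    # `--rm` asks the Docker daemon to delete the container when it exits.
    return ["docker", "run", "--rm", "--name", name, image, *cmd]

def run_container(image, name, cmd, runner=subprocess.run):
    """Run a submission container; force-remove it even if the run raised."""
    try:
        return runner(build_run_command(image, name, cmd))
    finally:
        # Belt and braces for containers that outlive their run
        # (e.g. the worker was killed mid-submission).
        runner(["docker", "rm", "-f", name])
```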
2. Wrong log when storage is full
When docker pull fails because of full storage, we have no clear logs. See:
- Submission failed: "scoring_hostname-worker9" #2206
- Compute worker - Docker image not found #2217
To do:
- Have the right error logs, and have them on the platform's UI
- Detect errors in `_get_container_image()`
Then the submission gets stuck in the Running state. Solved by #2223.
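One way to detect the real cause is to inspect the stderr of the failed pull. A minimal sketch of what `_get_container_image()` could do (the helper name and messages below are illustrative, not current compute_worker code; the matched substrings are the ones Docker typically prints for these failures):

```python
def classify_pull_error(stderr: str) -> str:
    """Map `docker pull` stderr to a human-readable cause (sketch)."""
    s = stderr.lower()
    if "no space left on device" in s:
        return "Docker pull failed: worker disk is full"
    if "manifest unknown" in s or "not found" in s or "pull access denied" in s:
        return "Docker pull failed: image does not exist or is private"
    if "timeout" in s or "temporary failure" in s:
        return "Docker pull failed: network problem reaching the registry"
    # Fall back to showing the raw error instead of hiding it.
    return "Docker pull failed: " + stderr.strip()
```

The returned string could then be logged and forwarded to the platform UI instead of the generic "non-zero exit code" message.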
3. Progress bar
Related: `show_progress` and the progress bar add to the mess:
- Make `show_progress()` more robust (not treating missing keys as errors)
- Avoid printing repeated error lines like these (Compute worker - Improve status update and logs #2223):

```
2026-02-28 02:38:37.854 | ERROR | compute_worker:show_progress:137 - There was an error showing the progress bar
2026-02-28 02:38:37.854 | ERROR | compute_worker:show_progress:138 - 6
2026-02-28 02:38:37.955 | ERROR | compute_worker:show_progress:137 - There was an error showing the progress bar
2026-02-28 02:38:37.955 | ERROR | compute_worker:show_progress:138 - 1
```
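A sketch of a more robust `show_progress` (assuming the input is one JSON object from Docker's pull progress stream, where `progressDetail`, `current` and `total` are frequently absent; this is illustrative, not the current implementation):

```python
import logging

logger = logging.getLogger("compute_worker")

def show_progress(status: dict) -> None:
    """Render pull progress without erroring on partial status lines."""
    detail = status.get("progressDetail") or {}
    current, total = detail.get("current"), detail.get("total")
    if not current or not total:
        return  # no progress info on this line -- not an error
    pct = 100.0 * current / total
    logger.debug("%s: %.1f%% (%d/%d)",
                 status.get("id", "?"), pct, current, total)
```

Missing keys simply mean "nothing to display for this line", so the function returns silently instead of logging an ERROR per line.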
4. Logs
5. No space left
How should we manage the disks? Should we limit Docker image sizes?
We could run a prune when `docker pull` hits the storage limit:
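The prune-and-retry idea could look like this (a hypothetical helper, not current compute_worker code; `docker system prune -f` removes stopped containers, dangling images and build cache, which is often enough for the retried pull to succeed — the `runner` parameter is injectable so the flow can be tested without a Docker daemon):

```python
import subprocess

def pull_with_prune_retry(image, runner=subprocess.run):
    """Try `docker pull`; on a disk-full error, prune once and retry (sketch)."""
    for attempt in range(2):
        result = runner(["docker", "pull", image],
                        capture_output=True, text=True)
        if result.returncode == 0:
            return True
        if attempt == 0 and "no space left on device" in result.stderr.lower():
            # Free disk space, then retry the pull once.
            runner(["docker", "system", "prune", "-f"],
                   capture_output=True, text=True)
            continue
        return False
    return False
```

Note that `prune -f` only reclaims unused objects; if the disk is filled by images that are still tagged and in use, a size limit or an eviction policy would still be needed.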
6. Option for container shared memory
- `shm-size` as a `compute_worker.env` setting
More details here:
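A sketch of how such a setting could be translated into `docker run` flags (`SHM_SIZE` is a hypothetical `compute_worker.env` variable, not an existing Codabench setting; when unset, Docker's 64 MB `/dev/shm` default applies):

```python
import os

def container_run_flags(env=os.environ):
    """Build extra `docker run` flags from an optional SHM_SIZE setting."""
    flags = []
    shm_size = env.get("SHM_SIZE")  # e.g. "2g"
    if shm_size:
        # `--shm-size` is a standard `docker run` option.
        flags += ["--shm-size", shm_size]
    return flags
```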
7. Submissions not marked as Failed
Submissions get stuck in "Running" or "Scoring" status.
Related issues:
- Submissions stuck in "Running" or failing on compute worker ("non-zero return code") #2258 (grouped issue)
- Docker pull fails + wrong status #1203
- Submission in "Scoring" status for multiple hours on default queue #1184
- Failed statuses not updated #1257
- Submissions fail silently #1821
- Problem with BEA 2019 Shared Task submissions #1994
- Submissions are not reliably working #2169
- Continuously stuck in the "Running" state #2177
Similarly, it looks like the status gets stuck at "Preparing" when the failure happens during that step.
Example failure during "Preparing":
```
[2025-09-18 11:25:05,234: ERROR/ForkPoolWorker-2] Task compute_worker_run[fd956bf5-3e2d-4168-ab48-f0896dc80993] raised unexpected: OSError(28, 'No space left on device')
Traceback (most recent call last):
[...]
OSError: [Errno 28] No space left on device
```
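One fix pattern is to wrap every stage so an exception always flips the submission to Failed instead of leaving it in "Preparing" or "Running" forever. A sketch (`update_status` is a hypothetical callback posting the status back to the platform):

```python
def run_stage(stage_name, stage_fn, update_status):
    """Run one submission stage; never leave the status dangling (sketch)."""
    update_status(stage_name)
    try:
        stage_fn()
        return True
    except Exception as exc:
        # Any failure -- including OSError 28 above -- marks the
        # submission Failed with the stage and cause attached.
        update_status("Failed", reason=f"{stage_name}: {exc}")
        return False
```

A timeout-based watchdog on the platform side would still be needed for workers that die outright and never send the final update.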
8. Duplication of submission files
9. Scoring and ingestion only work without directory structure
A classic CodaLab and Codabench bug: if the scoring program or ingestion program is inside a folder in the zip, the submission fails.
We need either to:
Related issues:
Tentative fix #1905 got reversed by #1946.
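One possible fix is to tolerate a wrapper directory when locating the program after extraction. A sketch, assuming the program is identified by an entry file (here called "metadata" for illustration; Codabench may key on a different name):

```python
from pathlib import Path

def find_program_dir(extracted: Path, entry_point: str = "metadata") -> Path:
    """Locate the program inside an extracted zip, tolerating wrapper dirs."""
    direct = extracted / entry_point
    if direct.exists():
        return extracted  # program is at the zip root, as expected
    # Otherwise search subdirectories and pick the shallowest match, so a
    # single wrapper folder (the common case) is handled deterministically.
    matches = sorted(extracted.rglob(entry_point), key=lambda p: len(p.parts))
    if matches:
        return matches[0].parent
    raise FileNotFoundError(f"no {entry_point!r} found under {extracted}")
```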
10. To check: log level
The log level is defined this way in compute_worker.py:

```python
configure_logging(
    os.environ.get("LOG_LEVEL", "INFO"), os.environ.get("SERIALIZED", "false")
)
```

Generally we want as many logs as possible, so we may want to default to the "DEBUG" log level.
Related:
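A stdlib sketch of a DEBUG-by-default setup (the real compute_worker has its own `configure_logging`; `serialized` is accepted here only to mirror its signature and is unused, and the log format merely approximates the worker's output):

```python
import logging
import os
import sys

def configure_logging(level_name: str, serialized: str) -> None:
    """Configure root logging, falling back to DEBUG on unknown levels."""
    level = getattr(logging, level_name.upper(), logging.DEBUG)
    logging.basicConfig(
        stream=sys.stderr,
        level=level,
        format="%(asctime)s | %(levelname)s | %(module)s:%(funcName)s:%(lineno)d - %(message)s",
        force=True,  # reconfigure even if logging was already set up
    )

# Same call shape as in compute_worker.py, but defaulting to DEBUG, not INFO:
configure_logging(os.environ.get("LOG_LEVEL", "DEBUG"),
                  os.environ.get("SERIALIZED", "false"))
```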
11. Docker pull failing

```
Pull for image: codalab/codalab-legacy:py39 returned a non-zero exit code! Check if the docker image exists on docker hub.
```

Related issues:
Solution: print more logs in the logger in compute_worker.py (More logs when docker pull fails in compute_worker.py #1283).
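The solution could be as simple as logging the pull's actual stderr alongside the exit code. A sketch (not the current compute_worker code; `runner` is injectable so the error path can be tested without a Docker daemon):

```python
import logging
import subprocess

logger = logging.getLogger("compute_worker")

def docker_pull(image, runner=subprocess.run):
    """Pull an image, surfacing the real stderr on failure (sketch)."""
    result = runner(["docker", "pull", image], capture_output=True, text=True)
    if result.returncode != 0:
        # Log the cause, not just "non-zero exit code".
        logger.error("docker pull %s failed (exit %d): %s",
                     image, result.returncode, result.stderr.strip())
        return False
    return True
```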
12. Logs at the wrong place
Solved by: Show error in scoring std_err #1214
13. No hostname in server status when status is "Preparing"
https://www.codabench.org/server_status
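One option is for the worker to report its hostname already in the first status update it sends. A sketch (field names are illustrative, not the actual Codabench status API):

```python
import socket

def initial_status_payload(submission_id: int) -> dict:
    """Build the first ("Preparing") status update, hostname included."""
    return {
        "submission_id": submission_id,
        "status": "Preparing",
        # Report the worker identity from the very first update, so
        # /server_status can show which worker picked up the job.
        "hostname": socket.gethostname(),
    }
```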