Fix Complete Rescan not clearing unselected metadata sources#3395
Conversation
Agent-Logs-Url: https://github.com/rommapp/romm/sessions/e76abf0d-7039-4dae-ad88-5f1f1c4f422f Co-authored-by: gantoine <3247106+gantoine@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Test Results (mariadb) 1 files 1 suites 4m 3s ⏱️ Results for commit a245ac7. ♻️ This comment has been updated with latest results. |
☂️ Python Coverage
Overall Coverage
New FilesNo new covered files... Modified Files
|
Test Results (postgresql) 1 files 1 suites 4m 29s ⏱️ Results for commit a245ac7. ♻️ This comment has been updated with latest results. |
There was a problem hiding this comment.
Pull request overview
This PR aims to ensure COMPLETE rescans don’t leave behind stale metadata from sources that are no longer selected, by explicitly clearing the corresponding *_id and *_metadata fields before applying newly-fetched metadata.
Changes:
- Refactors
metadata_handlersto carry each source’s handler output plus its associatedid_field/metadata_field. - Adds a COMPLETE-rescan pass that clears ID + metadata fields for sources not in the current
metadata_sources. - Updates metadata/artwork application logic to use the new
metadata_handlers[source]["handler"]structure.
Comments suppressed due to low confidence (1)
backend/handler/scan_handler.py:883
- On COMPLETE scans, fields that become “empty” (e.g.,
igdb_id=None,igdb_metadata={},url_screenshots=[]) are still skipped byif field_value:below, which means those attributes often won’t be present inrom_attrsandsession.merge()will retain the previous DB values. If the goal is to remove stale data, consider usingfield_value is not None(or key-specific rules) for IDs/metadata/artwork, or pre-initializing all known ID/metadata fields inrom_attrstoNone/{}so they always overwrite during COMPLETE rescans.
for source_name in reversed(priority_ordered):
handler_data = metadata_handlers[source_name]["handler"]
# Only update fields that have valid values
for key, field_value in handler_data.items():
if field_value:
rom_attrs[key] = field_value
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Hey, I don't think this is completely working yet. I was doing some testing on PR 3396 and noticed some of the "wrong" region assets that had previously been assigned because of PR 3367 were being retained. After sanity checking this wasn't a cache issue I'm becoming confident the issue is this PR. Everything below this sentence is AI generated to ideally capture all the technical aspects of this issue. |
The PR zeroes *_id and *_metadata for unselected sources, but url_cover, url_screenshots, and url_manual are seeded into rom_attrs from the existing ROM at the top of scan_rom and never explicitly cleared during a COMPLETE rescan. If a source that was the sole provider of url_cover gets deselected, the stale URL remains in the DB. More critically, path_cover_s and path_cover_l are also seeded from the existing ROM and never touched. The system prefers the on-disk path over url_cover, so even if url_cover were correctly cleared, the already-downloaded cover file would still be served. remove_cover() exists in resources_handler but is never called during a rescan.
This is the more fundamental issue. For a COMPLETE rescan, downloaded asset files — covers, screenshots, and SS extended media (miximage, bezel, box3d, logo, fanart, video) under resources/roms/<platform_id>/<rom_id>/ — should be deleted before the metadata fetch runs, regardless of whether the source remains selected. If they aren't, the cached files are reused even when the fetch returns a different URL. This directly breaks the region priority fix in #3396. Even with correct region ordering, a COMPLETE rescan won't replace an already-downloaded cover from the wrong region because the old file is still on disk and path_cover_s still points to it. remove_cover() and remove_media_resources_path() both exist in resources_handler but neither is called during a rescan. Suggested scope additions for a COMPLETE rescan: Delete all downloaded cover and extended media files before the metadata fetch runs |
During a Complete Rescan, games retain metadata from previously selected but currently unselected sources. This occurs because unselected metadata handlers return
Nonevalues, which are skipped by theif field_value:check when updatingrom_attrs.Changes
None/{}for any source not in the current selectionThis ensures the database is updated with explicit
NULLvalues for unselected sources rather than preserving stale data.Warning
Firewall rules blocked me from connecting to one or more addresses (expand for details)
I tried to connect to the following addresses, but was blocked by firewall rules:
get.trunk.io/usr/bin/curl curl REDACTED -fsSL e04/log.json -c y /tmp/runc-proces/run/containerd/io.containerd.runtime.v2.task/moby/d2b58081866b9080943200bee533dinit ntime.v2.task/mod2b58081866b9080943200bee533d31cc8ac412f6e167699bcb310204aaba7f2 31cc8ac412f6e167699bcb310204aaba7f2/log.json 31cc8ac412f6e167699bcb310204aaba7f2/d40aae1ecec0f2426d08cf0ded6454bd6b200cfc16ae531a��(dns block)trunk.io/usr/bin/wget wget --verbose --tries=3 --limit-rate=10M REDACTED --output-document /home/REDACTED/.cache/trunk/cli/tmp.nU9AYBvoNj/launcher/install/download-1.25.0.tar.gz rgo/bin/git 080943200bee533dbash(dns block)If you need me to access, download, or install something from one of these locations, you can either: