Fix overwrite conflict report generation to correctly handle deletions#223
Conversation
Add save-aware dataframe utilities and deletion-aware conflict detection. Introduces canonical_keypoint_columns_from_header, save_index_from_points_metadata, complete_df_for_save and merge_save_df to normalize headers, expand the save scope (inserting NaNs for deleted keypoints) and merge new saves with existing DLC files while preserving out-of-scope rows. keypoint_conflicts now supports include_deletions to report destructive deletions (old present, new absent) and combines overwrite+deletion checks. Integrate these behaviors into compute_overwrite_report_for_points_save and write_hdf, and update imports to include PointsMetadata and the new helpers.
Move _drop_likelihood_columns from core/io.py into core/dataframes.py as drop_likelihood_columns and update callers/tests to use the new location. Add _keypoint_group_levels and keypoint_deletions to detect destructive keypoint removals, refactor keypoint_conflicts to delegate deletion detection, and have compute_overwrite_report_for_points_save include deletion info. Extend build_overwrite_conflict_report and models (ConflictEntry, OverwriteConflictReport) to surface deleted_keypoints and n_deletions, prefer reporting deletions over overwrites when both occur, and update has_conflicts to account for deletions.
Refactor and enhance the overwrite-conflicts UI and report handling. - Use OverwriteConflictReport from config.models (import updated). - Add helpers to build plain-text details, HTML summary, and affected summary from reports (with backward compatibility to older report models). - Make dialog content richer: enable rich-text summaries, selectable labels, configurable details label and confirm button text, and variable default button behavior for dangerous deletions. - Compute dynamic dialog title and texts in maybe_confirm_overwrite (handle deletions specially), escape context strings, and rely on content-driven sizing. These changes improve clarity around keypoint deletions vs overwrites and make the dialog safer and more informative for users.
Import QPalette and add a helper to compute an inverted parent background color. Insert a thin separator line in OverwriteConflictsDialog using that color. Update conflict summary wording to be shorter/clearer, and adjust confirmation button and details label text/formatting (use italics for the details header). Minor UI copy and layout cleanups.
Refactor keypoint_deletions to use harmonize_keypoint_row_index and harmonize_keypoint_column_index. The old align_old_new flow was replaced by explicit reindexing of old/new frames to the new save scope (index and columns) so deletion detection is restricted to the current scope. Minor docstring/comment cleanup and formatting adjustments; logic for grouping by keypoint levels and computing deletions is unchanged.
Refactor tests in core/test_conflicts.py to exercise deletion conflict handling and dataframe completion logic. Introduces a simple _HeaderStub and updates _stub_validation_pipeline to return a completed_df and monkeypatch complete_df_for_save (and captures its inputs). Replaces references to dataframes_mod with conflicts_mod for functions under test (keypoint_conflicts, keypoint_deletions, build_overwrite_conflict_report) and updates expectations accordingly. Adds new tests to ensure deletion conflicts are passed to the report builder and to cover extracted-labels overwrite-report behavior and fallbacks.
Extend dataframe tests to cover complete_df_for_save, merge_save_df, keypoint_deletions, build_overwrite_conflict_report, and drop_likelihood_columns. New tests verify expansion of sparse napari data to full save scope, dropping of likelihood cols, merging behavior that preserves rows/columns outside save scope and respects intentional NaN deletions, detection/harmonization of deletions across basename vs deep-path indices, and conflict report counting of overwrites vs deletions.
deruyter92
left a comment
There was a problem hiding this comment.
Good fix, that solves the issue.
I left quite some generic comments, keeping in mind that you are planning a refactor. I agree that this is a worthwhile effort.
One important remark:
Please (if not there) consider implementing a functional test for write_hdf, rather than only unit-testing the helpers: delete keypoint → write_hdf → read back → deleted point is NaN. This is the only strong guarantee that write_hdf will not produce a similar bug in the future.
Otherwise: good to go! Would be nice to have this fix shipped before the full refactor.
Add a details_text property to OverwriteConflictReport that constructs plain-text conflict details (frame labels, modified and deleted keypoints, truncated-entry notice) and trims trailing whitespace. Remove the duplicate _conflict_details_text helper from the UI dialogs and use report.details_text when building the overwrite confirmation dialog, consolidating formatting logic in the model.
Handle cases where path coercion yields unresolved items by converting any integer indexes back to the original paths for display, building a preview of up to 10 unresolved paths, and showing a QMessageBox warning. The save is cancelled (returns None, True) to avoid writing ambiguous or unrelated image paths into the CollectedData file; the dialog asks the user to report the issue with the debug log.
Enhance the test overwrite confirmation helper to capture canonical fields (n_overwrites, n_deletions, n_frames, entries, report) and provide backwards-compatible aliases (n_pairs, n_images). Update forbid-mode logic to treat any destructive change (overwrites or deletions) as a failure. Refactor tests to exercise the plugin save path (use keypoint_controls._save_layers_dialog) and guard against falling back to generic napari dialogs, promote config-placeholder layers to real annotation layers, and create an overwrite conflict to assert confirmation is requested. Add a new end-to-end test that verifies deletion preflight triggers when an existing saved keypoint is removed and that the saved H5 reflects the deletion.
Replace the SimpleNamespace fake report with OverwriteConflictReport and ConflictEntry imports, constructing a typed report with two entries and n_deletions=0. Update the test expectations to match the dialog's new output: include the dialog title, formatted multi-line details, details label text, confirm button text, and the dangerous_default_cancel flag.
Add robust helpers and assertions to end-to-end tests for saving project-less labeled folders. Introduces _get_single_point_label_at_xy and _assert_points_layer_has_label_xy to inspect plugin-visible point labels and coordinates, and uses the plugin add wrapper to record the actual assigned label. Update test_projectless_folder_save_can_associate_with_config_and_coerce_paths_to_d to verify normalized metadata is persisted, to reopen the produced H5 via the plugin, and to assert the same plugin-visible keypoint label/coordinates survive save+readback (removing reliance on low-level HDF row-index checks). Improve docstrings, monkeypatching, waits and error messages. Add a new test (test_projectless_folder_save_refuses_unresolved_outside_paths) that ensures saves are refused when some image paths cannot be safely rewritten into DLC dataset row keys and that the user is warned; also assert no GT files are created in either project or external folders.
Add a post-harmonization check in merge_save_df to ensure df_old and df_new share row-index labels after harmonization and raise a ValueError if they do not. This prevents silent preservation of existing labels when an overwrite/delete was expected. Also add two tests: one ensuring NaNs in df_new clear old values after harmonizing basename vs path row keys, and another asserting merge_save_df refuses merges with no row-index overlap.
Replace ad-hoc header manipulation with harmonize_keypoint_column_index(df) and canonical_keypoint_columns_from_header(ctx.meta.header). This centralizes keypoint column handling, aligns DataFrame columns with the canonical header MultiIndex before reindexing, and removes the previous commented logic for inserting/dropping the individuals level. Logging was adjusted to use the computed hdr_cols to avoid repeated header conversions.
Add test_write_hdf_gt_merge_persists_deleted_keypoint to verify the writer persists deleted keypoints when merging with an existing GT HDF. The test creates an existing CollectedData_me.h5 containing two keypoints (nose, tail), then provides metadata/points for only the tail and ensures write_hdf_napari_dlc writes NaN for the removed nose coordinates while updating the tail (converting napari [y,x] to DLC [x,y]). Uses DLCHeaderModel, guarantee_multiindex_rows, and _add_source_io to recreate the GT provenance and asserts the expected HDF contents.
deruyter92
left a comment
There was a problem hiding this comment.
Looks good @C-Achard! Thanks for the improvements!
This is definitely a good fix, and I would say the open comments above are minor topics for future refactors.
|
@deruyter92 I do apologize, I had replied to all the comments properly but it seems this created a broken pending review from the VSCode extension and I now cannot post it or answer comments... Not sure what went wrong, happy to discuss things further if needed however. Edit: I can post replies but they show as pending. Let me know if you can't see them, I added for clear context what I remember of the previous replies |
Motivation
Fixes #221 by adding logic to handle keypoint deletion properly and provide relevant GUI wiring to keep reports meaningful.
Scope