Skip metadata assignment for untyped divisions #6931
This might fix only part of the issue. As outlined by @michaelkubina in #4362 (comment), Kitodo also seems to inject default values (presets) into the untyped metadata divisions. It would have to be traced where exactly this happens (process creation? Save in the metadata editor?).
public void preserve() throws InvalidMetadataValueException, NoSuchMetadataFieldException {
    try {
        if (isDivisionUntyped()) {
            logger.warn("Skipping metadata preservation for untyped division.");
How helpful is this warning message? It carries no context information, so neither an administrator nor any user with access to the log files can act on it or inform anyone. Maybe the user should be informed about this case in the UI instead.
You are right. I am trying to address a behaviour that should be fixed by the application, but logging it seems like overkill. I think "preserve" is called far too often, so the user would also be annoyed by those messages.
After discussing the problems with @michaelkubina: the unwanted enrichments happen when an institution has defined metadata keys as "always showing" and with a default preset, e.g.:
<key id="docType">
    <label>Document Type</label>
    <label lang="de">Dokumenttyp</label>
    <option value="monograph">
        <label>Monograph</label>
        <label lang="de">Monographie</label>
    </option>
    <option value="multivolume_work">
        <label>Multivolume Work</label>
        <label lang="de">Mehrbändiges Werk</label>
    </option>
    <preset>monograph</preset>
</key>
and:
<setting key="docType" editable="true" alwaysShowing="true"/>
When you have those keys defined and select one of the divisions without a type in the metadata editor view of the issue, those default values are injected in the UI.
Those UI values are then preserved in the meta.xml upon save. To quote @michaelkubina:
When this setting is defined in the ruleset:
<restriction division="" unspecified="forbidden">
    <permit division="page"/>
    <permit division="track"/>
    <permit division="other"/>
</restriction>
this does not happen, because the fields in question are not actually rendered in the UI. My fix does not prevent them from appearing in the UI, but it prevents those values from being serialized into the meta.xml, as untyped divisions are skipped when the metadata is preserved. This PR therefore introduces a safety net for institutions which do not have the necessary ruleset rules defined.
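The safety net described above can be pictured with a minimal Java sketch. It is not the code of this PR: `SketchDivision`, `PreserveGuardSketch` and the map of UI values are hypothetical stand-ins, and the only idea taken from the discussion is that preservation returns early when a division has no type.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a logical division; not Kitodo's actual class.
final class SketchDivision {
    private final String type; // null or blank is treated as "untyped"
    private final Map<String, String> metadata = new LinkedHashMap<>();

    SketchDivision(String type) {
        this.type = type;
    }

    boolean isUntyped() {
        return type == null || type.trim().isEmpty();
    }

    Map<String, String> getMetadata() {
        return metadata;
    }
}

final class PreserveGuardSketch {
    /**
     * Copies the values currently shown in the editor into the division,
     * unless the division is untyped. Returns true if values were written.
     */
    static boolean preserve(SketchDivision division, Map<String, String> uiValues) {
        if (division.isUntyped()) {
            // Assumed safety net: nothing is stored, so nothing can be serialized later.
            return false;
        }
        division.getMetadata().putAll(uiValues);
        return true;
    }
}
```

With this guard, a preset such as `docType=monograph` that is visible in the UI would still be dropped for an untyped division on save, while typed divisions behave as before.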
My commit 1f0d82a goes one step further. If the division is untyped, we should not inject metadata in the UI layer either. Untyped divisions no longer show any data that might get serialized into the meta.xml.
I am not entirely sure whether I am introducing a behaviour change here. It might be that people actually assign metadata to the untyped divisions. This is of course only possible if the metadata is actually rendered in the UI and preserved on save. My fix basically enforces that untyped divisions do not get any metadata assigned. Maybe @andre-hohmann can comment here.
This does of course not exclude the option to enrich metadata here on export or via XSL. See: #4362 (comment)
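As a rough illustration of the UI-layer idea from commit 1f0d82a, the following Java sketch computes the fields an editor would pre-fill for a division. All names (`SketchKeySetting`, `PresetSketch`, `initialFields`) are hypothetical and do not come from Kitodo; the sketch only assumes that keys marked as always showing and carrying a preset are pre-filled, and that untyped divisions get nothing.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical view of one <setting>/<key> pair from a ruleset.
final class SketchKeySetting {
    final String key;
    final boolean alwaysShowing;
    final String preset; // may be null when no preset is defined

    SketchKeySetting(String key, boolean alwaysShowing, String preset) {
        this.key = key;
        this.alwaysShowing = alwaysShowing;
        this.preset = preset;
    }
}

final class PresetSketch {
    /** Returns the fields to pre-fill in the editor for a division of the given type. */
    static Map<String, String> initialFields(String divisionType, List<SketchKeySetting> settings) {
        Map<String, String> fields = new LinkedHashMap<>();
        if (divisionType == null || divisionType.trim().isEmpty()) {
            return fields; // assumed rule: untyped divisions show no metadata at all
        }
        for (SketchKeySetting setting : settings) {
            if (setting.alwaysShowing && setting.preset != null) {
                fields.put(setting.key, setting.preset);
            }
        }
        return fields;
    }
}
```

Because nothing is rendered for an untyped division in this sketch, there is also nothing that a later save could pick up.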
I hope I understood correctly that "untyped divisions do not get any metadata assigned" refers to both:
Regarding 1 (Manual): Regarding 2 (Automatic): Wouldn't this be solved if the container levels were eliminated? See: #4362 (comment)
I agree, but I am not really sure what this means exactly and what the consequences would be. But thanks a lot for your comment. Given what you say, my changes are probably too radical, as I think they would break your existing workflows. On the other hand, the current behaviour of metadata injection can have really destructive consequences. I have to think more about this.
Prevent metadata from being written to structural container nodes of type page without a TYPE. Recursion is preserved, but only semantic divisions receive metadata. Fixes unintended DMDSEC creation.
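The commit message above can be read as the following traversal, shown here as a hedged Java sketch. `SketchNode` and `RecursiveAssignSketch` are hypothetical names and the tree model is simplified; the sketch only demonstrates that recursion continues through an untyped container while the container itself receives no metadata.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical tree node; a null or blank type marks a purely structural container.
final class SketchNode {
    final String type;
    final List<SketchNode> children = new ArrayList<>();
    final Map<String, String> metadata = new LinkedHashMap<>();

    SketchNode(String type) {
        this.type = type;
    }

    SketchNode add(SketchNode child) {
        children.add(child);
        return this;
    }

    boolean isUntyped() {
        return type == null || type.trim().isEmpty();
    }
}

final class RecursiveAssignSketch {
    /** Writes key=value to every typed node in the subtree and returns how many nodes were written. */
    static int assign(SketchNode node, String key, String value) {
        int written = 0;
        if (!node.isUntyped()) {
            node.metadata.put(key, value);
            written++;
        }
        // Recursion is kept even for untyped containers so typed descendants are still reached.
        for (SketchNode child : node.children) {
            written += assign(child, key, value);
        }
        return written;
    }
}
```

In this model an untyped container ends up with an empty metadata map, so no descriptive metadata section would have to be created for it.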
This pull request addresses the issue that Kitodo currently often injects unwanted metadata into the METS file. The issue has been described in different places, e.g. #4362 (comment) or #6024 (comment).
The problem is that it is currently not mandatory to define special rules in the ruleset which prevent the uncontrolled insertion of e.g. the processTitle:
If the institution does not have those rules in place, page elements or unspecified (untyped) elements which are created for newspaper issues might get unwanted metadata injections.
My fix therefore does two things:
If both things are wanted, it has to be implemented in a safer way. We cannot guarantee that all institutions know about those settings and have them in place. The behavior should therefore by default prevent unwanted insertions.