Digitized theses workflow retrieves item handles for synced replacement theses#244
Conversation
Coverage Report for CI Build 26885385138Warning No base build found for commit Coverage: 83.35%Details
Uncovered Changes
Coverage RegressionsRequires a base build to compare against. How to fix this → Coverage Stats
💛 - Coveralls |
28a66b2 to
a239203
Compare
a239203 to
db736f0
Compare
32df1bf to
768c9a1
Compare
db736f0 to
7c55b01
Compare
There was a problem hiding this comment.
Pull request overview
Updates the digitized theses workflow's synced-batch path to retrieve a DSpace handle for replacement theses by calling _get_item_from_dspace, and refactors the item submission construction to assign status/status_details per thesis type. Also replaces string status literals with ItemSubmissionStatus enum values in _create_batch_in_s3.
Changes:
- In
_get_item_submissions_from_synced_batch, buildItemSubmissionfirst, then look up DSpace handle for replacement theses and set status/details accordingly. - Switch hard-coded status strings to
ItemSubmissionStatusenum in_create_batch_in_s3and consolidate theitem_submissions.appendcall. - Add a unit test for the synced-batch handle retrieval path; rename
mock_s3_digitized_thesesfixture tomock_s3_digitized_theses_dsc.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| dsc/workflows/digitized_theses/workflow.py | Adds DSpace handle lookup for replacement theses in synced-batch flow; switches to enum-based statuses in _create_batch_in_s3. |
| tests/workflows/digitized_theses/test_workflow.py | Renames fixture and adds a unit test for _get_item_submissions_from_synced_batch. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
@ehanson8 Addressed initial Copilot review! |
ehanson8
left a comment
There was a problem hiding this comment.
Looking good but some structure questions
…nt theses Why these changes are being introduced: * The digitized theses workflow was missing a step to retrieve item handles for replacement theses when creating a batch from a synced batch folder. The item handle is required in the submission message sent to DSS during the `submit` step. How this addresses that need: * Use DSpace client to retrieve item handle for replacement theses Side effects of this change: * None Relevant ticket(s): * https://mitlibraries.atlassian.net/browse/IN-1756
7226f74 to
d650e86
Compare
Purpose and background context
This PR updates the digitized theses workflow to retrieve item handles for replacement theses during the batch creation process when
synced=True(i.e., the batch folder was synced from the DSC S3 bucket in Stage to the DSC S3 bucket in Prod).As shown in
DigitizedTheses.prepare_batch, whensynced=True, the workflow derives a list ofItemSubmissionsbased on the contents of the batch folder synced toS3_BUCKET_SUBMISSION_ASSETSrather than repeating the steps outlined inDigitizedTheses._create_batch_in_s3. Effectively, this means:When the batch was previously prepared and
synced=True, all this information can be derived from the contents of the batch folder (e.g., the thesis type is determined by the subfolder the item submission is located indsc-upload/digitized-theses/batch/<thesis-type>/).However, as noted in this Copilot review,
DigitizedTheses._get_item_submissions_from_synced_batchwas missing a step to retrieve the handle for a DSpace item in the case of a replacement thesis. This PR adds a new code block, which requires using a DSpace client to retrieve theItemobject from DSpace and extract thehandleattribute.Notes:
ItemSubmissionat the beginning so that the method can directly assign attributes to the instance.How can a reviewer manually see the effects of these changes?
Review the added unit test.
Note: While it was straightforward to add a unit test for the
synced=Truepath, creating unit tests forsynced=Falsepath (i.e.,DigitizedTheses._create_batch_in_s3) is harder given all the steps involved. That said, there are unit tests for the smaller sub-methods it relies on + previous testing with MinIO demonstrates the code works as expected. Just continuing to share with reviewers that improving/adding unit tests continue to be on my mind!Includes new or updated dependencies?
NO
Changes expectations for external applications?
NO
What are the relevant tickets?
Code review