feat!(website, backend): SeqSet citation tracking #6304
This PR may be related to: #1501 (Check Crossref via API to see what papers citing DOIs)
…d exposing doiPrefix for use outside of the service
…, added periodic task to update citations with crossref results
@@ -0,0 +1,26 @@
create table seqset_citation_source (
    citation_source_id bigserial,
    source_doi text not null unique,
I might have asked this before, sorry for forgetting: I thought that not all citation sources would have a DOI, and that we still wanted to be able to add them here. If so, I think this column would need to allow null?
maverbiest left a comment
I was finally able to go through the whole PR, apologies that it took so long!
Looks great overall, no major blockers on my side. I think we could make CrossRefServiceTest.kt a bit cleaner with a parametrized test. I would also like to try out the front-end changes on a preview or locally. Do you have any tips for setting this up so that there are some test SeqSets to play around with?
anna-parker left a comment
Hey Tom! Sorry, I only finished looking through this today; it's a bit hard to test without CrossRef examples. I added a few comments on the backend, but it would be great to put this on staging on Monday and then take a closer look. Let me know if you need any help :-)
@codex review
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: e2fb4a4bda
Summary
Breaking Changes
- Removed the `get-seqset-cited-by-publication` endpoint, as it is replaced with the `get-seqset-citations` endpoint. The former endpoint used the `CitedBy` return type, and the new one uses `List<SeqSetCitation>`.
Schema Changes
- Added the `seqset_citation_source` table: a 'citation source' is any publication (or possibly other item) that references one or more SeqSets. A citation source must have a DOI, title, year and contributors, and its origin is either Crossref or manual curation (manual curation endpoints will be added in a follow-up PR).
- Added the `seqset_to_citation_source` table for database joins between citation sources and the SeqSets they cite.
Backend Changes
- … `crossref_result` tag, an `IllegalStateException` is thrown. If individual `forward_links` fail validation (missing DOIs/titles/years), these are added as `CrossRefValidationError` objects and logged in the task output. Validated citations are then merged into individual citation sources, conflicts are logged, and the results are upserted into the database. If a curated citation source exists for a DOI found on Crossref, it is updated with the results from Crossref.
- Added `/get-seqset-citations`: for a SeqSet ID and version, retrieves the citations of the SeqSet and returns the source DOI, title, year and contributors of each.
- Removed the `get-seqset-cited-by-publication` endpoint, as it is now redundant.
Website Changes
- Added a View Citations modal to the SeqSet details page, which lists citations for the SeqSet.
- … `get-seqset-citations` endpoint.
Additional website changes
- … `BaseDialog` component for consistency.
Screenshot
PR Checklist
🚀 Preview: https://seqset-citations.loculus.org
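
For reviewers who want a concrete picture of the validate-then-merge step described under Backend Changes, here is a minimal Kotlin sketch. It is not the code from this PR: `ForwardLink`, `CitationSource`, `ValidationError`, `MergeResult` and `validateAndMerge` are hypothetical names, and the choices to compare DOIs case-insensitively and to keep the first occurrence on conflict are assumptions made only for illustration.

```kotlin
// Hypothetical shapes for illustration; the real classes in the PR may differ.
data class ForwardLink(
    val doi: String?,
    val title: String?,
    val year: Int?,
    val contributors: List<String>,
)

data class CitationSource(
    val doi: String,
    val title: String,
    val year: Int,
    val contributors: List<String>,
)

// Stand-in for the PR's CrossRefValidationError, whose actual fields are not shown here.
data class ValidationError(val doi: String?, val reason: String)

data class MergeResult(
    val sources: List<CitationSource>,
    val errors: List<ValidationError>,
    val conflicts: List<String>,
)

fun validateAndMerge(links: List<ForwardLink>): MergeResult {
    val errors = mutableListOf<ValidationError>()
    val conflicts = mutableListOf<String>()
    val byDoi = linkedMapOf<String, CitationSource>()

    for (link in links) {
        // Assumption: DOIs are normalised to lower case before comparison.
        val doi = link.doi?.trim()?.lowercase()
        val title = link.title?.trim()
        val year = link.year

        // Invalid links are recorded and skipped rather than aborting the run.
        if (doi.isNullOrEmpty()) {
            errors += ValidationError(link.doi, "missing DOI")
            continue
        }
        if (title.isNullOrEmpty()) {
            errors += ValidationError(link.doi, "missing title")
            continue
        }
        if (year == null) {
            errors += ValidationError(link.doi, "missing year")
            continue
        }

        val candidate = CitationSource(doi, title, year, link.contributors)
        val existing = byDoi[doi]
        if (existing == null) {
            byDoi[doi] = candidate
        } else if (existing != candidate) {
            // Assumption: the first occurrence wins and the conflict is only logged.
            conflicts += "conflicting metadata for $doi; keeping first occurrence"
        }
    }
    return MergeResult(byDoi.values.toList(), errors, conflicts)
}
```

In this sketch the caller would log `errors` and `conflicts` in the task output and upsert `sources`, which mirrors the order of steps in the description above.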