-
Notifications
You must be signed in to change notification settings - Fork 29
Add database constraints to prevent duplicate DOI initiator types #6126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## hotfix/1.17.1 #6126 +/- ##
===================================================
- Coverage 74.22% 74.18% -0.04%
+ Complexity 5664 5661 -3
===================================================
Files 389 389
Lines 20334 20334
Branches 2101 2101
===================================================
- Hits 15093 15085 -8
- Misses 4238 4246 +8
Partials 1003 1003
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After this PR, looks like the "initiator" DOI value will be stored in two tables (the doi
table and the join table), which kinda makes my spider sense a little tingly, because the values could get out of sync etc (https://en.wikipedia.org/wiki/Database_normalization). Is it that way because it's otherwise hard to enforce the "unique initiator" constraint?
Yeah it was for the unique initiator constraint. I'll explore other options to see what else can be done instead! |
DELETE FROM entry_concept_doi WHERE doiid IN (SELECT * FROM dois_to_delete); | ||
DELETE FROM doi WHERE id IN (SELECT * FROM dois_to_delete); | ||
</sql> | ||
<addUniqueConstraint columnNames="id, initiator" constraintName="unique_doi_id_initiatior" tableName="doi"/> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I understand the other two constraints, but I'm not sure I understand this one and why it doesn't change the JPA model https://github.com/dockstore/dockstore/blob/d6025c121f39672fa22f9e2956674157a73837b0/dockstore-webservice/src/main/java/io/dockstore/webservice/core/Doi.java like the other two.
@@ -39,7 +39,7 @@ | |||
|
|||
@Entity | |||
@Schema(description = "A Digital Object Identifier (DOI)") | |||
@Table(name = "doi", uniqueConstraints = @UniqueConstraint(name = "unique_doi_name", columnNames = { "name" })) | |||
@Table(name = "doi", uniqueConstraints = { @UniqueConstraint(name = "unique_doi_name", columnNames = { "name" }), @UniqueConstraint(name = "unique_doi_initiator", columnNames = { "id", "initiator" }) }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need this unique constraint so that we can add (id, initiator) foreign keys in the join table
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see.
My observation was that id is already an @Id
so it should be unique already, so this seems a little redundant.
But it sounds like JPA or postgres require this? Sounds a bit like a composite primary key
@@ -67,3 +67,7 @@ CREATE UNIQUE INDEX collection_name_index ON collection (LOWER(name), organizati | |||
CREATE UNIQUE INDEX collection_displayname_index ON collection (LOWER(displayname), organizationid) WHERE NOT deleted; | |||
|
|||
CREATE UNIQUE INDEX collection_categoryname_index ON collection (LOWER(name)) WHERE dtype = 'Category' AND deleted = false; | |||
|
|||
-- unable to convert these to JPA properly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried to add these using JPA, but it didn't seem possible
ALTER TABLE entry_concept_doi ADD CONSTRAINT entry_doi_initiator_fk FOREIGN KEY (doiid, doiinitiator) REFERENCES doi(id, initiator); | ||
ALTER TABLE version_metadata_doi ADD CONSTRAINT version_doi_initiator_fk FOREIGN KEY (doiid, doiinitiator) REFERENCES doi(id, initiator); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These foreign keys enforce that the doiinitiator
values in the join table are in sync with the initiator
values in the doi table in order to address Steve's feedback
The CI test failures seem unrelated to my PR 🤔 |
@@ -39,7 +39,7 @@ | |||
|
|||
@Entity | |||
@Schema(description = "A Digital Object Identifier (DOI)") | |||
@Table(name = "doi", uniqueConstraints = @UniqueConstraint(name = "unique_doi_name", columnNames = { "name" })) | |||
@Table(name = "doi", uniqueConstraints = { @UniqueConstraint(name = "unique_doi_name", columnNames = { "name" }), @UniqueConstraint(name = "unique_doi_initiator", columnNames = { "id", "initiator" }) }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see.
My observation was that id is already an @Id
so it should be unique already, so this seems a little redundant.
But it sounds like JPA or postgres require this? Sounds a bit like a composite primary key
|
Description
See related deploy PR https://github.com/dockstore/dockstore-deploy/pull/841
This PR adds a unique database constraint to the
entry_concept_doi
tables to prevent multiple concept DOIs for an entry from being added to the database. Although we didn't see this for the version DOIs, I added a similar constraint to theversion_metadata_doi
table so that a version can't have multiple DOIs of the same initiator type. This constraint will prevent our database from storing multiple concept DOIs for the same entry by throwing an error, but it doesn't solve the synchronization problem. See https://github.com/dockstore/dockstore-deploy/pull/841 for this solution.I also added migrations to clean up the data in the DB that resulted from the multiple concept DOIs. The migrations completely remove the multiple concept DOIs and the version DOIs of the same initiator type belonging to the affected workflow. After this is released, we'll want to run the retrospective DOI script so that the affected workflows get new DOIs generated for them.
Review Instructions
For a GitHub App workflow with no Dockstore concept DOIs, create two tags and push them at the same time. There should be a unique database constraint error and only one concept DOI should be created.
Issue
https://ucsc-cgl.atlassian.net/browse/SEAB-7170
Security and Privacy
If there are any concerns that require extra attention from the security team, highlight them here and check the box when complete.
e.g. Does this change...
Please make sure that you've checked the following before submitting your pull request. Thanks!
mvn clean install
@RolesAllowed
annotation