+
Skip to content

Add database constraints to prevent duplicate DOI initiator types #6126

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
Jul 8, 2025

Conversation

kathy-t
Copy link
Contributor

@kathy-t kathy-t commented Jun 23, 2025

Description
See related deploy PR https://github.com/dockstore/dockstore-deploy/pull/841

This PR adds a unique database constraint to the entry_concept_doi tables to prevent multiple concept DOIs for an entry from being added to the database. Although we didn't see this for the version DOIs, I added a similar constraint to the version_metadata_doi table so that a version can't have multiple DOIs of the same initiator type. This constraint will prevent our database from storing multiple concept DOIs for the same entry by throwing an error, but it doesn't solve the synchronization problem. See https://github.com/dockstore/dockstore-deploy/pull/841 for this solution.

I also added migrations to clean up the data in the DB that resulted from the multiple concept DOIs. The migrations completely remove the multiple concept DOIs and the version DOIs of the same initiator type belonging to the affected workflow. After this is released, we'll want to run the retrospective DOI script so that the affected workflows get new DOIs generated for them.

Review Instructions
For a GitHub App workflow with no Dockstore concept DOIs, create two tags and push them at the same time. There should be a unique database constraint error and only one concept DOI should be created.

Issue
https://ucsc-cgl.atlassian.net/browse/SEAB-7170

Security and Privacy

If there are any concerns that require extra attention from the security team, highlight them here and check the box when complete.

  • Security and Privacy assessed

e.g. Does this change...

  • Any user data we collect, or data location?
  • Access control, authentication or authorization?
  • Encryption features?

Please make sure that you've checked the following before submitting your pull request. Thanks!

  • Check that you pass the basic style checks and unit tests by running mvn clean install
  • Ensure that the PR targets the correct branch. Check the milestone or fix version of the ticket.
  • Follow the existing JPA patterns for queries, using named parameters, to avoid SQL injection
  • If you are changing dependencies, check the Snyk status check or the dashboard to ensure you are not introducing new high/critical vulnerabilities
  • Assume that inputs to the API can be malicious, and sanitize and/or check for Denial of Service type values, e.g., massive sizes
  • Do not serve user-uploaded binary images through the Dockstore API
  • Ensure that endpoints that only allow privileged access enforce that with the @RolesAllowed annotation
  • Do not create cookies, although this may change in the future
  • If this PR is for a user-facing feature, create and link a documentation ticket for this feature (usually in the same milestone as the linked issue). Style points if you create a documentation PR directly and link that instead.

@kathy-t kathy-t self-assigned this Jun 23, 2025
Copy link

codecov bot commented Jun 23, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 74.18%. Comparing base (5e24f6c) to head (8a05cf9).

Additional details and impacted files
@@                 Coverage Diff                 @@
##             hotfix/1.17.1    #6126      +/-   ##
===================================================
- Coverage            74.22%   74.18%   -0.04%     
+ Complexity            5664     5661       -3     
===================================================
  Files                  389      389              
  Lines                20334    20334              
  Branches              2101     2101              
===================================================
- Hits                 15093    15085       -8     
- Misses                4238     4246       +8     
  Partials              1003     1003              
Flag Coverage Δ
bitbuckettests 25.94% <ø> (ø)
hoverflytests 27.62% <ø> (ø)
integrationtests 56.07% <ø> (ø)
languageparsingtests 10.82% <ø> (ø)
localstacktests 21.33% <ø> (ø)
regressionintegrationtests ?
toolintegrationtests 29.90% <ø> (ø)
unit-tests_and_non-confidential-tests 26.32% <ø> (ø)
workflowintegrationtests 37.36% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@kathy-t kathy-t requested review from denis-yuen and svonworl June 23, 2025 21:18
Copy link
Contributor

@svonworl svonworl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After this PR, looks like the "initiator" DOI value will be stored in two tables (the doi table and the join table), which kinda makes my spider sense a little tingly, because the values could get out of sync etc (https://en.wikipedia.org/wiki/Database_normalization). Is it that way because it's otherwise hard to enforce the "unique initiator" constraint?

@kathy-t
Copy link
Contributor Author

kathy-t commented Jun 24, 2025

After this PR, looks like the "initiator" DOI value will be stored in two tables (the doi table and the join table), which kinda makes my spider sense a little tingly, because the values could get out of sync etc (https://en.wikipedia.org/wiki/Database_normalization). Is it that way because it's otherwise hard to enforce the "unique initiator" constraint?

Yeah it was for the unique initiator constraint. I'll explore other options to see what else can be done instead!

DELETE FROM entry_concept_doi WHERE doiid IN (SELECT * FROM dois_to_delete);
DELETE FROM doi WHERE id IN (SELECT * FROM dois_to_delete);
</sql>
<addUniqueConstraint columnNames="id, initiator" constraintName="unique_doi_id_initiatior" tableName="doi"/>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand the other two constraints, but I'm not sure I understand this one and why it doesn't change the JPA model https://github.com/dockstore/dockstore/blob/d6025c121f39672fa22f9e2956674157a73837b0/dockstore-webservice/src/main/java/io/dockstore/webservice/core/Doi.java like the other two.

@@ -39,7 +39,7 @@

@Entity
@Schema(description = "A Digital Object Identifier (DOI)")
@Table(name = "doi", uniqueConstraints = @UniqueConstraint(name = "unique_doi_name", columnNames = { "name" }))
@Table(name = "doi", uniqueConstraints = { @UniqueConstraint(name = "unique_doi_name", columnNames = { "name" }), @UniqueConstraint(name = "unique_doi_initiator", columnNames = { "id", "initiator" }) })
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need this unique constraint so that we can add (id, initiator) foreign keys in the join table

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see.
My observation was that id is already an @Id so it should be unique already, so this seems a little redundant.
But it sounds like JPA or postgres require this? Sounds a bit like a composite primary key

@@ -67,3 +67,7 @@ CREATE UNIQUE INDEX collection_name_index ON collection (LOWER(name), organizati
CREATE UNIQUE INDEX collection_displayname_index ON collection (LOWER(displayname), organizationid) WHERE NOT deleted;

CREATE UNIQUE INDEX collection_categoryname_index ON collection (LOWER(name)) WHERE dtype = 'Category' AND deleted = false;

-- unable to convert these to JPA properly
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to add these using JPA, but it didn't seem possible

Comment on lines +72 to +73
ALTER TABLE entry_concept_doi ADD CONSTRAINT entry_doi_initiator_fk FOREIGN KEY (doiid, doiinitiator) REFERENCES doi(id, initiator);
ALTER TABLE version_metadata_doi ADD CONSTRAINT version_doi_initiator_fk FOREIGN KEY (doiid, doiinitiator) REFERENCES doi(id, initiator);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These foreign keys enforce that the doiinitiator values in the join table are in sync with the initiator values in the doi table in order to address Steve's feedback

@kathy-t kathy-t requested review from denis-yuen and svonworl July 2, 2025 14:09
@kathy-t
Copy link
Contributor Author

kathy-t commented Jul 2, 2025

The CI test failures seem unrelated to my PR 🤔

@@ -39,7 +39,7 @@

@Entity
@Schema(description = "A Digital Object Identifier (DOI)")
@Table(name = "doi", uniqueConstraints = @UniqueConstraint(name = "unique_doi_name", columnNames = { "name" }))
@Table(name = "doi", uniqueConstraints = { @UniqueConstraint(name = "unique_doi_name", columnNames = { "name" }), @UniqueConstraint(name = "unique_doi_initiator", columnNames = { "id", "initiator" }) })
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see.
My observation was that id is already an @Id so it should be unique already, so this seems a little redundant.
But it sounds like JPA or postgres require this? Sounds a bit like a composite primary key

Copy link

sonarqubecloud bot commented Jul 3, 2025

@kathy-t kathy-t merged commit 153be7f into hotfix/1.17.1 Jul 8, 2025
23 of 24 checks passed
@kathy-t kathy-t deleted the seab-7170/doi-500 branch July 8, 2025 14:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants
点击 这是indexloc提供的php浏览器服务,不要输入任何密码和下载