coverbeck
Collaborator

@coverbeck coverbeck commented Nov 14, 2024

Description
Use latestReleaseDate to determine which repos to harvest DOIs for. Also adds logic to avoid hitting Zenodo rate limits. The harvesting flow will work like this:

  1. We harvest all GitHub-Zenodo DOIs once, using the existing updateDOIs endpoint.
  2. When new GitHub releases are created, we are notified via GitHub apps; we store the release date on the workflow.
  3. We will then call this endpoint daily, checking only workflows that have had a release in the last 2 (TBD) days.

This PR adds a query parameter that specifies how far back to look for GitHub-Zenodo DOIs.
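
As a rough illustration of the look-back filter described above, here is a minimal sketch. It is not the code in this PR: the `WorkflowRelease` record, the `releasedWithin` method, and the `daysSinceLastRelease` parameter name are all hypothetical, and the real implementation queries the database through `WorkflowDAO` rather than filtering in memory.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.stream.Collectors;

public class RecentReleaseFilter {

    /** Hypothetical stand-in for a workflow; only the fields this sketch needs. */
    public record WorkflowRelease(String path, Instant latestReleaseDate) {}

    /**
     * Returns the workflows whose latest release falls within the look-back window.
     * Workflows with no recorded release date are skipped.
     */
    public static List<WorkflowRelease> releasedWithin(
            List<WorkflowRelease> workflows, int daysSinceLastRelease, Instant now) {
        // Anything released before the cutoff is outside the window.
        Instant cutoff = now.minus(Duration.ofDays(daysSinceLastRelease));
        return workflows.stream()
                .filter(w -> w.latestReleaseDate() != null)
                .filter(w -> !w.latestReleaseDate().isBefore(cutoff))
                .collect(Collectors.toList());
    }
}
```

Passing `now` explicitly keeps the window calculation deterministic in tests; a daily job would pass `Instant.now()`.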

A companion PR for dockstore-deploy will be created to do step 3 daily.

On the rate limiting: I could not find a library that implements client-side rate limiting based on the response headers. I experimented briefly with https://github.com/bucket4j/bucket4j, but Zenodo's refresh behavior and/or rate limits don't seem to match the documentation exactly, so when I set up a token bucket I would still run into rate limits from Zenodo.
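
To make the header-based approach concrete, here is a minimal sketch of deciding how long to pause before the next call. It is not the `ClientRateLimitHelper` from this PR. The class and method names are hypothetical, and it assumes the server reports `X-RateLimit-Remaining` and `X-RateLimit-Reset` (reset given as epoch seconds); check those names and units against the actual Zenodo responses.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.Map;

public class HeaderRateLimitSketch {

    // Assumed header names; HTTP header names are case-insensitive, so a real
    // client should look them up through its HTTP library, not a plain Map.
    static final String REMAINING = "X-RateLimit-Remaining";
    static final String RESET = "X-RateLimit-Reset";

    /**
     * Returns how long to wait before the next request. Waits only when the
     * server reports no remaining calls; missing or malformed headers mean
     * "do not wait", so the client never blocks on data it cannot interpret.
     */
    public static Duration waitBeforeNextCall(Map<String, String> headers, Instant now) {
        String remaining = headers.get(REMAINING);
        String reset = headers.get(RESET);
        if (remaining == null || reset == null) {
            return Duration.ZERO;
        }
        try {
            if (Long.parseLong(remaining.trim()) > 0) {
                return Duration.ZERO;
            }
            // Assumption: the reset header is an epoch timestamp in seconds.
            Instant resetAt = Instant.ofEpochSecond(Long.parseLong(reset.trim()));
            Duration wait = Duration.between(now, resetAt);
            return wait.isNegative() ? Duration.ZERO : wait;
        } catch (NumberFormatException e) {
            return Duration.ZERO;
        }
    }
}
```

Unlike a token bucket configured from the documented limits, this reacts to what the server actually reports on each response, which sidesteps any mismatch between documented and observed limits.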

Review Instructions

  1. Turn on Zenodo-GitHub integration for a GitHub repo with a workflow in Dockstore.
  2. Do a GitHub release for that repo.
  3. The next day, check that the DOI is visible in Dockstore.

Issue
SEAB-6688

Security and Privacy

If there are any concerns that require extra attention from the security team, highlight them here and check the box when complete.

  • Security and Privacy assessed

e.g. Does this change...

  • Any user data we collect, or data location?
  • Access control, authentication or authorization?
  • Encryption features?

Please make sure that you've checked the following before submitting your pull request. Thanks!

  • Check that you pass the basic style checks and unit tests by running mvn clean install
  • Ensure that the PR targets the correct branch. Check the milestone or fix version of the ticket.
  • Follow the existing JPA patterns for queries, using named parameters, to avoid SQL injection
  • If you are changing dependencies, check the Snyk status check or the dashboard to ensure you are not introducing new high/critical vulnerabilities
  • Assume that inputs to the API can be malicious, and sanitize and/or check for Denial of Service type values, e.g., massive sizes
  • Do not serve user-uploaded binary images through the Dockstore API
  • Ensure that endpoints that only allow privileged access enforce that with the @RolesAllowed annotation
  • Do not create cookies, although this may change in the future
  • If this PR is for a user-facing feature, create and link a documentation ticket for this feature (usually in the same milestone as the linked issue). Style points if you create a documentation PR directly and link that instead.

Charles Overbeck added 3 commits November 13, 2024 13:51
@coverbeck coverbeck self-assigned this Nov 14, 2024
 <groupId>io.dockstore</groupId>
 <artifactId>swagger-java-zenodo-client</artifactId>
-<version>2.1.3</version>
+<version>2.1.4</version>
Collaborator Author

@coverbeck coverbeck Nov 14, 2024

This version includes created/modified dates in the response. See dockstore/swagger-java-zenodo-client#25

codecov bot commented Nov 14, 2024

Codecov Report

Attention: Patch coverage is 72.58065% with 17 lines in your changes missing coverage. Please review.

Project coverage is 74.43%. Comparing base (5fed68d) to head (08e6179).
Report is 5 commits behind head on develop.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...tore/webservice/helpers/ClientRateLimitHelper.java | 71.42% | 6 Missing and 4 partials ⚠️ |
| ...java/io/dockstore/webservice/jdbi/WorkflowDAO.java | 50.00% | 5 Missing ⚠️ |
| ...ckstore/webservice/resources/WorkflowResource.java | 71.42% | 1 Missing and 1 partial ⚠️ |
Additional details and impacted files
@@              Coverage Diff              @@
##             develop    #6035      +/-   ##
=============================================
- Coverage      74.45%   74.43%   -0.02%     
- Complexity      5495     5505      +10     
=============================================
  Files            381      382       +1     
  Lines          19786    19839      +53     
  Branches        2043     2048       +5     
=============================================
+ Hits           14731    14767      +36     
- Misses          4075     4087      +12     
- Partials         980      985       +5     
| Flag | Coverage Δ |
|---|---|
| bitbuckettests | 26.59% <1.61%> (-0.07%) ⬇️ |
| hoverflytests | 27.99% <51.61%> (+0.04%) ⬆️ |
| integrationtests | 56.56% <1.61%> (-0.15%) ⬇️ |
| languageparsingtests | 11.02% <1.61%> (-0.03%) ⬇️ |
| localstacktests | 21.51% <1.61%> (-0.06%) ⬇️ |
| toolintegrationtests | 29.96% <1.61%> (-0.08%) ⬇️ |
| unit-tests_and_non-confidential-tests | 25.85% <38.70%> (+0.06%) ⬆️ |
| workflowintegrationtests | 37.96% <1.61%> (-0.10%) ⬇️ |

Member

@denis-yuen denis-yuen left a comment

some minor questions for code comment

@coverbeck coverbeck requested a review from denis-yuen November 15, 2024 01:03
Quality Gate failed

Failed conditions
51.3% Coverage on New Code (required ≥ 80%)

See analysis details on SonarQube Cloud

@svonworl svonworl self-requested a review November 20, 2024 03:10
@coverbeck coverbeck merged commit a95a5d5 into develop Nov 20, 2024
16 of 19 checks passed
@coverbeck coverbeck deleted the feature/seab-6668/harvest branch November 20, 2024 17:55

4 participants