-
Notifications
You must be signed in to change notification settings - Fork 2.8k
server: add a 30 minute timeout for event trigger locks #5473
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Review app for commit fe70d9f deployed to Heroku: https://hge-ci-pull-5473.herokuapp.com |
jberryman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Co-authored-by: Brandon Simmons <brandon.m.simmons@gmail.com>
|
Review app for commit ca190ef deployed to Heroku: https://hge-ci-pull-5473.herokuapp.com |
|
@paf31 IIUC, we can now get rid of this? graphql-engine/server/src-lib/Hasura/App.hs Line 371 in df51a8e
|
server/src-rsr/initialise.sql
Outdated
| created_at TIMESTAMP DEFAULT NOW(), | ||
| locked BOOLEAN NOT NULL DEFAULT FALSE, | ||
| /* when locked IS NULL the event is unlocked and can be processed */ | ||
| locked TIMESTAMP, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this will affect the behaviour, but I think it will be a good idea to use TIMESTAMPZ instead of TIMESTAMP because now() returns a TIMESTAMPZ and there may be issues when comparing a TIMESTAMPZ value with a TIMESTAMP value
That code unlocks all events if I understand correctly. I think it might be good to keep it, so that we have a way to quickly unlock everything without having to wait 30 minutes. |
|
Review app for commit c93a00b deployed to Heroku: https://hge-ci-pull-5473.herokuapp.com |
|
Review app for commit 8cc548a deployed to Heroku: https://hge-ci-pull-5473.herokuapp.com |
|
Review app for commit ab73923 deployed to Heroku: https://hge-ci-pull-5473.herokuapp.com |
We are tracking events which are in-memory and unlocking them during graceful shutdown, right? graphql-engine/server/src-lib/Hasura/App.hs Line 462 in 2c397f9
Shouldn't this cover most cases? Otherwise, I am not really sure what is the ROI of changing the locked column from bool to timestamp. We are managing the same amount of state as before? |
|
Mm, yeah I agree we shouldn't be doing While this is still open, do we want to think more about the timeout for (re)delivery? Maybe it should be much shorter or be relative to the http client timeout value we use? Just a detail. |
|
Review app for commit 5dee537 deployed to Heroku: https://hge-ci-pull-5473.herokuapp.com |
Definitely. I wasn't sure about this so I just did the most naive thing possible to start with. Any ideas? |
…-engine into event-trigger-lock-timeout
I'm not sure, and setting it too low is the real danger (sort of slow thundering herd situation... lumbering herd). Seems good for now |
|
Review app for commit b3856d8 deployed to Heroku: https://hge-ci-pull-5473.herokuapp.com |
|
(sorry if this was awaiting another approval by me) |
Sorry, it may have, but I'm also being a little hesitant to merge this, since it is a little difficult to test and be sure that it is the best change, for arguably little benefit. Since the original problem is (I think) now addressed, I'm thinking we can hold off until I can write some tests for this. What do you think? |
|
Review app for commit 4cc7394 deployed to Heroku: https://hge-ci-pull-5473.herokuapp.com |
|
I have marked this for v1.4 release since we get a beta period to test this change. |
codingkarthik
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
(FWIW if you want to try using kodiak for auto-merge/update you can add the |
This gives event trigger locks a 30 minute timeout. In case the graceful shutdown of a server fails, this will automatically unlock in-process rows after 30 minutes at most.
This PR is a work in progress and I haven't had chance to test it extensively yet, but we can use this as a place to discuss the approach.
Changelog
CHANGELOG.mdis updated with user-facing content relevant to this PR. If no changelog is required, then add theno-changelog-requiredlabel.Affected components
Related Issues
Solution and Design
Steps to test and verify
Limitations, known bugs & workarounds
Server checklist
Catalog upgrade
Does this PR change Hasura Catalog version?
Metadata
Does this PR add a new Metadata feature?
GraphQL
Breaking changes
This changes the type of the
lockedcolumn to be atimestamp nullinstead of aboolean not null.