-
Notifications
You must be signed in to change notification settings - Fork 2.8k
Add request timings and count histograms to telemetry. Closes #3552 #3622
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add request timings and count histograms to telemetry. Closes #3552 #3622
Conversation
|
Deploy preview for hasura-docs ready! Built with commit ee474ac |
|
Review app for commit 5fd892f deployed to Heroku: https://hge-ci-pull-3622.herokuapp.com |
5fd892f to
c8d913f
Compare
|
Review app for commit c8d913f deployed to Heroku: https://hge-ci-pull-3622.herokuapp.com |
c8d913f to
fdb4918
Compare
|
Review app for commit fdb4918 deployed to Heroku: https://hge-ci-pull-3622.herokuapp.com |
fdb4918 to
ee474ac
Compare
|
Review app for commit ee474ac deployed to Heroku: https://hge-ci-pull-3622.herokuapp.com |
ee474ac to
484ad79
Compare
|
Deploy preview for hasura-docs ready! Built with commit 484ad79 |
|
@lexi-lambda @0x777 this is ready for review again |
lexi-lambda
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’ve left a few small comments, but I’m going to defer to @0x777 to review the bulk of this change, since I think he has more context on it than I do, anyway (and I have plenty of other things to look at).
|
Review app for commit 484ad79 deployed to Heroku: https://hge-ci-pull-3622.herokuapp.com |
484ad79 to
94719b2
Compare
|
@lexi-lambda I think this should be ready now. |
94719b2 to
ef79555
Compare
|
Review app for commit ef79555 deployed to Heroku: https://hge-ci-pull-3622.herokuapp.com |
|
|
||
| -- | 'RequestTimings' along with the count | ||
| data RequestTimingsCount = | ||
| RequestTimingsCount { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The current model does not seem to enable the ability to calculate percentiles, which are extremely useful for performance tracking. Would you guys be open to explore an alternative design that would enable calculations beyond totals and averages?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now for the purposes of telemetry we're just collecting counts and timings for a few hand chosen buckets, which should be fine for our purposes, rather than using hdrhistogram or something similar. But I agree with you that percentiles are useful!
|
This should address Alexis's latest requested changes (use DiffTime instead of NominalDiffTime). History got somewhat worse as I didn't want to spend the time to again split up all the changes and squash them individually. I also added a few comments, added a |
0x777
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
|
Review app for commit 2d95118 deployed to Heroku: https://hge-ci-pull-3622.herokuapp.com |
…3552 We upload a set of accumulating timers and counters to track service time for different types of operations, across several dimensions (e.g. did we hit the plan cache, was a remote involved, etc.) Also... Standardize on DiffTime as a standard duration type, and try to use it consistently. See discussion here: hasura#3584 (review) It should be possible to overwrite that module so the new threadDelay sticks per the pattern in hasura#3705 blocked on hasura#3558 Rename the Control.Concurrent.Extended.threadDelay to `sleep` since a naive use with a literal argument would be very bad! We catch a bug in 'computeTimeDiff'. Add convenient 'Read' instances to the time unit utility types. Make 'Second' a newtype to support this.
Add a new 'request_read_time' to the logging record. We also use 'withElapsedTime' to get proper monotonic clock for 'query_execution_time' as well.
2d95118 to
b863dc2
Compare
|
Review app https://hge-ci-pull-3622.herokuapp.com is deleted |
We upload a set of accumulating timers and counters to track service
time for different types of operations, across several dimensions (e.g.
did we hit the plan cache, was a remote involved, etc.)
Description
Affected components
Related Issues
#3552 although metrics should hopefully give us general visibility into performance
Solution and Design
Steps to test and verify
run tests
Limitations, known bugs & workarounds
TODO immortal, but we should fix all of these together (see Audit uses of forkIO #3768 )telemetry.rst; that doc is already stale. It would be nice if we could generate it from testsServer checklist
Catalog upgrade
Does this PR change Hasura Catalog version?
Metadata
Does this PR add a new Metadata feature?
run_sqlauto manages the new metadata through schema diffing?run_sqlauto manages the definitions of metadata on renaming?export_metadata/replace_metadatasupports the new metadata added?GraphQL
Breaking changes
No Breaking changes
There are breaking changes:
Metadata API
Existing
querytypes:argspayload which is not backward compatibleJSONschemaGraphQL API
Schema Generation:
NamedTypeSchema Resolve:-
nullvalue for any input fieldsLogging
JSONschema has changedtypenames have changed