-
Notifications
You must be signed in to change notification settings - Fork 1.2k
Probe properly after startup when periodSeconds greater than zero #11190
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov Report
@@ Coverage Diff @@
## main #11190 +/- ##
==========================================
- Coverage 87.81% 87.80% -0.01%
==========================================
Files 196 196
Lines 9393 9391 -2
==========================================
- Hits 8248 8246 -2
- Misses 889 890 +1
+ Partials 256 255 -1
Continue to review full report at Codecov.
|
hm. looks unrelated, but also new :/ /test pull-knative-serving-istio-stable-no-mesh |
hmmm happened again, but this doesn't change anything for the probePeriod=0 case, which the failing test is :/ /retest |
🎉 /assign @markusthoemmes @vagababov |
Looks good! I'm contemplating if we should have some E2E test for the behavior as our QP-probe-interaction seems kinda brittle. Would it make sense to write a test where a pod initially becomes ready and then subsequently stops serving and thus becomes unready? Might be useful to also make sure the changes you've planned remain sound? |
Yeah good point @markusthoemmes: we're way lighter on e2e coverage for this stuff than we should be, especially given the number of moving parts involved. I actually have a test image from initially investigating this that should be fairly easy to use to improve the e2e coverage, added to my list to do |
isAggressive: false, | ||
wantStatus: http.StatusOK, | ||
wantBody: queue.Name, | ||
name: "alive: false, no prober, K-Probe", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can drop these K-Probe
mentions anymore - since I believe those were for the aggressive state
This Pull Request is stale because it has been open for 90 days with |
Rebased on #11814 and (since this makes it redundant) dropped periodSeconds>0 case (both cases now act the same \o/). |
bump for another look @markusthoemmes @dprotaso |
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: dprotaso, julz The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
…ative#11190) * Probe properly after startup when periodSeconds greater than zero * Update comformance test for fixed behaviour * Remove redundant K-Probe since this is no longer aggressive-probe specific
Fixes #10764.
WIP bc letting prow run the tests for me.Aside: the obvious question is why we initially broke this. Reading through past github history I think the original intention may have been to only use the QP probing path for the aggressive probing case, and use a direct RP on the user container otherwise (in which case it would be safe to make the QP path no-op in that circumstance). Unfortunately it looks like we made the QP path a no-op when probePeriod>0, but never actually added a probe against the user container in this case.
I think moving back the readinessProbe to be directly against the user container when probePeriod>0 is a good idea (especially it will be when we get sub-second probePeriods so both startup and readiness could be on user container..), but that's going to take a bit more work and this is a small step that makes things better. (I don't think, now that we have startup and readiness split, that even when we move the probe to user container it will make sense to use isAggressive to determine whether to cache the probe: we just won't ever probe user container via QP in this case).
Release Note