
Trust k8s readiness signal #12086


Merged 2 commits on Oct 15, 2021

julz (Member) commented Oct 5, 2021

If k8s tells us a pod is ready before we manage to get a result via probing, it makes sense to trust that rather than probing ourselves. This only works when mesh autodetection is off, because otherwise we need the probe to sniff whether mesh is on.

Release Note

When mesh compatibility mode is not set to "auto" in the networking config map,
the activator will respect Kubernetes's readiness state and avoid probing when
Kubernetes readiness propagates more quickly than the activator's probe.

/assign @markusthoemmes

If k8s tells us a pod is ready before we manage to get a result via
probing, it makes sense to trust that rather than waiting for a probe
result. This only works when mesh autodetection is off, because
otherwise we need the probe to sniff whether mesh is on.
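
The decision described above can be sketched in Go roughly as follows. This is a minimal illustration, not the code in pkg/activator/net/revision_backends.go: the `meshMode` type, its constants, and the `splitPods` helper are hypothetical names, and plain slices stand in for the `sets.String` values the activator uses.

```go
package main

import (
	"fmt"
	"sort"
)

// meshMode is a hypothetical stand-in for the mesh compatibility setting
// read from the networking config map.
type meshMode string

const (
	meshAuto     meshMode = "auto"
	meshEnabled  meshMode = "enabled"
	meshDisabled meshMode = "disabled"
)

// splitPods (hypothetical helper) decides which pods can be treated as
// healthy without probing and which still need a probe.
//
// A pod that Kubernetes reports as ready is trusted only when mesh
// autodetection is off; in "auto" mode the probe is still needed to
// sniff whether mesh is on.
func splitPods(mode meshMode, ready, notReady []string, alreadyHealthy map[string]bool) (healthy, toProbe []string) {
	for _, ip := range ready {
		if alreadyHealthy[ip] || mode != meshAuto {
			healthy = append(healthy, ip)
		} else {
			toProbe = append(toProbe, ip)
		}
	}
	// Pods that Kubernetes has not marked ready are probed unless an
	// earlier probe already found them healthy.
	for _, ip := range notReady {
		if alreadyHealthy[ip] {
			healthy = append(healthy, ip)
		} else {
			toProbe = append(toProbe, ip)
		}
	}
	sort.Strings(healthy)
	sort.Strings(toProbe)
	return healthy, toProbe
}

func main() {
	ready := []string{"10.0.0.1:8012"}
	notReady := []string{"10.0.0.2:8012"}
	for _, mode := range []meshMode{meshAuto, meshDisabled} {
		h, p := splitPods(mode, ready, notReady, nil)
		fmt.Printf("mode=%s healthy=%v toProbe=%v\n", mode, h, p)
	}
}
```

With mesh autodetection on, both pods end up in the probe list; with it off, the ready pod is marked healthy immediately and only the not-ready pod is probed.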
google-cla bot added the cla: yes label Oct 5, 2021
knative-prow-robot added the size/M, approved, area/autoscale, and area/networking labels Oct 5, 2021
codecov bot commented Oct 5, 2021

Codecov Report

Merging #12086 (8dbb943) into main (ebeebd9) will increase coverage by 0.03%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main   #12086      +/-   ##
==========================================
+ Coverage   87.44%   87.47%   +0.03%     
==========================================
  Files         196      196              
  Lines        9531     9540       +9     
==========================================
+ Hits         8334     8345      +11     
+ Misses        923      921       -2     
  Partials      274      274              
Impacted Files                                  Coverage Δ
pkg/activator/net/revision_backends.go          91.88% <100.00%> (-0.67%) ⬇️
pkg/reconciler/domainmapping/reconciler.go      92.10% <0.00%> (ø)
pkg/apis/serving/fieldmask.go                   95.01% <0.00%> (+0.01%) ⬆️
pkg/apis/config/features.go                     95.23% <0.00%> (+0.23%) ⬆️
pkg/reconciler/configuration/configuration.go   86.15% <0.00%> (+1.53%) ⬆️
pkg/autoscaler/scaling/multiscaler.go           89.09% <0.00%> (+1.81%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update ebeebd9...8dbb943.

func (rw *revisionWatcher) probePodIPs(ready, notReady sets.String) (succeeded sets.String, noop bool, notMesh bool, err error) {
	dests := ready.Union(notReady)

	// Short circuit case where all the current pods are already known to be healthy.
	if rw.healthyPods.Equal(dests) {
		return rw.healthyPods, true /*no-op*/, false /* notMesh */, nil
	}

	toProbe := sets.NewString()
	healthy := sets.NewString()

Contributor:

nit: Should we prevent this allocation if the meshmode stuff aligns?

julz (Member, Author):

ok, it does mean writing an else, though 😅

// not also using the probe to sniff whether mesh is enabled, we can just
// trust k8s and mark it healthy without probing.
// TODO: replace with ready.Clone() once we have a recent-enough apimachinery (1.23+).
healthy = ready.Union(nil)

Contributor:

Is the copying necessary? If so, why?

julz (Member, Author):

I think so, because we muck with healthy a few lines down, and it's probably pretty unexpected that someone passing a set in to probePodIPs would find that set modified as a side effect (e.g. otherwise the "Failed probing pods" log message would end up logging a misleading set of ready pods).
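
The aliasing concern can be illustrated with a small map-backed set. This sketch assumes a hypothetical `stringSet` type that mimics the relevant behavior of `sets.String` (a Go map, so plain assignment shares the underlying storage, while a union builds a new map); `markHealthyAliased` and `markHealthyCopied` are illustrative names, not functions from the PR.

```go
package main

import "fmt"

// stringSet is a hypothetical, minimal stand-in for a map-backed string set.
type stringSet map[string]struct{}

// newStringSet builds a set from the given items.
func newStringSet(items ...string) stringSet {
	s := stringSet{}
	for _, it := range items {
		s[it] = struct{}{}
	}
	return s
}

// insert adds an item to the set in place.
func (s stringSet) insert(item string) {
	s[item] = struct{}{}
}

// union returns a NEW set holding the items of both sets. Calling it with
// a nil argument therefore yields a copy of the receiver.
func (s stringSet) union(other stringSet) stringSet {
	out := stringSet{}
	for k := range s {
		out[k] = struct{}{}
	}
	for k := range other {
		out[k] = struct{}{}
	}
	return out
}

// markHealthyAliased reuses the caller's map, so the insert below is
// visible to the caller as a side effect.
func markHealthyAliased(ready stringSet, probed string) stringSet {
	healthy := ready // same underlying map as the caller's set
	healthy.insert(probed)
	return healthy
}

// markHealthyCopied copies first, leaving the caller's set untouched.
func markHealthyCopied(ready stringSet, probed string) stringSet {
	healthy := ready.union(nil) // copy via union with an empty set
	healthy.insert(probed)
	return healthy
}

func main() {
	a := newStringSet("pod-a")
	markHealthyAliased(a, "pod-b")
	fmt.Println("aliased: caller's ready set now has", len(a), "entries")

	b := newStringSet("pod-a")
	markHealthyCopied(b, "pod-b")
	fmt.Println("copied: caller's ready set still has", len(b), "entries")
}
```

In the aliased variant, any later log line that prints the caller's ready set would include pods that were only added during probing, which is the misleading output the comment above warns about.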

julz (Member, Author) commented Oct 5, 2021

/assign @psschwei too

knative-prow-robot (Contributor):

@julz: GitHub didn't allow me to assign the following users: too.

Note that only knative members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @psschwei too

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

julz (Member, Author) commented Oct 7, 2021

/retest

psschwei (Contributor) left a comment:

LGTM, but since @markusthoemmes had a question about the copying, will let him stamp it

knative-prow-robot added the lgtm label Oct 7, 2021

psschwei (Contributor) commented Oct 7, 2021

/hold

knative-prow-robot added the do-not-merge/hold label Oct 7, 2021

psschwei (Contributor) commented Oct 7, 2021

(thought I had to explicitly type /lgtm to add the tag; guess selecting "approve" in GitHub will also do that. This may be more evidence for whoever mentioned that prow can confuse newer contributors :) )

julz (Member, Author) commented Oct 11, 2021

ping @markusthoemmes

markusthoemmes (Contributor) left a comment:

/lgtm
/approve
/unhold

knative-prow-robot removed the do-not-merge/hold label Oct 15, 2021

knative-prow-robot (Contributor):

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: julz, markusthoemmes, psschwei

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Labels: approved, area/autoscale, area/networking, cla: yes, lgtm, size/M
4 participants