After handling status update, reset update timer with correct duration #789

tghartland · 2019-11-04T09:55:46Z

If the ping timer is being used, it should be reset with the ping update
interval. If the status update interval is used instead, Ping stops being
called for long enough that Kubernetes marks the node as NotReady.

Fixes #788

cpuguy83

Any chance you can write a test to show the bug and the fix?

cpuguy83 · 2019-11-10T17:14:45Z

node/node.go

@@ -259,10 +259,13 @@ func (n *NodeController) controlLoop(ctx context.Context) error {
 			return nil
 		case updated := <-n.chStatusUpdate:
 			var t *time.Timer
+			var resetDuration time.Duration


Maybe we can set this before we start the loop?

I've made this change in the new commit, please take a look.

Personally I think it was more obvious why the reset duration had to be changed when it was bundled with this logic:

virtual-kubelet/node/node.go, lines 262 to 266 in ba940a9:

    if n.disableLease {
        t = pingTimer
    } else {
        t = statusTimer
    }

Maybe selecting the active timer could be moved out of the loop as well?

tghartland · 2019-11-11T13:16:34Z

@cpuguy83 I've pushed a new commit with a test that shows this.

Without the fix (ping timer is reset to status timer interval):

--- FAIL: TestPingAfterStatusUpdate (0.20s)
    node_test.go:377: assertion failed: expression is false: testP.maxPingInterval < maxAllowedInterval: maximum time between node pings (63.134795ms) was greater than the maximum expected interval (25ms)
time="2019-11-11T14:06:04+01:00" level=debug msg="got node from api server"
FAIL

With the fix:

--- PASS: TestPingAfterStatusUpdate (0.20s)
PASS


cpuguy83

LGTM

Looking at this code again, I wonder if we should be resetting the ping timer at all for a node status update.
In any case this looks like a correct fix to the current situation.

Thanks!

cpuguy83 requested changes Nov 10, 2019


tghartland force-pushed the fix-notify-status-788 branch from 8180607 to fb6e9c6 Compare November 11, 2019 13:07

tghartland force-pushed the fix-notify-status-788 branch from fb6e9c6 to 8e79bb5 Compare November 11, 2019 13:23

tghartland added 2 commits November 11, 2019 14:29

Add test for node ping interval

3783a39

After handling status update, reset update timer with correct duration

c258614


tghartland force-pushed the fix-notify-status-788 branch from 8e79bb5 to c258614 Compare November 11, 2019 13:30

cpuguy83 approved these changes Nov 12, 2019


cpuguy83 merged commit 1a9c4bf into virtual-kubelet:master Nov 12, 2019

cpuguy83 modified the milestones: 1.2.1, 1.1.1 Nov 12, 2019

cpuguy83 added the status/cherry-pick To be backported to older releases label Nov 12, 2019
