README for new MT Scheduler with pluggable policies #888
Conversation
/cc @lionelvillard
Codecov Report

```
@@           Coverage Diff           @@
##             main     #888   +/-   ##
=======================================
  Coverage   75.01%   75.01%
=======================================
  Files         152      152
  Lines        7080     7080
=======================================
  Hits         5311     5311
  Misses       1485     1485
  Partials      284      284
```

Continue to review the full report at Codecov.
@aavarghese can you fix the linter errors? thx!
Thanks, I love this document!
pkg/common/scheduler/README.md (Outdated)
1. **Pod failure**:
When a pod/replica in a StatefulSet goes down for some reason (but its node and zone are healthy), a new replica with the same pod identity is spun up by the StatefulSet almost immediately (the pod can come up on a different node).
All existing vreplica placements will still be valid and no rebalancing is needed.
There shouldn't be any degradation in Kafka message processing.
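For background, the stable identity mentioned above comes from the StatefulSet controller: replica N is always named `<statefulset-name>-N`, even after the pod is deleted and recreated. A minimal sketch (all names and the image are hypothetical, not taken from this PR):

```yaml
# Hypothetical StatefulSet illustrating stable pod identity.
# Replica N is always named dispatcher-N, even after the pod is
# deleted and recreated, so vreplica placements keyed by pod name
# remain valid across a pod restart.
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: dispatcher          # hypothetical adapter StatefulSet
spec:
  serviceName: dispatcher   # headless service providing stable network IDs
  replicas: 3
  selector:
    matchLabels:
      app: dispatcher
  template:
    metadata:
      labels:
        app: dispatcher
    spec:
      containers:
        - name: adapter
          image: example.com/adapter:latest  # placeholder image
```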
This is not really true: a consumer group rebalance could degrade message processing, especially when the Kafka Consumer Incremental Rebalance Protocol is not being used (which, afaik, is not implemented in Sarama).
@pierDipi the pod set being referred to here covers only the eventing scheduler adapter pods where vreplicas are placed. Since the pod will restart with the same identity, the same placements can be kept without rebalancing the vreps.
I agree with you about consumer group rebalancing and degradation, but that may or may not happen here, depending on whether the Kafka pods are affected as well.
I hope I'm not missing anything...
> @pierDipi the pod set being referred to here is only talking about the eventing scheduler adapter pods where vreplicas are placed. Since pod will restart, the same placements can be kept without a rebalancing of the vreps.

This is what "All existing vreplica placements will still be valid and no rebalancing is needed." is saying; I agree, and it's clear to me why.

I was referring to "There shouldn't be any degradation in Kafka message processing."

> I agree with you about the consumer group rebalancing and degradation but that may/may not happen here if the kafka pods are affected, as well.
> I hope I'm not missing anything...
so, are you saying that if a pod where vreplicas are placed goes down that won't trigger a consumer group rebalance that affects message processing?
In the worst-case scenario, I'd expect something like this to happen (happy to be wrong):
- Pod goes down
- A new pod comes up (same name)
- Kafka broker sees a new consumer that wants to join the group -> rebalance
- Kafka detects that the consumer that was consuming messages in the dead pod (1) is not sending heartbeats anymore -> rebalance (again)
At least one rebalance happens; two in the worst case, since terminationGracePeriodSeconds = 0 is less than the time Kafka needs to detect that a consumer is dead.
Is the above not possible? If yes, does that count as a degradation for Kafka message processing?
You're right. This is absolutely possible.
I made an assumption that the same sticky pod (when restarted) would have the same consumer member ID and, using static membership, would get the same assignment.
I don't have any numbers to quantify the extent of degradation for these recovery scenarios; we'll need to run some performance tests to measure latency. Thank you for catching this @pierDipi !!
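For context, Kafka's static group membership (KIP-345) is what would make that assumption hold: a consumer that rejoins with the same `group.instance.id` within the session timeout keeps its assignment without triggering a rebalance. A hedged sketch of the client-side configuration (the property names are standard Kafka consumer configs; the values are purely illustrative and not taken from this PR):

```properties
# Static membership: the broker identifies this consumer by a stable ID,
# so a restart that rejoins within session.timeout.ms does not trigger
# a group rebalance (requires brokers >= 2.3).
group.id=my-consumer-group       # illustrative group name
group.instance.id=dispatcher-0   # stable per-pod ID, e.g. the StatefulSet pod name
session.timeout.ms=30000         # the restart must complete within this window
```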
Signed-off-by: aavarghese <avarghese@us.ibm.com>
Force-pushed from 3c4da3d to b49ece3.
/approve
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: aavarghese, lionelvillard The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Signed-off-by: aavarghese <avarghese@us.ibm.com>
Continuation of #768
Proposed Changes
Release Note
Docs