Description
Hi. On January 16th we performed a Kubernetes upgrade. This involved rolling the Kubernetes nodes in a fixed order: the NATS nodes/pods rolled first, then the NATS Streaming pods, with the two followers first and the leader last.
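For illustration, the followers-first, leader-last ordering described above can be sketched as below. This is only a sketch under assumptions: the `pod` type, the `rollOrder` helper, and the choice of which pod held leadership before the roll are all hypothetical and are not taken from our actual upgrade tooling.

```go
package main

import "fmt"

// pod pairs a pod name with its Raft role at the time of the roll.
// Hypothetical type, used only for this illustration.
type pod struct {
	name   string
	leader bool
}

// rollOrder returns the pod names with all followers first (in their
// original order) and the leader last.
func rollOrder(pods []pod) []string {
	order := make([]string, 0, len(pods))
	var leaders []string
	for _, p := range pods {
		if p.leader {
			leaders = append(leaders, p.name)
		} else {
			order = append(order, p.name)
		}
	}
	return append(order, leaders...)
}

func main() {
	// Assumption: nats-streaming-1 is the leader before the roll.
	pods := []pod{
		{name: "nats-streaming-0", leader: false},
		{name: "nats-streaming-1", leader: true},
		{name: "nats-streaming-2", leader: false},
	}
	fmt.Println(rollOrder(pods))
	// prints: [nats-streaming-0 nats-streaming-2 nats-streaming-1]
}
```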
I see the following sequence:
1. nats-streaming-0 becomes the new leader at 17:41:12.
2. nats-streaming-2 restores two channels at 17:43:26.
3. Immediately after the restoration, nats-streaming-2 hits a runtime error: "panic: runtime error: invalid memory address or nil pointer dereference".
4. nats-streaming-2 restarts but very quickly shuts down due to "STREAM: Failed to start: log not found".
5. nats-streaming-2 restarts again but shuts down again for the same reason.
6. The same thing happens one more time.
7. nats-streaming-2 starts up again 1.75 hours later (no idea why there was such a delay). NO restoring is performed, even though some channels were restored in steps 4, 5 and 6.
8. Yesterday we rolled the nats-streaming-2 pod, but again no channels were restored.
Is there a way to mitigate or correct this so that nats-streaming-2 restores the channels? I worry that if there is a subsequent leader election, nats-streaming-2 may become the leader, with potentially bad results.
Log file from nats-streaming-2: