Description
Describe the solution you'd like
We would like the trident operator to upgrade the Trident controller plugin without downtime.
Similar to #740, the trident operator deletes the Deployment for the Trident controller plugin when updating the Trident version. This makes all Trident functionality unavailable until the new controller pod becomes ready.
Furthermore, the Deployment for the Trident controller plugin has only one replica, and its update strategy is Recreate. So even if the trident operator stopped deleting the Deployment, the deployment controller would not create a new controller pod until the old pod is deleted; if the old pod cannot be deleted, all Trident functionality stops working.
Because pods that cannot be deleted (stuck in the Terminating state) are a common problem in Kubernetes, we would like the Trident controller plugin to run with multiple replicas and leader election.
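For reference, a minimal sketch of what such a controller Deployment spec might look like. This is illustrative only: the flag name `--enable-leader-election` and the container name `trident-main` are assumptions, not existing Trident options.

```yaml
# Hypothetical sketch of the requested change, not an actual Trident manifest.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: trident-csi
  namespace: trident
spec:
  replicas: 2                  # more than one controller pod
  strategy:
    type: RollingUpdate        # instead of the current Recreate strategy
    rollingUpdate:
      maxUnavailable: 0        # keep at least one ready pod during updates
      maxSurge: 1
  template:
    spec:
      containers:
        - name: trident-main   # illustrative container name
          args:
            - --enable-leader-election   # hypothetical flag; only the elected leader serves requests
```

With a spec like this, a pod stuck in Terminating would not block the rollout: the surge pod comes up alongside it, and leader election ensures only one replica acts at a time.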
Describe alternatives you've considered
none
Additional context
This situation can be reproduced with the following steps.
- Deploy the trident operator v22.01.1 with the TridentOrchestrator object.
- Wait until all trident pods become ready.
- Set a dummy finalizer to the Trident controller pod.
- e.g.
kubectl patch -n trident -p '{"metadata":{"finalizers": ["example.com/dummy"]}}' "$(kubectl get pods -n trident -l app=controller.csi.trident.netapp.io -o name | head -1)"
- This step simulates a controller plugin pod that cannot be deleted.
- Update the trident operator and the TridentOrchestrator object to v22.04.0.
- There will be no healthy controller pod, which means none of the Trident functionality works.
$ kubectl get pods -n trident -l app=controller.csi.trident.netapp.io
NAME READY STATUS RESTARTS AGE
trident-csi-ccc5cdd56-hkppj 0/6 Terminating 0 6m5s