Bug: sidecar is writing env variables before controller creates/modifies pools #2222

Open
pjuarezd opened this issue Jul 17, 2024 · 0 comments
Labels: bug Something isn't working

pjuarezd commented Jul 17, 2024

An undesired side effect of the sidecar shows up when all three of these conditions are met:

  • The Operator controller is down (or failing), so no StatefulSets are being created or modified by the controller
  • A Pool is added to or removed from the Tenant resource
  • Any of the running MinIO pods is restarted or is crash-looping (for whatever reason)

In that scenario the pod does not return to a running state, because the sidecar writes a MINIO_ARGS variable into the env variables file /tmp/minio/config.env that includes the latest Pools change as soon as the Tenant resource changes, regardless of whether the controller has actually created or removed the corresponding StatefulSet. A simplified sketch of this flow is just below.
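
For illustration, a minimal sketch of the problematic flow. The type and function names here are hypothetical stand-ins, not the actual sidecar code; the point is that the watch callback rebuilds MINIO_ARGS straight from the Tenant spec and rewrites the file, with no check that the controller has reconciled the matching StatefulSets:

```go
package sidecar

import (
	"fmt"
	"os"
	"strings"
)

// Hypothetical, trimmed-down stand-ins for the Tenant spec types used by
// the sidecar; the real types live in the operator's API package.
type Pool struct {
	Name             string
	Servers          int32
	VolumesPerServer int32
}

type TenantSpec struct {
	Pools []Pool
}

// onTenantUpdate mimics what the sidecar effectively does today: any change
// to the Tenant spec immediately regenerates config.env, whether or not the
// controller has created or removed the matching StatefulSets.
func onTenantUpdate(tenantName string, spec TenantSpec) error {
	args := make([]string, 0, len(spec.Pools))
	for _, p := range spec.Pools {
		// Roughly: https://myminio-pool-0-{0...3}.myminio-hl...:9000/export{0...1}
		args = append(args, fmt.Sprintf(
			"https://%s-%s-{0...%d}.%s-hl:9000/export{0...%d}",
			tenantName, p.Name, p.Servers-1, tenantName, p.VolumesPerServer-1))
	}
	env := fmt.Sprintf("export MINIO_ARGS=%q\n", strings.Join(args, " "))
	// A pod restarted after this write boots with the new args and disagrees
	// with the still-running pods that were started with the old ones.
	return os.WriteFile("/tmp/minio/config.env", []byte(env), 0o644)
}
```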

MinIO Tenant pods should be able to keep running regardless of whether the controller is healthy and running in the cluster; that was the whole idea behind removing the Operator TLS cert and the Operator webhook for fetching env variables, to make Tenant pods able to keep running standalone.

Steps to reproduce:

  1. Build this branch image (or any image from v5.0.0 onward)
  2. Create a Tenant and wait for it to reach the Initialized state
  3. Add a second pool and wait for the tenant to return to the Initialized state
  4. Scale down the Operator deployment to 0 pods (to simulate the controller being down)
  5. Remove the second pool <- at this point everything is still fine, no problem yet
  6. Delete any pod in the 2 StatefulSets
  • Notice the pod does not come back online
  • Notice the pod's MINIO_ARGS env variable in the file /tmp/minio/config.env contains only one pool, instead of the 2 pools that every other pod in the cluster has
  • Notice the error message in MinIO is about mismatching configuration (the endpoint arithmetic behind those numbers is sketched after the log):
INFO: Following servers have mismatching configuration [https://myminio-pool-0-0.myminio-hl.tenant-lite.svc.cluster.local:9000->https://myminio-pool-0-3.myminio-hl.tenant-lite.svc.cluster.local:9000 has incorrect configuration: Expected number of endpoints 8, seen 16 https://myminio-pool-0-0.myminio-hl.tenant-lite.svc.cluster.local:9000->https://myminio-pool-0-1.myminio-hl.tenant-lite.svc.cluster.local:9000 has incorrect configuration: Expected number of endpoints 8, seen 16 https://myminio-pool-0-0.myminio-hl.tenant-lite.svc.cluster.local:9000->https://myminio-pool-0-2.myminio-hl.tenant-lite.svc.cluster.local:9000 has incorrect configuration: Expected number of endpoints 8, seen 16]
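
For context on those numbers: MinIO counts one endpoint per drive, i.e. servers × volumesPerServer summed over all pools in MINIO_ARGS. Assuming the tenant-lite sizing of 4 servers × 2 volumes per pool, a pod restarted with only one pool in its config expects 8 endpoints, while its peers, still running with both pools, report 16. A quick sketch of that arithmetic (reusing the hypothetical Pool type from above):

```go
package sidecar

// endpointCount mirrors how MinIO tallies endpoints from MINIO_ARGS:
// each pool contributes servers * volumesPerServer endpoints.
func endpointCount(pools []Pool) int32 {
	var total int32
	for _, p := range pools {
		total += p.Servers * p.VolumesPerServer
	}
	return total
}

// Assuming the tenant-lite sizing of 4 servers x 2 volumes per pool:
//   endpointCount(one pool)  == 8   <- what the restarted pod expects
//   endpointCount(two pools) == 16  <- what the still-running pods report
```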

Possible solution

Somehow consider the state of the pools in the Tenant status, or add a new field to the Tenant status to tell the sidecar when it is OK to regenerate env variables. @jiuker 's PR (here) is an approximately similar solution; it just misses the part where the sidecar waits for the "OK" before regenerating env variables. A rough sketch of the idea is below.
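
One possible shape for that gate, sketched with illustrative field and state names (the actual ones would be defined in the Tenant CRD): the controller publishes a per-pool state in the Tenant status, and the sidecar only regenerates config.env once the status has caught up with the spec.

```go
package sidecar

// Hypothetical per-pool state published by the controller in the Tenant
// status; the real field and state names would come from the Tenant CRD.
type PoolStatus struct {
	Name  string
	State string // e.g. "PoolCreated", "PoolInitialized"
}

type TenantStatus struct {
	Pools []PoolStatus
}

// okToRegenerate reports whether the controller has caught up with the spec:
// every pool in the spec is acknowledged in status, and no extra (for example
// just-removed) pool is still listed there. The sidecar would only rewrite
// /tmp/minio/config.env when this returns true, and otherwise keep serving
// the last-known-good file.
func okToRegenerate(spec TenantSpec, status TenantStatus) bool {
	if len(spec.Pools) != len(status.Pools) {
		return false
	}
	acked := map[string]bool{}
	for _, ps := range status.Pools {
		if ps.State == "PoolCreated" || ps.State == "PoolInitialized" {
			acked[ps.Name] = true
		}
	}
	for _, p := range spec.Pools {
		if !acked[p.Name] {
			return false
		}
	}
	return true
}
```

The sidecar would then call something like okToRegenerate(tenant.Spec, tenant.Status) before rewriting /tmp/minio/config.env, and keep the last-known-good file untouched while the controller is down or still reconciling.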

pjuarezd added the bug label on Jul 17, 2024