Handle ConfigMap created by workload container during application remove #120

Open
i-chvets opened this issue Mar 23, 2023 · 1 comment
Labels
bug Something isn't working

i-chvets commented Mar 23, 2023

Description

Handle ConfigMap created by workload container during application remove

The seldon-core workload container creates a ConfigMap to track its leader election:

kubectl -n <namespace> get configmap a33bd623.machinelearning.seldon.io -o=yaml
apiVersion: v1
kind: ConfigMap
metadata:
  annotations:
    control-plane.alpha.kubernetes.io/leader: '{"holderIdentity":"seldon-controller-manager-5ff5b59788-r9c5b_d5372a80-cf0b-49ce-88d6-4c33a1096f28","leaseDurationSeconds":15,"acquireTime":"2023-03-23T18:52:51Z","renewTime":"2023-03-23T18:54:15Z","leaderTransitions":1}'
  creationTimestamp: "2023-03-23T18:49:51Z"
  labels:
    app.juju.is/created-by: seldon-controller-manager
  name: a33bd623.machinelearning.seldon.io
  namespace: kf
  resourceVersion: "4439"
  uid: 7dfa723e-5286-435b-a633-84a25f340506

This ConfigMap has an expiration time of 45 seconds.

The problem was first detected while testing an upgrade: deploying the stable charm and then upgrading to the updated one within 45 seconds failed, because the container could not acquire the lock on the ConfigMap above:

 error retrieving resource lock kf/a33bd623.machinelearning.seldon.io

If the upgrade is performed outside the expiration window, it succeeds.
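The timing dependency can be checked from the leader annotation itself. The following is a minimal sketch, not the charm's actual code: the helper names `parse_leader_record` and `lock_is_stale` are hypothetical, and the 45-second default is taken from this issue (the annotation above records `leaseDurationSeconds: 15`, so treat the window as an assumption to tune). It only compares against the recorded `renewTime`, which approximates how the election logic decides that a lock has expired.

```python
import json
from datetime import datetime, timedelta, timezone

LEADER_ANNOTATION = "control-plane.alpha.kubernetes.io/leader"


def parse_leader_record(annotations: dict) -> dict:
    """Decode the JSON leader-election record stored in the ConfigMap annotation."""
    return json.loads(annotations[LEADER_ANNOTATION])


def lock_is_stale(record: dict, now: datetime, window_seconds: int = 45) -> bool:
    """Return True once window_seconds have passed since the last renewal.

    window_seconds defaults to the 45 seconds reported in the issue (assumption).
    """
    renew_time = datetime.strptime(
        record["renewTime"], "%Y-%m-%dT%H:%M:%SZ"
    ).replace(tzinfo=timezone.utc)
    return now - renew_time >= timedelta(seconds=window_seconds)


# Annotation value copied from the ConfigMap shown in the issue.
annotations = {
    LEADER_ANNOTATION: (
        '{"holderIdentity":"seldon-controller-manager-5ff5b59788-r9c5b_'
        'd5372a80-cf0b-49ce-88d6-4c33a1096f28","leaseDurationSeconds":15,'
        '"acquireTime":"2023-03-23T18:52:51Z",'
        '"renewTime":"2023-03-23T18:54:15Z","leaderTransitions":1}'
    )
}
record = parse_leader_record(annotations)
renewed = datetime(2023, 3, 23, 18, 54, 15, tzinfo=timezone.utc)

print(lock_is_stale(record, renewed + timedelta(seconds=10)))  # inside the window
print(lock_is_stale(record, renewed + timedelta(seconds=60)))  # outside the window
```

An upgrade test could poll such a check before starting the new container, although that only works around the stale lock rather than removing it.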

On application removal, this ConfigMap (a33bd623.machinelearning.seldon.io) is not removed. It should be removed.
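One possible shape for the cleanup is sketched below, assuming `kubectl` is available where the cleanup runs. The functions `build_delete_command` and `remove_leader_configmap` are hypothetical names for illustration; a charm might well use a Kubernetes client library instead of shelling out. `--ignore-not-found` makes the call safe when the ConfigMap has already been deleted.

```python
import subprocess

# Name taken from this issue; seldon-core derives it from leaderElectionID.
LEADER_CONFIGMAP = "a33bd623.machinelearning.seldon.io"


def build_delete_command(namespace: str, name: str = LEADER_CONFIGMAP) -> list:
    """Build the kubectl invocation that deletes the leader-election ConfigMap."""
    return [
        "kubectl", "-n", namespace,
        "delete", "configmap", name,
        "--ignore-not-found",  # do not fail if the ConfigMap is already gone
    ]


def remove_leader_configmap(namespace: str) -> None:
    """Hypothetical cleanup step, intended to be called from the remove handler."""
    subprocess.run(build_delete_command(namespace), check=True)


# Show the command that would run for the namespace used in the issue.
print(" ".join(build_delete_command("kf")))
```

Calling `remove_leader_configmap("kf")` during application removal would delete the leftover lock, so a later redeploy does not have to wait for it to expire.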

@i-chvets

Seldon-core hardcodes the name of that ConfigMap:

seldon-core$ grep a33bd623  * -rn
helm-charts/seldon-core-operator/values.yaml:88:  leaderElectionID: a33bd623.machinelearning.seldon.io
helm-charts/seldon-core-operator/README.md:79:| manager.leaderElectionID | string | `"a33bd623.machinelearning.seldon.io"` |  |
operator/bundle/manifests/seldon-operator.clusterserviceversion.yaml:450:                  value: a33bd623.machinelearning.seldon.io
operator/bundle-certified/manifests/seldon-operator-certified.clusterserviceversion.yaml:450:                  value: a33bd623.machinelearning.seldon.io
operator/config/manager/manager.yaml:47:          value: "a33bd623.machinelearning.seldon.io"
operator/main.go:57:	leaderElectionIDDefault = "a33bd623.machinelearning.seldon.io"

i-chvets added the bug label Mar 23, 2023