Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Empty explainers on old deployments #1593

Closed
adriangonz opened this issue Mar 24, 2020 · 1 comment · Fixed by #1604
Closed

Empty explainers on old deployments #1593

adriangonz opened this issue Mar 24, 2020 · 1 comment · Fixed by #1604
Assignees
Labels
Milestone

Comments

@adriangonz
Copy link
Contributor

Context

After merging #1485, the Explainer field on a SeldonDeployment can now be nil. However, previously we used "empty" explainers (i.e. entries with all fields empty) to signal that there was no explainer configured.

This means that, if you are running a version previous to SC 1.1, Kubernetes will be storing your SeldonDeployment resources with an "empty" explainers like the one below:

kind: SeldonDeployment
spec:
 predictors:
   - explainer:
      containerSpec:
        name: ""
        resources: {} 

Issue

The problem comes when you update to SC 1.1. Since it now considers that no explainer is defined by setting a nil pointer on the explainer field, it will consider that all previous SeldonDeployment entries have an explainer configured. Therefore, it will attempt to deploy a mis-configured explainer for all of them:

NAME                                         READY   STATUS             RESTARTS   AGE
mymodel-mymodel-0-696f48c588-2tmgq           2/2     Running            0          59m
mymodel-mymodel-explainer-54cbf97f75-8zhlv   0/1     CrashLoopBackOff   15         59m

Potential fixes

Fix ideas that come to mind are:

  • Using the mutating webhook to change "empty" explainers to nil. I'm not sure if this would apply to existing SeldonDeployments after an update though.
  • Checking on the validating webhook if an explainer is "empty" and raise an error as it's mis-configured. This would cause an error for all existing SeldonDeployments so probably not advised.
  • Going back to the old way of checking if an explainer is not configured. That would mean that the check would look like:
    if p.Explainer != nil && p.Explainer.Type != "" {
      ...
    }
@adriangonz adriangonz added bug triage Needs to be triaged and prioritised accordingly and removed triage Needs to be triaged and prioritised accordingly labels Mar 24, 2020
@adriangonz adriangonz modified the milestone: 1.1 Mar 24, 2020
@ukclivecox
Copy link
Contributor

Its probably worth doing option 3. Its true for a running SeldonDeployment I don't think the mutating webhook would be called again on upgrade.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants