Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add backwards compatible CRD versioning #240

Merged

Conversation

asymmetric
Copy link
Contributor

This PR does several things.

CustomVersion

There is currently no way to run multiple versions of a CRD. The CustomVersion field works around this by "overloading" the Habitat type.

All controllers receive all Habitat objects, but then decide which ones to skip based on the value (and presence) of the CustomVersion field.

Code generator

This PR also reorganizes the pkg tree to match the expectations of the code-generator. The code-generator is used to generate client, listers and watchers.

In the future, we will be able to maintain multiple versions more easily thanks to this structure.

The use of the code-generator also implied changes in the controller, because the structure of the client, listers and watchers has changed.

Backwards compatibility

The new field allows us to maintain backwards-compatibility with clusters that are already running Deployment-backed Habitat objects.

Closes #215.
Closes #216.

Lorenzo Manacorda and others added 30 commits April 11, 2018 14:57
This gets the code generator to create a client for us.
So that it does not depend on the specific version of the package.
Required by the client generator.
Instructs the client generator to create a client for the Habitat
resource.
We now rely on the code-generator.
To maintain backwards compatibility with clusters created using
Deployments.
The backwards compatible version, still using Deployments.
This was generated using the following command:
`./generate-groups.sh deepcopy,client github.com/kinvolk/habitat-operator/pkg/client
 github.com/kinvolk/habitat-operator/pkg/apis "habitat:v1beta1"`
The code-generator had somehow failed to add the habitat import.
The StatefulSet-based controller is v1beta2, the Deployment-based one will be
v1beta1.
Scoped by version, in case we decide to change it.
Because each controller should create a CRD of the appropriate version.
We also need informers, so we might as well create all of them.
Instead of creating the watchers by hand, we use the auto-generated factories.
We only run tests against the latest version, and assume that the
previous ones have already been tested.
Use methods that are specific to the Habitat type.
We were checking the wrong variable, which by this time was `nil`, so we
were returning `nil, nil`.
The name we were passing was of the form `name/version`, which is invalid.

What we actually want is of the `resource.group` form.
Instead of master, which was not compatible with client-go 6.0.0.
For consistency with the v1beta2 controller.
@asymmetric
Copy link
Contributor Author

Re: your comments on specv1beta2: I am thinking we can avoid making backwards-incompatible type changes for now, and wait for proper support for multiple versions to land.

Once that happens, we will simply remove the customVersion field, and checks for it, without having to change much else.

If we introduce a completely separate field for the spec, once support for multiple versions lands, we will have to support looking for two different fields (spec and specv1beta2), and also handle conflicts (what happens if a type has both spec and specv1beta2?)

@krnowak
Copy link
Contributor

krnowak commented Apr 17, 2018

Re: your comments on specv1beta2: I am thinking we can avoid making backwards-incompatible type changes for now, and wait for proper support for multiple versions to land.

Making backward-incompatible type changes is not a thing we plan to do, yes, but you never know. Anyway, I was opting for a separate spec struct per version, so we don't have a struct with bag of fields that are supported in various versions. spec would contain fields supported only in v1beta1, specv1beta2 for v1beta2 and so on, so it would be all clear. It hints redundancy, but considering that all the supported versions but last one are effectively "frozen", redundancy should not be a problem.

Just want to avoid having someone using the v1beta1 custom version and complaining that persistence doesn't work. It wouldn't work, because the operator silently ignored it. And modifying v1beta1 operator to learn about fields specific to v1beta2 just to warn about them being ignored does not sound like a thing that will work in long run.

Once that happens, we will simply remove the customVersion field, and checks for it, without having to change much else.

Who's "we"? Developers? We will need to support it anyway for some time, even if we release a proper v1betaX version via CRD. And probably introducing a separate CRD version will result in a duplication of the spec struct anyway.

If "we" are users, then the change will involve renaming specv1beta2 to spec, removing customVersion field and updating the apiVersion field.

If we introduce a completely separate field for the spec, once support for multiple versions lands, we will have to support looking for two different fields (spec and specv1beta2), and also handle conflicts (what happens if a type has both spec and specv1beta2?)

With introduction of v1beta2 customVersion I would consider v1beta1 frozen. So nothing should be added for the old version. And having a separate spec struct per version would reflect that. About conflicts - we still have the customVersion field and I want to keep it. This field will decide which operator will handle the manifest and which spec struct to expect. So if there is customVersion set to v1beta1 and there is only specv1beta2 field, then the operator will bail out, because spec is missing. If for the same customVersion actually both spec and specv1beta2 are specified in the manifest then I would say that the old operator should just ignore the specv1beta2 field and use spec - the operator has no clue about specv1beta2 at all. And the new operator will ignore the manifest because customVersion is different from what it supports.

@asymmetric
Copy link
Contributor Author

asymmetric commented Apr 17, 2018

So to summarize, your wish to use specv1beta2 is to avoid users creating a manifest with customVersion = v1beta1, but then expecting e.g. persistence.

Couldn't we solve that with documentation?

Having both customVersion and specv1beta2 seems unnecessary to me, because once customVersion is set, you don't really need specv1beta2 anymore.

Or alternatively, could we e.g. add a subkey to the spec instead, like spec.v1beta2? That would be cleaner IMO, and would also solve the documentation problem.

@asymmetric
Copy link
Contributor Author

asymmetric commented Apr 17, 2018

In practice this would look like:

apiVersion: habitat.sh/v1beta1
kind: Habitat
metadata:
  name: example-persistent-habitat
customVersion: v1beta2
spec:
  # the core/redis habitat service packaged as a Docker image
  image: kinvolk/redis-hab
  count: 3
  # the presence of this key activates persistent storage
  v1beta2:
    persistentStorage:
      # the size of the volume that will be mounted in each Pod
      size: 1Gi
      # a StorageClass object with name "standard" must be created beforehand by
      # the cluster administrator
      storageClassName: example-sc
      # the location under which the volume will be mounted
      mountPath: /tmp/foobar
  env:
    - name: HAB_REDIS
      # this is needed to make redis accept connections from other hosts
      # see https://redis.io/topics/security#protected-mode
      # NOTE: do not use this setting in a production environment
      value: '{ "protected-mode": "no" }'
  service:
    name: redis
    topology: leader
    # if not present, defaults to "default"
    group: foobar

@krnowak
Copy link
Contributor

krnowak commented Apr 17, 2018

So to summarize, your wish to use specv1beta2 is to avoid users creating a manifest with customVersion = v1beta1, but then expecting e.g. persistence.

Couldn't we solve that with documentation?

We could solve it with documentation, but how would documentation look like if we have a single spec thing? "This field is available only in v1beta2", "This field has a bit different behaviour in v1beta3", "This field is optional starting from v1beta2". Kind of a mess, I think. Would be better to document all the fields separately for each version. Also, I think it is better if the operator tells you what you did wrong openly and that maybe will make you go to see the docs.

Having both customVersion and specv1beta2 seems unnecessary to me, because once customVersion is set, you don't really need specv1beta2 anymore.

The specv1beta2 name is only because we can't use the spec name again. I mean, we could if we had something like this:

type SpecV1Beta1 {
    …
}

type SpecV1Beta2 {
    …
}

type Habitat struct {
    // usual k8s fields, versions, meta, blah blah blah
    CustomVersion string
    Spec json.RawMessage // parsed as SpecV1Beta1 or SpecV1Beta2, depending on CustomVersion
}

But I'm not really sure how that flies with yaml and client-go and whatnot. So I thought that repeating custom version in spec name is the least bad thing (better than something like spec2, for example).

Could we e.g. add a subkey to the spec instead, like spec.v1beta2? That would be cleaner IMO, and would also solve the documentation problem.

This spec.v1beta2 would a be struct that specv1beta2 was meant to be originally, right? If so, then yeah, it is also a good idea, I think.

@krnowak
Copy link
Contributor

krnowak commented Apr 17, 2018

Ah, now it's clear what you meant by spec.v1beta2. Well, that won't save us from duplication, because if we have v1beta3 we will need to duplicate all the persistence stuff there anyway. And if I will want to add new things to service, where do I put it? spec.v1beta2.service.newstuff or spec.service.v1beta2.newstuff?

I rather imagined something like:

apiVersion: habitat.sh/v1beta1
kind: Habitat
metadata:
  name: example-persistent-habitat
customVersion: v1beta1
spec:
  image: kinvolk/redis-hab
  count: 3
  env:
    - name: HAB_REDIS
      value: '{ "protected-mode": "no" }'
  service:
    name: redis
    topology: leader
    group: foobar

and

apiVersion: habitat.sh/v1beta1
kind: Habitat
metadata:
  name: example-persistent-habitat
customVersion: v1beta2
spec:
  v1beta2:
    image: kinvolk/redis-hab
    count: 3
    persistentStorage:
      size: 1Gi
      storageClassName: example-sc
      mountPath: /tmp/foobar
    env:
      - name: HAB_REDIS
      value: '{ "protected-mode": "no" }'
    service:
      name: redis
      topology: leader
      group: foobar

@asymmetric
Copy link
Contributor Author

if we have v1beta3 we will need to duplicate all the persistence stuff there anyway

In a hypothetical v1beta3, you would have:

apiVersion: habitat.sh/v1beta1
kind: Habitat
metadata:
  name: example-persistent-habitat
customVersion: v1beta2
spec:
  image: kinvolk/redis-hab
  count: 3
  v1beta2:
    persistentStorage:
      size: 1Gi
      storageClassName: example-sc
      mountPath: /tmp/foobar
  v1beta3:
    foo:
      bar: baz
  env:
    - name: HAB_REDIS
      value: '{ "protected-mode": "no" }'
  service:
    name: redis
    topology: leader
    # if not present, defaults to "default"
    group: foobar

i.e. each field would be related to a release. Is it ideal? No. But we are working around a deficiency in Kubernetes.

How would documentation look like if we have a single spec thing? "This field is available only in v1beta2", "This field has a bit different behaviour in v1beta3", "This field is optional starting from v1beta2". Kind of a mess, I think. Would be better to document all the fields separately for each version.

This is a good point.

@asymmetric
Copy link
Contributor Author

asymmetric commented Apr 17, 2018

What does the upgrade path look like?

When multiple-version CRDs land

  • Add clients/listeners for each version
  • Switch v1beta2 controller to watching its own version as well as v1beta1
  • Check the spec.v1beta2 field
    • when type is habitat.sh/v1beta1 and customVersion = v1beta2
  • Check the spec field
    • when type is habitat.sh/v1beta2

When we can drop support for this hack

  • Remove customVersion
  • Stop watching for v1beta1 objects in non-v1beta1 controllers
  • Stop checking spec.v1beta2 field
  • Remove spec.v1beta2 key, and move all its keys up to spec

@krnowak
Copy link
Contributor

krnowak commented Apr 17, 2018

About hypothetical v1beta3 - this repeats the thing we had with spec having fields for all the versions. Now v1beta2.persistentStorage will be interpreted by v1beta3 operator, which may be something different from v1beta2 operator's interpretation… Also, you have made a mistake there, and the example would deploy v1beta2 version of the object silently ignoring v1beta3 field, because you wrote customVersion: v1beta2.

About CRD versioning: This is for the future, but I suppose that when CRD versioning lands, we don't do anything immediately I think. I would say that if we need to make another breaking change, then we switch to CRD versioning at the same time. So if the latest custom version is v1betaN then the new, breaking CRD version could be v1betaN+1 without all that custom version and spec hacks. Over time we can deprecate and phase out the oldest versions, which will eventually make custom version to completely fade away.

Lorenzo Manacorda added 2 commits April 17, 2018 16:19
Instead of defining v1beta2 fields in the Spec struct, we add a
completely separate spec.v1beta2 struct for v1beta2.

The v1beta2 controller will only access this struct, whereas other
controllers will simply ignore it.

This allows us to introduce backwards-incompatible changes to fields
(e.g. switching to a pointer type).
Especially on Travis, where we can't SSH into the machines, this will
help with debugging failing builds.
@asymmetric
Copy link
Contributor Author

PTALA.

Copy link
Contributor

@krnowak krnowak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some nits.

Not-sure-if-blocking item: Both operators are taking care of the peer watch file config map. Hopefully they won't end up in a loop endlessly updating it, because the other operator changed its contents.

Do we have something like a "string enum"? We could use one for the customVersion field to give an error when somebody writes a wrong value there, like v1beta3 when there is no such custom version yet, or make a typo c1beta2.

# count must be at least 3 for a leader-follower topology
count: 3
service:
name: consul
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heh, while at it, change this to "redis", please. :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry 😅

@@ -61,7 +61,8 @@ type HabitatSpec struct {
// Optional.
Env []corev1.EnvVar `json:"env,omitempty"`
// +optional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This +optional should be the last comment line of the field, right?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made another comment about it, later. This one was made on a commit and github marked it as outdated immediately.

@@ -61,7 +61,8 @@ type HabitatSpec struct {
// Optional.
Env []corev1.EnvVar `json:"env,omitempty"`
// +optional
PersistentStorage *PersistentStorage `json:"persistentStorage,omitempty"`
// V1beta2 are fields for the v1beta2 type.
V1beta2 *V1beta2 `json:"v1beta2"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capitalize "beta" in type name? So it is V1Beta2? WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made another comment about it, later. This one was made on a commit and github marked it as outdated immediately.

Copy link
Contributor Author

@asymmetric asymmetric Apr 18, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is the capitalization format used in kubernetes.

I guess they consider 1beta1 one word.

@@ -18,15 +18,15 @@ image: linux
$(SUDO) docker build -t "$(IMAGE):$(TAG)" .

test:
go test -v $(shell go list ./... | grep -v /vendor/ | grep -v /test/)
go test $(shell go list ./... | grep -v /vendor/ | grep -v /test/)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why dropping -v from the tests? Will you get any meaningful output without it? You wrote "too much information" in the commit message, but recently you asked me about the output from tests, so that left me wondering…

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right that the commit message is maybe not very descriptive, but the point is that these are the unit tests, and the output was basically useless IMO.

Where we don't see any output for a very long time is in e2e tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, right. Those are unit tests.

No output in e2e tests may be caused by the fact the the output from the test case is printed when the test case finishes. So if the test case takes long to complete, you won't see anything for a long time.

@@ -51,8 +60,9 @@ type HabitatSpec struct {
// The EnvVar type is documented at https://kubernetes.io/docs/reference/generated/kubernetes-api/v1.9/#envvar-v1-core.
// Optional.
Env []corev1.EnvVar `json:"env,omitempty"`
// Optional.
PersistentStorage *PersistentStorage `json:"persistentStorage,omitempty"`
// +optional
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this line be a last one?

}

func validateCustomObject(h habv1beta1.Habitat) error {
spec := h.Spec.V1beta2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assume that this is the first time we access the fields of spec.V1beta2, so a nil pointer check should be in place here, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. Hopefully we can introduce schema validation soon.

if err != nil {
level.Error(hc.logger).Log("msg", "Could not find Habitat for StatefulSet", "name", d.Name)
level.Error(hc.logger).Log("msg", "Could not find Habitat for StatefulSet", "name", sts.Name)
return
}

hc.enqueue(h)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Call to the checkCustomVersionMatch function is missing here. We don't want the v1beta2 operator to handle v1beta3 stateful sets. I'm wondering if this checkCustomVersionMatch could be added to the enqueue function to have a single place where this thing is checked.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, I'll do that.

if err != nil {
level.Error(hc.logger).Log("msg", "Could not find Habitat for StatefulSet", "name", d.Name)
level.Error(hc.logger).Log("msg", "Could not find Habitat for StatefulSet", "name", sts.Name)
return
}

hc.enqueue(h)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

if err != nil {
// Could not find Habitat, it must have already been removed.
level.Debug(hc.logger).Log("msg", "Could not find Habitat for StatefulSet", "name", d.Name)
level.Debug(hc.logger).Log("msg", "Could not find Habitat for StatefulSet", "name", sts.Name)
return
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

PersistentStorage *PersistentStorage `json:"persistentStorage,omitempty"`
// +optional
// V1beta2 are fields for the v1beta2 type.
V1beta2 *V1beta2 `json:"v1beta2"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Capitalize b in beta so we have V1Beta2? WDYT?

To reduce the number of services we need to package.
Instead of having it spread around too many places.
@asymmetric
Copy link
Contributor Author

PTALA.

@krnowak
Copy link
Contributor

krnowak commented Apr 18, 2018

LFAD.

@asymmetric asymmetric merged commit 4ffec50 into habitat-sh:master Apr 18, 2018
@asymmetric
Copy link
Contributor Author

OMG.

@asymmetric asymmetric deleted the asymmetric/update-script branch April 18, 2018 12:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants