Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CRD versioning design proposal #2026

Closed
wants to merge 1 commit into from

Conversation

mbohlool
Copy link
Contributor

@mbohlool mbohlool commented Apr 10, 2018

CRD versioning proposal is discussed in detail here:

https://docs.google.com/document/d/1Ucf7JwyHpy7QlgHIN2Rst_q6yT0eeN9euzUV6kte6aY/edit

This is the main part of the design. Look at #2054 first.

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Apr 10, 2018
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: deads2k

Assign the PR to them by writing /assign @deads2k in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added kind/design Categorizes issue or PR as related to design. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. labels Apr 10, 2018

# **Background**

Custom Resource Definitions ([CRDs](https://kubernetes.io/docs/concepts/api-extension/custom-resources/)) are a popular mechanism for extending Kubernetes, due to their ease of use compared with the main alternative of building an Aggregated API Servers. They are, however, lacking a very important feature that all other kubernetes objects support: Versioning. Today, each CR can only have one version and there is no clear way for authors to advance their resources to a newer version other than creating a completely new CRD and converting everything manually in sync with their client software.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: server


* Support ability to change storage version

* Support Validation/OpenAPI schema for all versions
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for each or for all at once?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a clarification.


The summary is to support a list of versions that will include current version. One of these versions can be flagged as the storage version and all versions ever marked as storage version will be listed in a stored_version field in the Status object to enable authors to plan a migration for their stored objects.

The current `Version` field in planned to be deprecated in a later release and will be used to pre-populate the `Versions` field (The `Versions` field will be defaulted to a single version, constructed from top level `Version` field). The `Version` field will be also be mutable to give a way to the authors to remove it from the list.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is

// ***************
// Versions is the list of all supported versions for this resource.
// Validation: All versions must use the same validation schema for now. i.e., top
// level Validation field is applied to all of these versions.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which means you can use AnyOf: [{ properties: { APIVersion: { pattern: "group/v2" }}}, ...] to have different validations for each version.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note that people will use this way and we have no way to deprecate the top-level validation ever. We could instead add validation to the versions and enforce its use as soon as there is more than one version defined.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same is true for subresources. Should we move them to Versions as well? I think we should. Today there is no graceful way to enable the /status subresource without breaking either old or new clients.

Name string
// Served is a flag enabling/disabling this version from being served via REST APIs
Served Boolean
// Storage flags the release as a storage version. There can be only one version
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There must be exactly one version flagged as Storage.

}
```

The Status object will have a list of potential stored objects. This data is necessary to do a storage migration in future (the author can choose to do the migration themselves but there is a plan [TODO(lavalamp): link?] to solve the problem of migration, potentially for both standard and custom types).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stored versions.


Basic validations needed for the `version` field are:

* `Spec.Version` field exists in `Spec.Versions` field (serve should be True for that version if there is any).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we force at least one served version. Do we want that? Do we even have to (to allow e.g. the GC controller to do its work)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we don't want that. We discussed this on the implementation PR. I will amend this.

Basic validations needed for the `version` field are:

* `Spec.Version` field exists in `Spec.Versions` field (serve should be True for that version if there is any).
* If there is any `Served` `Version` in list of `Versions`, then the `Spec.Version`’s version in the `Spec.Versions` list should have a set `Served` flag. This is for backward compatibility. An old controller expect that version to be served but only the whole CRD is served. CRD Registration controller should unregister a CRD with no serving version.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this "if" implicitly true by the "serve should be True" above?
Can spec.version be empty?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed previous one. So the if is not true all the time. You can disable all versions and in that case the version defined in Version field does not need to have Served flag enabled. But if we serve ANY version, then the version defined in Version field should point to a Served version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the wording to make this more clear.

// flagged as Storage.
Storage Boolean
//
Conversion CustomResourceConversion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pointer with omitempty.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

}

type CustomResourceConversion struct {
Declarative map[string]CustomResourceDeclarativeConversion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the meaning of the string?
Are declarative conversions sorted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are not sorted. The map key is the target version.


type CustomResourceConversion struct {
Declarative map[string]CustomResourceDeclarativeConversion
External CustomResourceConversionWebhook
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pointer and omitempty

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a webhook definition per version looks odd. I would expect it globally with a set of support conversions pairs. Then you don't have to maintain n webhook structs. Alternatively, we could think about a global named conversion webhook field and reference the names here (like volumes and mounts in pods).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I explained the design further down. It is cleaner to define the webhook on the version but the validation will not let user define more than one webhook. It will just make sure if more than one webhook is defined, they are the same.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't like this kind of redundancy. Why is it cleaner?

Conversion CustomResourceConversion
}

type CustomResourceConversion struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are multiple conversions mutual exclusive for each version pair?

The API structure above allows some invalid states in order to keep a clean easy API. There need to be validation to catch undesire states. The validations would be

* Multiple webhooks are not allowed for a CRD. This will help us improve the performance by keeping the connection open to the webhook if the user defined a chain of version conversions.
* There should be no loop in the conversion graph. This includes v1->v2->v1 or v1->v2->v3->v1. The former is implicit in the assumption that webhook conversion is round-trippable. The second one allows two way to convert v1 to v3 which is not desired.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't a loop fine and necessary? Every path without a loop should be unique.

type RenameConversion {
From JsonPath
To JsonPath
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To avoid undesired renames, better use a base json path and a field rename:

type RenameConversion struct {
  Object JsonPath
  FieldName string
  NewFieldName string
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking we would let structural changes later (carefully) and we can prevent them with validation for now. This way we do not need to change API or add another conversion for structural changes but I am also OK if we are thinking about a MoveConversion later though I liked the end product to only have MoveConversion and not Move and Rename both.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we call it RenameConversion, we should be strict. If we call it MoveConversion we could restrict it to renames for the moment. Both is fine with me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Being more explicit is better I guess. I went with your RenameConversion suggestion.


* Multiple webhooks are not allowed for a CRD. This will help us improve the performance by keeping the connection open to the webhook if the user defined a chain of version conversions.
* There should be no loop in the conversion graph. This includes v1->v2->v1 or v1->v2->v3->v1. The former is implicit in the assumption that webhook conversion is round-trippable. The second one allows two way to convert v1 to v3 which is not desired.
* Rename operation can be limited at first to not change the structure despite the fact that `To` field is also a json path.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See suggestion above: we only want to rename fields in objects, no moves, no structure changes.

@mbohlool
Copy link
Contributor Author

@sttts I added the webhook to the original document and asked people to look at it and I thought there is no objection on the design. I see most of your comments is on webhooks. We can discuss that here, but as we do not want to implement webhooks in 1.11, Do you think it is better to remove it from here and continue discussion on the Google Doc one. It is easier to discuss things there.

@sttts
Copy link
Contributor

sttts commented Apr 12, 2018

Do you think it is better to remove it from here and continue discussion on the Google Doc one

We can sketch webhook here, but clearly call it a sketch, i.e. to be clear that this must be worked out.

@liggitt
Copy link
Member

liggitt commented Apr 12, 2018

Do you think it is better to remove it from here and continue discussion on the Google Doc one. It is easier to discuss things there.

It seems better to describe future webhook plans generally, not as API fields yet, if they are still being designed. The previous review concentrated on the parts under consideration for implementation in 1.10.


type CustomResourceDeclarativeConversion {
// This type will have collection of conversion operations. Only one of them should be set.
Rename RenameConversion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This have to be an array.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should only one of them be set?

We can have renames, moves, deletions all at the same time. We need a logic to detect conflicts during validation of course.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be an array or the CustomResourceDeclarativeConversion can be an array in the map. which one is better? I personally think it is more explicit to make the CustomResourceDeclarativeConversion an array then the conversion would be look like:

v1:
  - Rename: ...
  ...
  - Rename: ...
  ...
  - Move: ...
  ...
...

Also deletion cannot happened according to our API guidelines. User can rename the field to a in-name-marked deprecated to deleted field (e.g. deprecated_... or deleted_...).

}

type CustomResourceConversion struct {
Declarative map[string]CustomResourceDeclarativeConversion
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maps aren't allowed in our API

Conversion CustomResourceConversion
}

type CustomResourceConversion struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is more involved than the "no changes" we had talked about before. How about we start small?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first PR will be no change, but for 1.11 we want at least rename. Do you have any objection to the rename design here?

* None of the `Status.StoredVersion` can be removed from `Spec.Versions` list.
* Only one of the versions in `spec.Versions` can flag as `Storage` version.

# **Declarative conversion**
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, ok. I think I see what happened. You're saying "these are some thoughts I had", not "I want to build this an merge it in 1.11". If that's the case, can you add a "Declarative conversion design not approved in this document" in big bold letters leading the section?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually these are the things I want to have in 1.11 (only rename). Here we can discuss it if you have any thoughts. These are designed I discussed in one form or another with Stephan, Daniel, and Eric but none approved it yet. I guess that is the goal of this PR.

@k8s-ci-robot
Copy link
Contributor

@mbohlool: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 26, 2018
@mbohlool mbohlool closed this May 21, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/design Categorizes issue or PR as related to design. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants