Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

doc/proposals: adding helm operator proposal #658

Merged
merged 5 commits into from
Nov 5, 2018

Conversation

joelanford
Copy link
Member

Description of the change:
Adding a proposal for the integration of the Helm operator into the SDK

Motivation for the change:
The Helm operator will give SDK users yet another tool to use to develop operators. Having Helm as an option will drive up adoption of the SDK and give users the ability to build operators with Helm, a tool many Kubernetes users already have proficiency with.

@openshift-ci-robot openshift-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Oct 24, 2018
@lilic
Copy link
Member

lilic commented Oct 25, 2018

Just a general question/concern:
If we say operator-sdk go operator is version v0.1.0 for example, does that include the ansible and helm operators? Would that mean that the maturity of helm operator affects the stable release of operator-sdk?

Copy link
Member

@shawn-hurley shawn-hurley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me, want to make sure we have a decent idea of how to deal with the common ansible stuff and the differences IMO

* Will contain a helper function to create a Helm client from `controller-runtime` manager.

* /operator-sdk/pkg/helm/controller
* Will contain the Helm controller.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the controller going to be exported? Or just an ‘Add’ method?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just an Add method that uses the exported HelmOperatorReconciler.


#### Add

Add functionality will be updated to allow Helm operator developers to add new CRDs/CRs and to update the watches.yaml file for additional Helm charts. The command helps when a user wants to watch more than one CRD for their operator.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this is only for creating CRDs? I think you may just want to re-use the ansible operator stuff for just ‘add crd’.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ahh. I didn't see the add crd subcommand the first time around.

My initial thought was that app api would be operator type aware.

  • For go, it would do what it does now.
  • For ansible, it would create the crds, update the watches file, and scaffold out a new roles/<Kind> Ansible role.
  • For helm it would create the crds, update the watches file, and scaffold out a new helm-charts/<Kind> Helm chart.

If that was the pattern, that could also potentially align how operator-sdk new works (i.e. --api-version and --kind would no longer be valid for a new operator of any type)

I was surprised when I ran operator-sdk add api in an ansible operator project and it spit out Go files.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If someone wants to convert to a golang operator from an ansible operator, don't want to lock them in.

I also don't like calling this adding an API, IMO an API is what this does adds the go structures for the API. I would prefer to keep the CRD command as I think it is much more descriptive. When discussing the ansible operator, we never talk about adding API's or watching API's we discuss CRDs and CRs.

I also do not want new for ansible operator does not allow --api-version and --kind. This is really important to the ansible operator workflow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah the command add api is only valid in the context of a Go operator.
We can use the command add crd here.

I was surprised when I ran operator-sdk add api in an ansible operator project and it spit out Go files.

We need to make sure to check not run the commands specific to the Go type only run in a Go operator project:

  • add api
  • add controller
  • generate k8s

I'll create an issue to fix this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shawn-hurley Sounds good. Just wanted to make sure that was a conscious choice and not leftover from the pre-0.1.0 CLI.

So I'll update the proposal to specify that add crd will be used for the helm operator and that other add commands will be invalid.

What about the question of when to scaffold a new boilerplate role or chart? Only on new or also on add crd? Any opinions there?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we scaffold a new role for add crd although it kind of makes sense that you would want to associate a new role/chart with a new type.
But then do we update watches.yaml file as well to point to the new chart/role?
And if we do couple it with the add crd command then we should ensure that it handles Go/Ansible/Helm project types differently.

The only downside is that this couples a bunch of different things to the add crd command.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hasbro17 Is there a use case for add crd that wouldn't involve other changes? Along those same lines, for a Go project, when would one use add crd and not add api?

If there is a use case for running add crd standalone, I can see how coupling it to these other things may not be desirable.

But to me, it seems like the user is always going to follow add crd up with other manual changes that will likely depend on the type of the operator. For ansible and helm, I think it's pretty likely users will be manually adding a new role/chart and updating watches.yaml.

If users are doing that manually, it increases the likelihood that they forget to change something somewhere or make a typo that causes their new CRD to fail in some way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joelanford add crd was initially added for the Ansible project.
I don't think there was a use case for it in a Go project, since you almost always need the pkg/apis/... files to register the CRD type with the scheme so that the operator can actually watch or CRUD that type.

Maybe if you're dealing with unstructured types(like the ansible and helm operator do) then you don't need the api definition pkg/apis/.... But I don't think that's a common use case.

So yeah I think running add crd standalone isn't necessary for the Go project when we have add api.
So disregarding the Go project we could couple it with adding a role/chart and updating watches.yaml.
@shawn-hurley WDYT? Leave it up to the users. Couple it with add crd. Or a new subcommand.

operator-sdk up local
```

This should use the known structure and the helm operator code to run the operator from this location. The existing code will need to be updated with a new operator type check for `helm` (in addition to existing `go` and `ansible` types). The command works by running the operator-sdk binary, which includes the Helm operator code, as the operator process.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this needs to be solved later, and that is fine, but how are we differentiating between ansible and helm?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For helm at least, the immediately obvious option would be to check for the presence of the helm-charts directory in the project root.


* This proposal assumes that a Helm Operator base image will be available for building Helm operator projects. What generates the Helm operator base image and what is the registry, image name, versioning, etc.?

* There is a moderate amount of complexity already related to how operator types are handled between the `go` and `ansible` types. With the addition of a third type, there may need to be a larger design proposal for operator types. For example, do we need to define an `Operator` interface that each of the operator types can implement for flag verification, scaffolding, project detection, etc.?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do think there is complexity, I would prefer to keep the complexity as close to the surface. Something like a factory pattern which each up command(helm, go, ansible) have their own implementation and their own flags possibly?

I also would like to at least have an idea on what to do here in this proposal IMO just so we don’t live in a super complex world while we figure this out? Thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shawn-hurley Yeah that's pretty much what I was thinking. I'm not sure what the cleanest way to do that would be though.

Maybe we could do the operator type detection / --type flag parsing before passing things off to cobra and then have each operator type implement the cobra subcommand functions directly or pass the operator implementation to the subcommand functions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could do the operator type detection / --type flag parsing before passing things off to cobra and then have each operator type implement the cobra subcommand functions directly or pass the operator implementation to the subcommand functions

This seems like we'll have implicit subcommands:

  • operator-sdk build --type=go ==> operator-sdk go build
  • operator-sdk build --type=ansible ==> operator-sdk ansible build

This goes back to the question of what sub commands should be allowed in what types of projects, and how explicit should we be about specifying the type. #670

Some commands are common across different types(build, up local) whereas other sub commands only make sense for certain project types e.g Go(add api, generate k8s) Ansible/Helm(add crd). It gets a little confusing if we want to allow those exclusive commands across all types to allow migrating to a hybrid Go/Ansible Go/Helm operator.

One option is we stay with the existing convention of specifying the --type flag(or inferring from the project) for each subcommand so we can keep the commands consistent for all 3 types, and handle the logic for each type within the same sub command:

  • operator-sdk up local(infer the type) or operator-sdk up local --type=<type>

This keeps all subcommands consistent, except we still have to specify the --type flag or infer that from the project. The downside is the ambiguity in what commands are allowed where.

Or we take a top down approach as @joelanford suggested (#670 (comment)) and put the operator type as the first root sub command:

  • operator-sdk go build
  • operator-sdk ansible build
  • operator-sdk helm build

The advantage of the type root subcommand is that there's no ambiguity on what commands run in what types of projects with differing arguments and flags.
The drawback is that it's a little verbose(redundant?) for commands that are (mostly)common in their behavior
e.g operator-sdk build.
Also if I end up with a hybrid operator do I run operator-sdk go build or operator-sdk ansible build?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great points. Maybe its a combination. I think we can categorize the CLI into three sets of commands.

  1. Global commands where operator type does not need to be inferred:

    • completion, help, and new
  2. Commands that need to support possible hybrid operators and where the operator type(s) will need to be inferred:

    • up, build, test
  3. Commands that apply to specific types:

    • Go: add api, add controller, generate
    • Ansible: add crd
    • Helm: ?

For 2., as @estroz mentioned (#670 (comment)), we can detect all of the types an operator has and be opinionated about which combinations we'll support, with helpful errors/warnings about non-supported combinations.

For 3., another option may be to keep add and generate as top-level commands, but then have type subcommands of those. So:

  • operator-sdk add go api
  • operator-sdk generate go k8s
  • operator-sdk add ansible crd (or maybe even better, operator-sdk add ansible role)
  • operator-sdk add helm crd (or add helm chart)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For 3., another option may be to keep add and generate as top-level commands, but then have type subcommands of those. So:

I was thinking of something along those lines as well. Would make things clear when you do operator-sdk add go --help as it would display all the subcommands.

@hasbro17
Copy link
Contributor

If we say operator-sdk go operator is version v0.1.0 for example, does that include the ansible and helm operators? Would that mean that the maturity of helm operator affects the stable release of operator-sdk?

That's a good question. @shawn-hurley since the base image quay.io/water-hole/ansible-operator is built off the ansible pkgs in the SDK we should start versioning the base image with each release.

We can keep a master or latest image that can be built nightly or as part of our CI. And version tags built for each release.
And the same for the helm-operator. WDYT?

@joelanford
Copy link
Member Author

We can keep a master or latest image that can be built nightly or as part of our CI. And version tags built for each release.
And the same for the helm-operator. WDYT?

Sounds good to me. quay.io/water-hole/helm-operator for the helm operator base image?

I'd suggest the latest tag tracks the operator-sdk git tag with the greatest semantic version and the master tag tracks passing nightly CI builds.

@hasbro17
Copy link
Contributor

@joelanford Yeah you're right. latest is the greatest semvar(or last release). And master tag is the master branch.
For the etcd-operator we had a dev tag built as part of the CI for each PR which is also essentially the master.
https://quay.io/repository/coreos/etcd-operator?tag=latest&tab=tags

And I'm fine with using the org water-hole.
Although this might be a chance to standardize on what organization to use for operator-framework related projects.
Ideally we should have had quay.io/operator-framework org but I think all the other projects like OLM and metering-operator are still in the quay.io/coreos org so those probably won't move. Meaning we can go with water-hole for now.
/cc @spahl

@shawn-hurley We'll need to give write access for the water-hole org to the SDK maintainers if we plan on using that in the future.

* change `add api` to `add crd`
* clarify that `helm-charts` directory will be used to detect that
operator type is `helm`
* add base image build and tag plan
* Change `release.ReleaseManager` back to `release.Manager`
* Improve description of `pkg/helm/controller` package
* Clarifying flags and features of `add crd` subcommand
* Clarifying that `test` command is not supproted
* Clarifying `new` command flag requirements and support
Copy link
Contributor

@hasbro17 hasbro17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

The migration story from Helm to a hybrid Helm/Go project via a CLI command is something that we'll follow up outside of the initial integration #670

@joelanford joelanford merged commit 158c004 into operator-framework:master Nov 5, 2018
@joelanford joelanford deleted the proposal-helm branch November 5, 2018 17:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants