Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rename OperatorVersion CRs from name-<operatorVersion> to name-<appVersion>-<operatorVersion> #1684

Merged
merged 25 commits into from
Oct 9, 2020

Conversation

ANeumann82
Copy link
Member

@ANeumann82 ANeumann82 commented Sep 15, 2020

What this PR does / why we need it:

Rename OperatorVersion CRs from name-<operatorVersion> to name-<appVersion>-<operatorVersion>
Migrate existing OperatorVersion and Instance CRs to new naming scheme

Fixes #1681

Requires adjustments in operator repository, PR is here: kudobuilder/operators#295

Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
Fixed bug in version sorting
Fixed panic while casting
Added test for FilterByName

Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
@@ -19,7 +19,7 @@ docker build . \
-t "kudobuilder/controller:$KUDO_VERSION"

rm -rf operators
git clone https://github.com/kudobuilder/operators
git clone --branch "an/rename-operatorversions" https://github.com/kudobuilder/operators
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will need to be removed before merging - It requires changes on the operator tests

Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
@zen-dog
Copy link
Contributor

zen-dog commented Sep 16, 2020

It's just an idea but if we put the operatorVersion first in the name like name-<operatorVersion>-<appVersion> we might be able to stay backward-compatible (with some extra logic) and doesn't have to rename the OVs

@ANeumann82
Copy link
Member Author

It's just an idea but if we put the operatorVersion first in the name like name-<operatorVersion>-<appVersion> we might be able to stay backward-compatible (with some extra logic) and doesn't have to rename the OVs

How so? The appVersion is optional in any case, and if an operator has no appVersion, the name will be name-<operatorVersion> like before, but afaik all of our operators do have appVersions...

Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

# Conflicts:
#	pkg/kudoctl/packages/convert/resources.go
Revert server side incluster_resolver

Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
@ANeumann82 ANeumann82 marked this pull request as ready for review September 22, 2020 13:16
Copy link
Member

@nfnt nfnt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great to see that this is introducing migration to handle these changes now and in the future gracefully without breaking existing installations 👏
I do have questions around these migrations though.

pkg/util/kudo/versions.go Outdated Show resolved Hide resolved
pkg/kudoctl/kudoinit/setup/setup.go Show resolved Hide resolved
ANeumann82 and others added 3 commits September 22, 2020 16:14
Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
Signed-off-by: Aleksey Dukhovniy <alex.dukhovniy@googlemail.com>
Copy link
Contributor

@zen-dog zen-dog left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got through the half of it and I think, that there is potentially a way to change OV naming without the migrations. In the end, most of the code does not care whether the OV is named one way or another. The only part where we need to be able to uniquely reference the OV from operator name, operatorVersionn and the appVersion is the server-side InClusterResolver, used to check whether all dependencies exist in the cluster.
If we would bump the current API version along with the OV naming rules, and make the func OperatorVersionName return different names, according to each API version, we should be able to avoid the migration altogether. I have nothing against it, but it is always a risk.

return ovVersionCompare < 0
}

// Compares two versions - tries to use semantic versioning first, falls back to string compare.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already expect and verify semver for all versions and fail otherwise. String comparison seems unnecessary but probably doesn't hurt. I've pushed an addition to VersionVerifier to check the appVersion too.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually the appVersion is not required to be a SemVer: https://github.com/kudobuilder/kudo/blob/main/keps/0019-package-api-versioning.md#application-version-app-version

That's why we have string compare in there, and we shouldn't verify the appVersion to be SemVer.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Funny, I looked in that KEP and found that appVersion does follow the SemVer: https://github.com/kudobuilder/kudo/blob/main/keps/0019-package-api-versioning.md#semantic-versioning

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, interesting. I think the reason for having it non-semVer was that the appVersion should reflect the actual version of the used application, which may not be SemVer for all Apps.

@nfnt @zmalik Do you remember which way it actually should be? Looks like we need to fix the KEP-19 to make this clear.

pkg/apis/kudo/v1beta1/operatorversion_types_helpers.go Outdated Show resolved Hide resolved
Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
Copy link
Member

@nfnt nfnt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Some nits.

pkg/kudoctl/kube/config.go Outdated Show resolved Hide resolved
pkg/kudoctl/kudoinit/migration/migration.go Outdated Show resolved Hide resolved
)

// ForEachNamespace calls the given function for all namespaces in the cluster
func ForEachNamespace(client *kube.Client, f func(ns string) error) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: These functions would make sense in pkg/kudoctl/util/kudo as well. But with a kube.Client and kudo.Client, it looks like our usage of the Kubernetes client API will need some refactoring first.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeaaaah.... The usage of the different clients is something that really should be refactored and cleaned up. Same with the loggers.

We should have established rules which client(s) and which logger(s) is used where, (i.e. CLI code, Manager code and shared code), but that's a discussion for a different time :)

Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
@ANeumann82 ANeumann82 added release/blocker release/breaking-change This PR contains breaking changes and is marked in the release notes labels Oct 2, 2020
Copy link
Contributor

@alenkacz alenkacz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have few concerns around the migration mainly that it will potentially make things worse leaving broken relations between I and OV behind with no easy way to fix it.

Since proper OV naming is only needed for dependencies and dependencies are such a new feature and not really used that much, migration would probably not be my option.

By the way - I think that if we don't do this whole migration the only problem is that I might have kafka installed and then I install something depending on kafka and the versions won't be matches and because of that I will get new kafka installed with the proper naming. As I find this scenario as a whole so unlikely I would even be fine with no migration at all :-)

newOvList.Sort()

// Find first matching OV
ov, _ := newOvList.FindFirstMatch(name, operatorVersion, appVersion).(*kudoapi.OperatorVersion) // nolint:errcheck
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since you're ignoring errors here, please add an comment why that's ok to do :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't it possible that FilterByName still does not yield anything and this ends with error? I don't know, I would rather handle it :D

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a comment.
FindFirstMatch does not return an error - the error could happen when casting to the OperatorVersion. But we handle the nil case anyway two lines below, so there's no need to really handle the error.

func fullyQualifiedName(kt kudoapi.KudoOperatorTaskSpec) string {
return fmt.Sprintf("%s-%s", kudoapi.OperatorVersionName(kt.Package, kt.OperatorVersion), kt.AppVersion)
operatorVersion := kt.OperatorVersion
if operatorVersion == "" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the dependency can really depend on ANY OV? 🤔

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would assume so. I think it's perfectly fine to depend on "package: zookeeper, appVersion: 2.5, operatorVersion: any"

pkg/kudoctl/resources/dependencies/resolve_test.go Outdated Show resolved Hide resolved
pkg/kudoctl/resources/dependencies/resolve_test.go Outdated Show resolved Hide resolved
return migration.ForEachNamespace(m.client, m.migrateInNamespace)
}

func (m *MigrateOvNames) migrateInNamespace(ns string) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so this will be run when you upgrade to 0.17 but will this be run again when migrating from 0.17 to 0.18? I don't think we want to do that

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would, yes - And it doesn't matter, the migrations are (or should be) idempotent. You can re-run them again and again, they shouldn't modify anything after the first run.


func (m *MigrateOvNames) migrateInNamespace(ns string) error {
clog.V(1).Printf("Run OperatorVersion name migration on namespace %q", ns)
if err := migration.ForEachOperatorVersion(m.client, ns, m.migrateOperatorVersion); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not really atomic because before you migrate instance, it has no OV linked or the link is rather broken :/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so what if this fails in the middle? we'll leave the cluster in broken state with no easy way to fix this :/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe the re-run of upgrade would actually fix itself? 🤔 but will it even work if manager is down and everything is "in between"?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah - the migration could have been fixed to be completely reentrant, the code that you reviewed wasn't though. I've removed the migration now, but we should keep that in mind for future migrations.

pkg/apis/kudo/v1beta1/operatorversion_types_helpers.go Outdated Show resolved Hide resolved
clog.V(1).Printf("Adjust OperatorVersion of Instance %q", i.Name)

// Set OperatorVersion
i.Spec.OperatorVersion.Name = newOwner.Name
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I'm not sure operators are prepared to deal with such change at runtime. If {{ .OperatorName }} was used by the deploy plan, the value might have trickled down deep into resources, get exposed in UIs, URLs, etc 🤔
Does this change, when applied, cause the update or deploy plan to be triggered?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. I didn't even think about the "OperatorName" variable -.-

I have removed the migration for now - We'll still have to test an upgrade with more complex operators that use dependencies.

@alenkacz alenkacz modified the milestones: 0.15.0, 0.17.0 Oct 7, 2020
Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>
@ANeumann82
Copy link
Member Author

I have few concerns around the migration mainly that it will potentially make things worse leaving broken relations between I and OV behind with no easy way to fix it.

Since proper OV naming is only needed for dependencies and dependencies are such a new feature and not really used that much, migration would probably not be my option.

By the way - I think that if we don't do this whole migration the only problem is that I might have kafka installed and then I install something depending on kafka and the versions won't be matches and because of that I will get new kafka installed with the proper naming. As I find this scenario as a whole so unlikely I would even be fine with no migration at all :-)

I kind of agree - The migration in the current state is not safe enough and could lead to a broken state. I think this could be fixed though.

The bigger questions seems to be be: Do we want a migration here or not? On one hand I think you're right, it's probably not that big of a deal, and we could probably get away without a migration. On the other hand, it feels wrong to keep an installation in a "wrong" condition.

I have removed the migration for now. Let's see how this works out :)

Signed-off-by: Andreas Neumann <aneumann@mesosphere.com>

# Conflicts:
#	go.mod
#	go.sum
Copy link
Contributor

@alenkacz alenkacz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I prefer no migration to "not always safe migration" so 👍 We should take time to document this though in the release notes cc. @nfnt since he's a release manager for the next release :)

@ANeumann82 ANeumann82 merged commit 5e7546b into main Oct 9, 2020
@ANeumann82 ANeumann82 deleted the an/rename-operatorversions branch October 9, 2020 13:28
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release/blocker release/breaking-change This PR contains breaking changes and is marked in the release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Adjust naming of "operatorversion" CRs
5 participants