Rename OperatorVersion CRs from name-<operatorVersion> to name-<appVersion>-<operatorVersion> #1684
Conversation
Fixed bug in version sorting; fixed panic while casting; added test for FilterByName.
```diff
@@ -19,7 +19,7 @@ docker build . \
 	-t "kudobuilder/controller:$KUDO_VERSION"

 rm -rf operators
-git clone https://github.com/kudobuilder/operators
+git clone --branch "an/rename-operatorversions" https://github.com/kudobuilder/operators
```
This will need to be removed before merging; it requires changes in the operator tests.
It's just an idea but if we put the …
How so? The …
Merge commit; conflicts in pkg/kudoctl/packages/convert/resources.go.
Revert server-side incluster_resolver.
Great to see that this is introducing a migration to handle these changes, now and in the future, gracefully without breaking existing installations 👏
I do have questions around these migrations though.
I got through half of it and I think there is potentially a way to change OV naming without the migrations. In the end, most of the code does not care whether the OV is named one way or another. The only part where we need to be able to uniquely reference the OV from operator name, operatorVersion, and appVersion is the server-side InClusterResolver, used to check whether all dependencies exist in the cluster. If we bumped the current API version along with the OV naming rules, and made func OperatorVersionName return different names according to each API version, we should be able to avoid the migration altogether. I have nothing against it, but it is always a risk.
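A minimal sketch of that idea, for illustration only; the function name, signature, and API-version strings here are assumptions, not KUDO's actual OperatorVersionName:

```go
package kudoapi

import "fmt"

// operatorVersionNameFor is a hypothetical, API-version-aware variant of
// OperatorVersionName. Old API versions keep the old naming scheme; newer
// ones get the new one, so existing CRs would not need to be migrated.
func operatorVersionNameFor(apiVersion, operator, appVersion, operatorVersion string) string {
	switch apiVersion {
	case "kudo.dev/v1beta1":
		// Old scheme: <name>-<operatorVersion>
		return fmt.Sprintf("%s-%s", operator, operatorVersion)
	default:
		// New scheme: <name>-<appVersion>-<operatorVersion>
		return fmt.Sprintf("%s-%s-%s", operator, appVersion, operatorVersion)
	}
}
```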
```go
	return ovVersionCompare < 0
}

// Compares two versions - tries to use semantic versioning first, falls back to string compare.
```
We already expect and verify semver for all versions and fail otherwise. String comparison seems unnecessary but probably doesn't hurt. I've pushed an addition to VersionVerifier to check the appVersion too.
Actually the appVersion is not required to be a SemVer: https://github.com/kudobuilder/kudo/blob/main/keps/0019-package-api-versioning.md#application-version-app-version
That's why we have string compare in there, and we shouldn't verify the appVersion to be SemVer.
Funny, I looked in that KEP and found that appVersion does follow SemVer: https://github.com/kudobuilder/kudo/blob/main/keps/0019-package-api-versioning.md#semantic-versioning
Huh, interesting. I think the reason for having it non-SemVer was that the appVersion should reflect the actual version of the used application, which may not be SemVer for all apps.
@nfnt @zmalik Do you remember which way it actually should be? Looks like we need to fix KEP-19 to make this clear.
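For context, this is the fallback being discussed, sketched with the github.com/Masterminds/semver/v3 package; the helper name is illustrative and may not match KUDO's actual code:

```go
package version

import (
	"strings"

	"github.com/Masterminds/semver/v3"
)

// compareVersions compares semantically when both sides parse as SemVer and
// falls back to a plain string compare otherwise, which covers appVersions
// that are not SemVer. Returns <0, 0, or >0, like strings.Compare.
func compareVersions(a, b string) int {
	va, errA := semver.NewVersion(a)
	vb, errB := semver.NewVersion(b)
	if errA == nil && errB == nil {
		return va.Compare(vb)
	}
	return strings.Compare(a, b)
}
```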
LGTM! Some nits.
```go
)

// ForEachNamespace calls the given function for all namespaces in the cluster
func ForEachNamespace(client *kube.Client, f func(ns string) error) error {
```
Nit: These functions would make sense in pkg/kudoctl/util/kudo as well. But with a kube.Client and a kudo.Client, it looks like our usage of the Kubernetes client API will need some refactoring first.
Yeaaaah... The usage of the different clients is something that really should be refactored and cleaned up. Same with the loggers.
We should have established rules about which client(s) and which logger(s) are used where (i.e. CLI code, manager code, and shared code), but that's a discussion for a different time :)
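For reference, a plausible body for the helper shown above. It assumes kube.Client exposes a client-go kubernetes.Interface as KubeClient, which is an assumption about KUDO's wrapper type:

```go
import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// ForEachNamespace calls the given function for all namespaces in the cluster.
// Sketch only: assumes client.KubeClient is a client-go kubernetes.Interface.
func ForEachNamespace(client *kube.Client, f func(ns string) error) error {
	nsList, err := client.KubeClient.CoreV1().Namespaces().List(context.TODO(), metav1.ListOptions{})
	if err != nil {
		return fmt.Errorf("failed to list namespaces: %v", err)
	}
	for _, ns := range nsList.Items {
		if err := f(ns.Name); err != nil {
			return err
		}
	}
	return nil
}
```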
I have a few concerns around the migration, mainly that it will potentially make things worse, leaving broken relations between Instance and OV behind with no easy way to fix them.
Since proper OV naming is only needed for dependencies, and dependencies are such a new feature that is not really used much yet, migration would probably not be my option.
By the way, I think that if we don't do this whole migration, the only problem is that I might have kafka installed, then install something depending on kafka; the versions won't match, and because of that I will get a new kafka installed with the proper naming. As I find this scenario as a whole so unlikely, I would even be fine with no migration at all :-)
```go
newOvList.Sort()

// Find first matching OV
ov, _ := newOvList.FindFirstMatch(name, operatorVersion, appVersion).(*kudoapi.OperatorVersion) // nolint:errcheck
```
Since you're ignoring errors here, please add a comment explaining why that's OK to do :)
Isn't it possible that FilterByName still does not yield anything and this ends with an error? I don't know, I would rather handle it :D
I've added a comment. FindFirstMatch does not return an error; the error could happen when casting to the OperatorVersion. But we handle the nil case anyway two lines below, so there's no need to really handle the error.
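Spelled out, the pattern under discussion: the two-value type assertion never panics, so the single nil check below covers both "no match found" and "cast failed" (the error message is illustrative):

```go
// FindFirstMatch returns an interface{} that may be nil; the two-value type
// assertion yields a nil *kudoapi.OperatorVersion instead of panicking when
// the cast fails, so one nil check handles both failure modes.
ov, _ := newOvList.FindFirstMatch(name, operatorVersion, appVersion).(*kudoapi.OperatorVersion) // nolint:errcheck
if ov == nil {
	return nil, fmt.Errorf("no OperatorVersion found matching %q", name)
}
```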
```diff
 func fullyQualifiedName(kt kudoapi.KudoOperatorTaskSpec) string {
-	return fmt.Sprintf("%s-%s", kudoapi.OperatorVersionName(kt.Package, kt.OperatorVersion), kt.AppVersion)
+	operatorVersion := kt.OperatorVersion
+	if operatorVersion == "" {
```
the dependency can really depend on ANY OV? 🤔
I would assume so. I think it's perfectly fine to depend on "package: zookeeper, appVersion: 2.5, operatorVersion: any"
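A sketch of how fullyQualifiedName could treat empty fields as "any" under the new naming scheme; the placeholder string and exact layout are assumptions, not necessarily the PR's final code:

```go
// fullyQualifiedName builds a unique dependency key following the new
// <name>-<appVersion>-<operatorVersion> scheme. Empty fields mean "match any
// version"; an explicit placeholder keeps the key unambiguous.
func fullyQualifiedName(kt kudoapi.KudoOperatorTaskSpec) string {
	operatorVersion := kt.OperatorVersion
	if operatorVersion == "" {
		operatorVersion = "any" // hypothetical placeholder
	}
	appVersion := kt.AppVersion
	if appVersion == "" {
		appVersion = "any" // hypothetical placeholder
	}
	return fmt.Sprintf("%s-%s-%s", kt.Package, appVersion, operatorVersion)
}
```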
```go
	return migration.ForEachNamespace(m.client, m.migrateInNamespace)
}

func (m *MigrateOvNames) migrateInNamespace(ns string) error {
```
So this will be run when you upgrade to 0.17, but will it be run again when migrating from 0.17 to 0.18? I don't think we want to do that.
It would, yes. And it doesn't matter: the migrations are (or should be) idempotent. You can re-run them again and again; they shouldn't modify anything after the first run.
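One common way to get that idempotency, sketched: compute the target name first and return early when it is already in place. renameTo is a hypothetical helper, and the three-argument OperatorVersionName reflects this PR's new scheme:

```go
// migrateOperatorVersion renames one OV to the new scheme. The early return
// makes re-runs no-ops, which is what keeps the migration idempotent.
func (m *MigrateOvNames) migrateOperatorVersion(ov *kudoapi.OperatorVersion) error {
	newName := kudoapi.OperatorVersionName(ov.Spec.Operator.Name, ov.Spec.AppVersion, ov.Spec.Version)
	if ov.Name == newName {
		return nil // already migrated; nothing to modify on re-run
	}
	return m.renameTo(ov, newName) // hypothetical: create under new name, repoint Instances, delete old
}
```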
```go
func (m *MigrateOvNames) migrateInNamespace(ns string) error {
	clog.V(1).Printf("Run OperatorVersion name migration on namespace %q", ns)
	if err := migration.ForEachOperatorVersion(m.client, ns, m.migrateOperatorVersion); err != nil {
```
This is not really atomic: before you migrate the Instance, it has no OV linked, or the link is rather broken :/
So what if this fails in the middle? We'll leave the cluster in a broken state with no easy way to fix this :/
Maybe a re-run of the upgrade would actually fix it? 🤔 But will it even work if the manager is down and everything is "in between"?
Yeah, the migration could have been fixed to be completely reentrant; the code that you reviewed wasn't, though. I've removed the migration now, but we should keep that in mind for future migrations.
```go
	clog.V(1).Printf("Adjust OperatorVersion of Instance %q", i.Name)

	// Set OperatorVersion
	i.Spec.OperatorVersion.Name = newOwner.Name
```
Hm, I'm not sure operators are prepared to deal with such a change at runtime. If {{ .OperatorName }} was used by the deploy plan, the value might have trickled down deep into resources and gotten exposed in UIs, URLs, etc. 🤔
Does this change, when applied, cause the update or deploy plan to be triggered?
That's a good point. I didn't even think about the "OperatorName" variable -.-
I have removed the migration for now; we'll still have to test an upgrade with more complex operators that use dependencies.
I kind of agree. The migration in its current state is not safe enough and could lead to a broken state. I think this could be fixed, though. The bigger question seems to be: do we want a migration here or not? On one hand, I think you're right, it's probably not that big of a deal, and we could probably get away without a migration. On the other hand, it feels wrong to leave an installation in a "wrong" condition. I have removed the migration for now. Let's see how this works out :)
Merge commit; conflicts in go.mod and go.sum.
I think I prefer no migration to a "not always safe" migration, so 👍 We should take time to document this in the release notes though. cc @nfnt since he's the release manager for the next release :)
What this PR does / why we need it:
- Rename OperatorVersion CRs from name-<operatorVersion> to name-<appVersion>-<operatorVersion>
- Migrate existing OperatorVersion and Instance CRs to the new naming scheme

Fixes #1681
Requires adjustments in the operator repository; PR is here: kudobuilder/operators#295