Decrease resyncPeriodSeconds to 1 min #539
Waiting for this fix to be merged #543
@misohu please rebase your branch and allow the CI to run. All jobs should pass now.
Thanks @misohu! I was able to reproduce the issue following canonical/bundle-kubeflow#951 (comment), but the fix does not seem to work for me. I still see the following in the metacontroller-operator logs:
2024-07-30T16:42:26.736387761Z {"level":"error","ts":1722357746.735958,"msg":"failed to sync kubeflow-pipelines-profile-controller 'v1:Namespace::test': can't reconcile children for Namespace /test: discovery: can't find kind AuthorizationPolicy in apiVersion security.istio.io/v1beta1\n","stacktrace":"metacontroller/pkg/controller/decorator.(*decoratorController).worker\n\t/root/parts/metacontroller/build/pkg/controller/decorator/controller.go:296\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/root/go/pkg/mod/k8s.io/apimachinery@v0.23.5/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/root/go/pkg/mod/k8s.io/apimachinery@v0.23.5/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/root/go/pkg/mod/k8s.io/apimachinery@v0.23.5/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/root/go/pkg/mod/k8s.io/apimachinery@v0.23.5/pkg/util/wait/wait.go:90\nmetacontroller/pkg/controller/decorator.(*decoratorController).Start.func1.1\n\t/root/parts/metacontroller/build/pkg/controller/decorator/controller.go:270"}
This is my model btw:
$ juju status
Model     Controller  Cloud/Region        Version  SLA          Timestamp
test-951  uk8s-345    microk8s/localhost  3.4.5    unsupported  16:49:41Z

App                      Version                Status  Scale  Charm                    Channel             Rev   Address         Exposed  Message
admission-webhook                               active      1  admission-webhook        latest/edge         342   10.152.183.208  no
kfp-profile-controller                          active      1  kfp-profile-controller   latest/edge/pr-539  1465  10.152.183.159  no
metacontroller-operator                         active      1  metacontroller-operator  latest/edge/pr-117  307   10.152.183.133  no
minio                    res:oci-image@5102166  active      1  minio                    latest/edge         342   10.152.183.243  no

Unit                        Workload  Agent  Address      Ports          Message
admission-webhook/0*        active    idle   10.1.60.143
kfp-profile-controller/0*   active    idle   10.1.60.139
metacontroller-operator/0*  active    idle   10.1.60.137
minio/0*                    active    idle   10.1.60.142  9000-9001/TCP
I left it running for more than three minutes and it is still in that state.
After talking to @misohu, I confirmed that the error from my previous message (AuthorizationPolicy missing) is still present, but it is different from the one originally reported (PodDefault missing).
I can confirm that the PodDefault is actually created, which is what we care about for this fix.
Thanks @misohu!
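For anyone trying to tell the two errors apart in the metacontroller-operator logs, here is a minimal sketch that pulls the missing kind and apiVersion out of a log line shaped like the one quoted earlier. The helper name `missing_kind` and the regular expression are my own, not part of metacontroller, and the sample line is a shortened copy of the real one with the stacktrace left out.

```python
import json
import re

# Pattern for the discovery error text seen in the log line quoted in this thread.
DISCOVERY_ERROR = re.compile(
    r"can't find kind (?P<kind>\w+) in apiVersion (?P<api_version>[\w./-]+)"
)


def missing_kind(log_line: str):
    """Return (kind, apiVersion) for a discovery error, or None otherwise.

    Hypothetical helper: assumes the line is "<timestamp> <json>" as in the
    log quoted above, and falls back to searching the raw text otherwise.
    """
    _, _, payload = log_line.partition(" ")
    try:
        record = json.loads(payload)
        msg = record["msg"] if isinstance(record, dict) else log_line
    except (json.JSONDecodeError, KeyError):
        msg = log_line
    match = DISCOVERY_ERROR.search(msg)
    if match is None:
        return None
    return match.group("kind"), match.group("api_version")


# Shortened version of the log line from this thread (stacktrace omitted).
sample = "2024-07-30T16:42:26.736387761Z " + json.dumps({
    "level": "error",
    "msg": "failed to sync kubeflow-pipelines-profile-controller "
           "'v1:Namespace::test': can't reconcile children for Namespace /test: "
           "discovery: can't find kind AuthorizationPolicy in apiVersion "
           "security.istio.io/v1beta1\n",
})

print(missing_kind(sample))
```

A result naming `AuthorizationPolicy` points at the unrelated Istio CRD being absent, while a result naming `PodDefault` would be the error this fix targets.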
Backport: Decrease resyncPeriodSeconds to 1 min (#539)
All the issue details with testing can be found here.
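For context on the setting named in the title, below is a rough sketch of where `resyncPeriodSeconds` sits in a Metacontroller DecoratorController spec. It is not the manifest shipped by the charm: the function name, the attachment list, and the webhook URL are placeholders chosen for illustration, and only the 60-second value comes from this PR's title.

```python
import json


def decorator_controller(resync_period_seconds: int) -> dict:
    """Build an illustrative DecoratorController manifest as a dict.

    Assumptions (not taken from the charm): the attachments and the
    webhook URL are placeholders, included only to make the spec complete.
    """
    return {
        "apiVersion": "metacontroller.k8s.io/v1alpha1",
        "kind": "DecoratorController",
        "metadata": {"name": "kubeflow-pipelines-profile-controller"},
        "spec": {
            # The controller in the log above decorates Namespace objects.
            "resources": [{"apiVersion": "v1", "resource": "namespaces"}],
            # Placeholder attachments based on the kinds mentioned in this thread.
            "attachments": [
                {"apiVersion": "kubeflow.org/v1alpha1", "resource": "poddefaults"},
                {
                    "apiVersion": "security.istio.io/v1beta1",
                    "resource": "authorizationpolicies",
                },
            ],
            # How often Metacontroller re-runs the sync hook even when
            # nothing has changed; this PR lowers it to one minute.
            "resyncPeriodSeconds": resync_period_seconds,
            "hooks": {"sync": {"webhook": {"url": "http://example.invalid/sync"}}},
        },
    }


manifest = decorator_controller(resync_period_seconds=60)
print(json.dumps(manifest, indent=2))
```

With a shorter resync period, children that could not be created on the first sync (for example because a CRD was not yet installed) get another attempt within about a minute, at the cost of more frequent webhook calls.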