Decrease resyncPeriodSeconds to 1 min #539

Merged
merged 3 commits into from
Jul 30, 2024

misohu
Member

@misohu misohu commented Jul 15, 2024

All the issue details with testing can be found here.

@misohu
Member Author

misohu commented Jul 25, 2024

Waiting for the fix in #543 to be merged.

@DnPlas
Contributor

DnPlas commented Jul 25, 2024

@DnPlas
Contributor

DnPlas commented Jul 30, 2024

@misohu please rebase your branch and allow the CI to run. All jobs should pass now.

Contributor

@DnPlas DnPlas left a comment

Thanks @misohu! I was able to reproduce the issue following canonical/bundle-kubeflow#951 (comment), but the fix does not seem to work for me. I still see the following in the metacontroller-operator logs:

2024-07-30T16:42:26.736387761Z {"level":"error","ts":1722357746.735958,"msg":"failed to sync kubeflow-pipelines-profile-controller 'v1:Namespace::test': can't reconcile children for Namespace /test: discovery: can't find kind AuthorizationPolicy in apiVersion security.istio.io/v1beta1\n","stacktrace":"metacontroller/pkg/controller/decorator.(*decoratorController).worker\n\t/root/parts/metacontroller/build/pkg/controller/decorator/controller.go:296\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/root/go/pkg/mod/k8s.io/apimachinery@v0.23.5/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/root/go/pkg/mod/k8s.io/apimachinery@v0.23.5/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/root/go/pkg/mod/k8s.io/apimachinery@v0.23.5/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/root/go/pkg/mod/k8s.io/apimachinery@v0.23.5/pkg/util/wait/wait.go:90\nmetacontroller/pkg/controller/decorator.(*decoratorController).Start.func1.1\n\t/root/parts/metacontroller/build/pkg/controller/decorator/controller.go:270"}

This is my model btw:

$ juju status
Model     Controller  Cloud/Region        Version  SLA          Timestamp
test-951  uk8s-345    microk8s/localhost  3.4.5    unsupported  16:49:41Z

App                      Version                Status  Scale  Charm                    Channel              Rev  Address         Exposed  Message
admission-webhook                               active      1  admission-webhook        latest/edge          342  10.152.183.208  no
kfp-profile-controller                          active      1  kfp-profile-controller   latest/edge/pr-539  1465  10.152.183.159  no
metacontroller-operator                         active      1  metacontroller-operator  latest/edge/pr-117   307  10.152.183.133  no
minio                    res:oci-image@5102166  active      1  minio                    latest/edge          342  10.152.183.243  no

Unit                        Workload  Agent  Address      Ports          Message
admission-webhook/0*        active    idle   10.1.60.143
kfp-profile-controller/0*   active    idle   10.1.60.139
metacontroller-operator/0*  active    idle   10.1.60.137
minio/0*                    active    idle   10.1.60.142  9000-9001/TCP

I left it running for more than three minutes and it is still in that state.

Contributor

@DnPlas DnPlas left a comment

After talking to @misohu, I confirmed that the error from the previous message (AuthorizationPolicy missing) is still present, but it is a different error from the one originally reported (PodDefault missing).

I can confirm that the PodDefault is actually created, which is what we care about for this fix.
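
For anyone scanning the metacontroller-operator logs for the same distinction, here is a minimal sketch of how the missing kind could be pulled out of an entry. It assumes every entry has the shape of the line quoted earlier (a timestamp, a space, then a JSON object with a `msg` field); the helper name `missing_kind` is hypothetical and not part of any charm or of metacontroller. An entry for the originally reported problem would presumably name PodDefault instead of AuthorizationPolicy.

```python
import json
import re

# Matches the discovery error text seen in the metacontroller-operator log.
MISSING_KIND = re.compile(r"can't find kind (\S+) in apiVersion (\S+)")


def missing_kind(log_line):
    """Return (kind, apiVersion) for a discovery error, or None otherwise."""
    # Assumption: each entry is "<timestamp> <JSON object>".
    _, _, payload = log_line.partition(" ")
    try:
        entry = json.loads(payload)
    except json.JSONDecodeError:
        return None
    if not isinstance(entry, dict):
        return None
    match = MISSING_KIND.search(entry.get("msg", ""))
    return match.groups() if match else None


# Shortened copy of the log entry above (stacktrace field omitted).
line = (
    '2024-07-30T16:42:26.736387761Z {"level":"error","ts":1722357746.735958,'
    '"msg":"failed to sync kubeflow-pipelines-profile-controller '
    "'v1:Namespace::test': can't reconcile children for Namespace /test: "
    "discovery: can't find kind AuthorizationPolicy in apiVersion "
    'security.istio.io/v1beta1\\n"}'
)
print(missing_kind(line))
```

Filtering on the returned kind makes it easy to check whether any remaining errors still concern PodDefault or only other resources.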

Thanks @misohu!

@DnPlas DnPlas merged commit 7d4de51 into main Jul 30, 2024
52 checks passed
@DnPlas DnPlas deleted the KF-5915-decrease-resyncPeriodSeconds branch July 30, 2024 18:51
misohu added a commit that referenced this pull request Jul 31, 2024
Backport: Decrease resyncPeriodSeconds to 1 min (#539)