Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to k8s 1.21.0 tool chain #3926

Merged
merged 9 commits into from
May 17, 2021
Merged

Update to k8s 1.21.0 tool chain #3926

merged 9 commits into from
May 17, 2021

Conversation

tamalsaha
Copy link
Contributor

@tamalsaha tamalsaha commented Apr 27, 2021

Fixes #3714

What this PR does / why we need it:
Update to k8s 1.21.0 tool chain

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #

Special notes for your reviewer:

Release note:

Updates Kubernetes libaries to v1.21.0

@jetstack-bot jetstack-bot added release-note-none Denotes a PR that doesn't merit a release note. dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. needs-kind Indicates a PR lacks a `kind/foo` label and requires one. area/deploy Indicates a PR modifies deployment configuration area/testing Issues relating to testing labels Apr 27, 2021
@jetstack-bot
Copy link
Contributor

Hi @tamalsaha. Thanks for your PR.

I'm waiting for a jetstack or cert-manager member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@jetstack-bot jetstack-bot added needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Apr 27, 2021
@tamalsaha
Copy link
Contributor Author

It seems that the internal api conversion generation is failing. Any idea how to fix this?

@irbekrm
Copy link
Contributor

irbekrm commented Apr 27, 2021

Thank you for the PR!
The failure comes from these lines for this and other APIs.

// FIXME: Provide conversion function to convert apismetav1.ObjectReference to meta.ObjectReference
compileErrorOnMissingConversion()

I think it needs those missing conversion functions implemented.

@tamalsaha
Copy link
Contributor Author

It seems like a k8s bug kubernetes/kubernetes#98380 . Those conversion functions exist but the code generator can't "find" them.

@irbekrm
Copy link
Contributor

irbekrm commented Apr 27, 2021

I see that it is all the meta types' conversion functions from this file that it cannot find.
I wonder why it is those in particular- the author of kubernetes/kubernetes#98380 says that they observe it for types containing an array of pointers to custom objects which does not seem to be the case here.

It seems like the compileOnErrorMissingConversion is added from here in conversion-gen.
I see that the compileErrorOnMissingConversion() was introduced with this commit - looking at our current code it looks like we were hitting that code path already before (i.e here)- it's just that the conversion-gen is now adding compile errors for that.

For comparison, the other custom type of which we have fields in other types is acme(i.e here) and conversion-gen seems to be able to find conversion functions for those types (I didn't see any compileOnErrorMissingConversion added).

@tamalsaha
Copy link
Contributor Author

I believe I have found the root cause of this regression. I have detailed it here kubernetes/kubernetes#101567 and proposed my fix.

This fix is working for our repos where conversion-gen was generating bad code. I tried to do the similar thing in this repo but my knowledge of bazel is limited. So, I could not test it. If you want to test the fix, you need to use my forked conversion-gen binary.

replace k8s.io/code-generator => github.com/kmodules/code-generator v0.21.1-rc.0.0.20210428003838-7eafae069eb0

replace k8s.io/gengo => github.com/kmodules/gengo v0.0.0-20210428002657-a8850da697c2

The problem is that when I run the make generate command proper replace directives are not add to the BUILD.bazel files. If you can add those, we can test this fix here also.

Thanks.

@irbekrm
Copy link
Contributor

irbekrm commented Apr 28, 2021

Thanks for looking into this!

./hack/update-all.sh updates the Bazel files (it will update the versions of gengo and code-generator here for Bazel to pull in).

I tried to run it against the fork of this PR, but still got the same compile errors added- I have not done yet done any extensive testing though.

@tamalsaha
Copy link
Contributor Author

@irbekrm
Copy link
Contributor

irbekrm commented Apr 28, 2021

The generated bazel file does not add replace directive

Do you mean that it doesn't add it when you run gazelle locally?

When I run ./hack/update-all.sh against your fork, I get the replace directives added to go_repository rules, see the diff

...
index 600b90510..b8c5030f2 100644
--- a/hack/build/repos.bzl
+++ b/hack/build/repos.bzl
@@ -3194,9 +3194,9 @@ def go_repositories():
         build_file_generation = "on",
         build_file_proto_mode = "disable",
         importpath = "k8s.io/code-generator",
-        replace = "k8s.io/code-generator",
-        sum = "h1:SQaysped4EtUDk3u1zphnUJiOAwFdhHx9xS3WKAE0x8=",
-        version = "v0.20.2",
+        replace = "github.com/kmodules/code-generator",
+        sum = "h1:dWTsct9tw9So5JuUvAIkIeOjXMlWN04inrf4afVSKug=",
+        version = "v0.21.1-rc.0.0.20210428003838-7eafae069eb0",
     )
     go_repository(
         name = "io_k8s_component_base",
@@ -3220,8 +3220,9 @@ def go_repositories():
         build_file_generation = "on",
         build_file_proto_mode = "disable",
         importpath = "k8s.io/gengo",
-        sum = "h1:JApXBKYyB7l9xx+DK7/+mFjC7A9Bt5A93FPvFD0HIFE=",
-        version = "v0.0.0-20201113003025-83324d819ded",
+        replace = "github.com/kmodules/gengo",
+        sum = "h1:3g7yW/ANlYc3HOCAak24ET66EuFyLQB72csFHAElqWk=",
+        version = "v0.0.0-20210428002657-a8850da697c2",
...

(Using Bazel v4.0.0)

@wallrj
Copy link
Member

wallrj commented Apr 28, 2021

/ok-to-test

@jetstack-bot jetstack-bot added ok-to-test and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Apr 28, 2021
@tamalsaha
Copy link
Contributor Author

Now blocked on #3935

@irbekrm
Copy link
Contributor

irbekrm commented Apr 29, 2021

Now blocked on #3935

I think the build currently breaking might be to do with google/gnostic bump v0.4 -> v0.5.

Here is the output of bazel build //... from this fork

external/io_k8s_apiserver/pkg/util/openapi/proto.go:43:36: cannot use info (type "gopkg.in/yaml.v2".MapSlice) as type *"gopkg.in/yaml.v3".Node in argument to openapi_v2.NewDocument
external/io_k8s_apiserver/pkg/util/openapi/proto.go:43:62: not enough arguments in call to compiler.NewContext
	have (string, nil)
	want (string, *"gopkg.in/yaml.v3".Node, *compiler.Context)
compilepkg: error running subcommand external/go_sdk/pkg/tool/linux_amd64/compile: exit status 2

It seems like there was a breaking change in gnostic in v0.5.0, see here that requires some action from clients.

kubernetes/apiserver is currently using gnostic v0.4.1, so it won't be compatible- seems like that might be causing this build to fail.

I see that it that it has been updated to v0.5.1 in kubernetes/kubernetes though, but has not synced yet? (here)

(bazel build //... succeeds on the current master).

@tamalsaha
Copy link
Contributor Author

@irbekrm , thanks for keeping company here :)

So, the code-generator issue is resolved. I see that it is working. But the conversion-gen needs things to be done in correct order because the generated code from one steps (meta and acme package) is used in the certmanager packgae.

Besides that there are a number of things breaking here:

  • Build is broken #3935 (internal package is used outside internal). Go complier does not like that.
  • I updated the sigs.k8s.io/controller-runtime to v0.9.0-beta.0 as that is the only one that uses k8s 1.21.0. But that is causing some more problems as the api is different.
  • gnostic - I have not seen that yet. Probably other errors are masquerading that. I am going to check that out.

@irbekrm
Copy link
Contributor

irbekrm commented Apr 29, 2021

#3935 (internal package is used outside internal). Go complier does not like that

Just a heads up that the latest CI failure for this PR is not from that ^. We only build with Bazel in CI, see #3935 (comment)

@jetstack-bot jetstack-bot added the area/acme Indicates a PR directly modifies the ACME Issuer code label Apr 29, 2021
@tamalsaha
Copy link
Contributor Author

/test pull-cert-manager-bazel

@tamalsaha
Copy link
Contributor Author

@irbekrm @wallrj could please take a look? I don't understand what the failed tests mean. I don't see anything specific in the logs.

@jetstack-bot jetstack-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 30, 2021
@irbekrm
Copy link
Contributor

irbekrm commented May 4, 2021

Hi, sorry for the delay.

I see that the unit tests in pkg/issuer/acme/http are failing, there's some more logs if you run them directly without Bazel.

Logs from running pkg/issuer/acme/http/ingress_test.go:TestGetIngressesForChallenge:

....
I0504 20:13:23.151298 2163903 reflector.go:203] Reflector from pkg/mod/k8s.io/client-go@v0.21.0/tools/cache/reflector.go:167 configured with expectedType of *unstructured.Unstructured with empty GroupVersionKind.
I0504 20:13:23.151321 2163903 reflector.go:219] Starting reflector *unstructured.Unstructured (10ms) from pkg/mod/k8s.io/client-go@v0.21.0/tools/cache/reflector.go:167
I0504 20:13:23.151329 2163903 reflector.go:255] Listing and watching *unstructured.Unstructured from pkg/mod/k8s.io/client-go@v0.21.0/tools/cache/reflector.go:167
E0504 20:13:23.151449 2163903 runtime.go:76] Observed a panic: coding error: you must register resource to list kind for every resource you're going to LIST when creating the client.  See NewSimpleDynamicClientWithCustomListKinds or register the list into the scheme: networking.istio.io/v1beta1, Resource=virtualservices out of map[]
goroutine 104 [running]:
k8s.io/apimachinery/pkg/util/runtime.logPanic(0x1af1ce0, 0xc00058e260)
...

I have not looked in depth into this yet, I see that the ...you must register resource to list kind for every resource... error was added with this commit to client-go.

A dynamic client was added to cert-manager with #3724 and I think the error might be thrown when we are waiting for the dynamic informer cache to sync here or somewhere around there- I think we will need to look into registering it as the error message says.

I have not yet looked at why the e2e tests are failing.

@irbekrm
Copy link
Contributor

irbekrm commented May 17, 2021

If we are merging this and including the change to switch to use MultiLock (which it seems like we are), then I think this is a requirement for 1.4 too so we are consistent :)

We chatted about this in the standup just now and decided on this approach. Do /hold if you disagree @munnerz :)

@munnerz
Copy link
Member

munnerz commented May 17, 2021

Happy to move ahead with switching to multi-lock - I just think we need to be sure to be consistent between our different binary components!

Thanks for all the research into this @irbekrm 😄

@jakexks
Copy link
Member

jakexks commented May 17, 2021

/lgtm
/test pull-cert-manager-bazel

@irbekrm
Copy link
Contributor

irbekrm commented May 17, 2021

Hi @tamalsaha,

Apologies, it looks like a fake secrets client that was added in a recent PR is now breaking the tests as its interface has changed in the bumped k8s api library.

Would it please be possible to update the fake client in test/unit/coreclients/secrets.go with:

 )
diff --git a/test/unit/coreclients/secrets.go b/test/unit/coreclients/secrets.go
index 419717740..d67af87a0 100644
--- a/test/unit/coreclients/secrets.go
+++ b/test/unit/coreclients/secrets.go
@@ -25,6 +25,7 @@ import (
        metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
        "k8s.io/apimachinery/pkg/types"
        "k8s.io/apimachinery/pkg/watch"
+       applyconfigurationscorev1 "k8s.io/client-go/applyconfigurations/core/v1"
        typedcorev1 "k8s.io/client-go/kubernetes/typed/core/v1"
 )
 
@@ -87,10 +88,15 @@ type fakeSecretClient struct {
        ListFn             func() (*corev1.SecretList, error)
        WatchFn            func() (watch.Interface, error)
        PatchFn            func() (*corev1.Secret, error)
+       ApplyFn            func() (*corev1.Secret, error)
        // Currently there is no need to mock this interface
        typedcorev1.SecretExpansion
 }
 
+func (f *fakeSecretClient) Apply(context.Context, *applyconfigurationscorev1.SecretApplyConfiguration, metav1.ApplyOptions) (*corev1.Secret, error) {
+       return f.ApplyFn()
+}
+

Doing this and then running ./hack/update-all.sh should fix it and get this merged 😅

@irbekrm
Copy link
Contributor

irbekrm commented May 17, 2021

Happy to move ahead with switching to multi-lock - I just think we need to be sure to be consistent between our different binary components!

Thanks for raising the issue with leader election @munnerz 😄 . My bad for initially suggesting that this change will swap ConfigMaps for Leases!
I have created an issue for changing the controller to use Multilock #4013 which we should do before v1.4

Signed-off-by: Tamal Saha <tamal@appscode.com>
Signed-off-by: Tamal Saha <tamal@appscode.com>
Signed-off-by: Tamal Saha <tamal@appscode.com>
Signed-off-by: Tamal Saha <tamal@appscode.com>
Signed-off-by: Tamal Saha <tamal@appscode.com>
Signed-off-by: Tamal Saha <tamal@appscode.com>
Signed-off-by: Tamal Saha <tamal@appscode.com>
Signed-off-by: Tamal Saha <tamal@appscode.com>
@jetstack-bot jetstack-bot removed the lgtm Indicates that a PR is ready to be merged. label May 17, 2021
@irbekrm
Copy link
Contributor

irbekrm commented May 17, 2021

Thanks @tamalsaha ! I think the latest error would be fixed by running ./hack/update-all.sh

Signed-off-by: Tamal Saha <tamal@appscode.com>
@irbekrm
Copy link
Contributor

irbekrm commented May 17, 2021

Woohoo 🥳

/lgtm

@jetstack-bot jetstack-bot added the lgtm Indicates that a PR is ready to be merged. label May 17, 2021
@tamalsaha
Copy link
Contributor Author

Thanks, @irbekrm ! What will get it merged?

@irbekrm
Copy link
Contributor

irbekrm commented May 17, 2021

It should get merged automatically by our CI once pull-cert-manager-e2e-v1-20 test passes. It's just re-testing after the /lgtm now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/acme Indicates a PR directly modifies the ACME Issuer code area/deploy Indicates a PR modifies deployment configuration area/testing Issues relating to testing dco-signoff: yes Indicates that all commits in the pull request have the valid DCO sign-off message. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm Indicates that a PR is ready to be merged. ok-to-test release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update k8s toolchain to 1.21.x
10 participants