Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

✨ Update to Kubernetes 1.30 #3140

Merged
merged 27 commits into from
Jul 29, 2024
Merged

✨ Update to Kubernetes 1.30 #3140

merged 27 commits into from
Jul 29, 2024

Conversation

embik
Copy link
Member

@embik embik commented Jul 15, 2024

Summary

This updates kcp to Kubernetes 1.30.0. The corresponding branch for our Kubernetes fork is kcp-1.30. Mainly this PR adds new APIs or removes feature gates where they have been removed by upstream.

The bigger changes in this PR are:

  • Updating to the new ValidatingAdmissionPolicy plugin infrastructure.
  • Rewrite the OpenAPI controller we recently added to accommodate for new upstream types.

The following bigger TODOs exist, but I would like to tackle them separately:

  • Look at enabling the flags I've disabled in this PR, most notably --authentication-config, which seems like a hugely useful feature for a multi-tenant API server like KCP.

Related issue(s)

Fixes #3068

Release Notes

Update to Kubernetes 1.30.0

@embik embik added this to the 1.30-rebase milestone Jul 15, 2024
@embik embik requested review from xrstf, sttts and mjudeikis July 15, 2024 10:16
@kcp-ci-bot kcp-ci-bot added release-note Denotes a PR that will be considered when it comes time to generate release notes. dco-signoff: yes Indicates the PR's author has signed the DCO. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jul 15, 2024
@mjudeikis
Copy link
Contributor

🥳 🔥 🎆

@kcp-ci-bot kcp-ci-bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Jul 15, 2024
@embik
Copy link
Member Author

embik commented Jul 15, 2024

/retest

@mjudeikis
Copy link
Contributor

/lgtm
/approve
/hold
@sttts wanna read through or fix on master? :)

@kcp-ci-bot kcp-ci-bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm Indicates that a PR is ready to be merged. labels Jul 15, 2024
@kcp-ci-bot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 2c7ea837cae12d637bf20e30cd8bef62fb6bd5bc

@kcp-ci-bot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mjudeikis

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@kcp-ci-bot kcp-ci-bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jul 15, 2024
pkg/admission/plugins.go Outdated Show resolved Hide resolved
pkg/openapi/value.go Outdated Show resolved Hide resolved
@@ -261,6 +272,16 @@ var BuiltInAPIs = []internalapis.InternalAPI{
Instance: &admissionregistrationv1alpha1.ValidatingAdmissionPolicyBinding{},
ResourceScope: apiextensionsv1.ClusterScoped,
},
{
Names: apiextensionsv1.CustomResourceDefinitionNames{
Plural: "validatingadmissionpolicybindings",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder whether we better want a quick e2e test that checks that the policies work as expected. Namely, the should of course apply to objects in the same logical cluster, but also next to an APIExport? This would mean we have to replicate them via the cache server.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I definitely want an e2e test for this (there's a note on it in the PR description), but I didn't want to squeeze it into this PR. Do you think it would be better to do this immediately?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the API is enabled, we should test it. We can disable it and postpone to followup.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test has been added. It was in fact not working, I had to fix things on k8s and kcp side. Key commits:

Comment on lines 110 to 112
policy, err = kubeClusterClient.Cluster(paths[0]).AdmissionregistrationV1().ValidatingAdmissionPolicies().Create(ctx, policy, metav1.CreateOptions{})

require.NoError(t, err, "failed to create ValidatingAdmissionPolicy")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
policy, err = kubeClusterClient.Cluster(paths[0]).AdmissionregistrationV1().ValidatingAdmissionPolicies().Create(ctx, policy, metav1.CreateOptions{})
require.NoError(t, err, "failed to create ValidatingAdmissionPolicy")
policy, err = kubeClusterClient.Cluster(paths[0]).AdmissionregistrationV1().ValidatingAdmissionPolicies().Create(ctx, policy, metav1.CreateOptions{})
require.NoError(t, err, "failed to create ValidatingAdmissionPolicy")

// Policies take a second to compile. So far there doesn't seem to be a way
// to query if a policy has been successfully processed and is active, so we
// just sleep for a few seconds.
time.Sleep(5 * time.Second)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

timeouts always break eventually. Better just repeat the check below until the error appears.

}

// badCowboy violates the policy, so it should be rejected
t.Logf("Creating bad cowboy resource in first logical cluster")
Copy link
Member

@sttts sttts Jul 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style: combine both Log and comment. Having both is noisy.

t.Logf("Creating bad cowboy in first workspace should fail")

@sttts
Copy link
Member

sttts commented Jul 25, 2024

Modulo the e2e style comments, lgtm.

@embik embik force-pushed the k8s-1.30 branch 2 times, most recently from 9d2ab4d to 291c41d Compare July 25, 2024 13:21
@embik
Copy link
Member Author

embik commented Jul 25, 2024

Looks like the failing tests aren't a flake:

maximalpermissionpolicy_authorizer_test.go:232: 
        	Error Trace:	/home/prow/go/src/github.com/kcp-dev/kcp/test/e2e/apibinding/maximalpermissionpolicy_authorizer_test.go:404
        	            				/home/prow/go/src/github.com/kcp-dev/kcp/test/e2e/apibinding/maximalpermissionpolicy_authorizer_test.go:232
        	Error:      	Received unexpected error:
        	            	cowboys.wildwest.dev "cowboy-e2e-workspace-fhj5z" is forbidden: ValidatingAdmissionPolicy 'policy-4lr9q' with binding 'binding-4vxvf' denied request: expression 'object.spec.intent != 'bad'' resulted in error: no such key: intent
        	Test:       	TestMaximalPermissionPolicyAuthorizer
        	Messages:   	error creating cowboy in consumer workspace "root:e2e-workspace-jmr78:e2e-workspace-fhj5z"

This isn't always triggered, but it needs to be addressed.

@embik
Copy link
Member Author

embik commented Jul 29, 2024

Addressed the previous error by reworking the internal informer/lister built by the generic policy plugin framework. Should work now. 🤞🏻

@embik embik requested a review from sttts July 29, 2024 10:33
@@ -136,7 +141,8 @@ func (k *KubeValidatingAdmissionPolicy) Validate(ctx context.Context, a admissio
return err
}

return delegate.Validate(ctx, a, o)
err = delegate.Validate(ctx, a, o)
return err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: the previous was fine

plugin.SetSourceFactory(func(_ informers.SharedInformerFactory, client kubernetes.Interface, dynamicClient dynamic.Interface, restMapper meta.RESTMapper, clusterName logicalcluster.Name) generic.Source[validating.PolicyHook] {
return generic.NewPolicySource(
k.globalKubeSharedInformerFactory.Admissionregistration().V1().ValidatingAdmissionPolicies().Informer().Cluster(clusterName),
k.globalKubeSharedInformerFactory.Admissionregistration().V1().ValidatingAdmissionPolicyBindings().Informer().Cluster(clusterName),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you add replication of policies? I would have expected this to fail if not. The local informers have the local shard objects.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/kcp-dev/kcp/pull/3140/files#diff-56ef0431fd7b0e03ab56df4aba923c02c16a6e523bd27230cd3aa856203c5f9eR228

This is where replication is configured. Through InstallIndexers, this is is wired into (s *Server) addIndexersToInformers, which returns the list of replicated GVRs. This is a relatively recent change made in #3139, which was necessary to that informer installation doesn't create a race condition on leader election changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack

},
}
policy, err = kubeClusterClient.Cluster(ws1Path).AdmissionregistrationV1().ValidatingAdmissionPolicies().Create(ctx, policy, metav1.CreateOptions{})

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit/style: no empty line before error checks.

return true
}
}
t.Logf("Unexpected error when trying to create bad cowboy: %s", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: a success is not unexpected while waiting for the policy to apply.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is within a check for err != nil, so it only logs any error that's not the policy rejection. If resource creation succeeds, it just returns false but doesn't log this line.

…add e2e conformance test

Signed-off-by: Marvin Beckers <marvin@kubermatic.com>
Signed-off-by: Marvin Beckers <marvin@kubermatic.com>
Signed-off-by: Marvin Beckers <marvin@kubermatic.com>
@embik
Copy link
Member Author

embik commented Jul 29, 2024

/retest

flake.

@sttts
Copy link
Member

sttts commented Jul 29, 2024

/lgtm

@kcp-ci-bot kcp-ci-bot added the lgtm Indicates that a PR is ready to be merged. label Jul 29, 2024
@kcp-ci-bot
Copy link
Contributor

LGTM label has been added.

Git tree hash: 4e71586e672cb06c291d6f5a0b9ff4367f665ba1

@embik
Copy link
Member Author

embik commented Jul 29, 2024

/hold cancel

@kcp-ci-bot kcp-ci-bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 29, 2024
@kcp-ci-bot kcp-ci-bot merged commit d3e6d82 into kcp-dev:main Jul 29, 2024
19 checks passed
@embik embik deleted the k8s-1.30 branch July 29, 2024 14:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. dco-signoff: yes Indicates the PR's author has signed the DCO. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API lgtm Indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

epic: Kubernetes 1.30
4 participants