Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

all: upgrade to k8s v1.24.15 #1398

Merged
merged 1 commit into from
Jun 25, 2023

Conversation

eahydra
Copy link
Member

@eahydra eahydra commented Jun 19, 2023

Ⅰ. Describe what this PR does

  • Upgrade to k8s 0.24.15 (v1.24+).
  • All tests passed.
  • When building components, you need to upgrade to golang 1.18 and above.

known compatibility issues:

  • The scheduler no longer supports insecure serving, and can only query metrics through the secure port. However, the insecure configuration is retained in this PR, and the IP and port to be bound need to be set through the command line parameters port and address.
  • The built-in volumebinding plugin of k8s scheduler framework depends on v1.CSIStorageCapacity , which will cause koord-scheduler to fail to run in v1.22 and earlier versions.
  • The k8s scheduler framework also Watch v1.CSIStorageCapacity, causing the same problem as above. The specific code can be found in pkg/scheduler/eventhandlers.go#L353
  • The k8s scheduler framework changed the PreFilter interface to support returning a list of schedulable nodes.
  • CRI v1alpha2.RuntimeServiceServer adds two new methods PodSandboxStats/ListPodSandboxStats

NOTE:

Compatibility issues brought by v1.CSIStorageCapacity will be more troublesome, which will involve the EventHandlers code of the scheduler framework. This place is not easy to modify, unless we maintain a scheduler framework code separately and do some special processing.

According to the discussion, the solution is as follows:

  • Intercept the Informer construction method of v1.CSIStorageCapacity and forge a custom Informer. This Informer will be responsible for converting v1beta1's CSIStorageCapacity to v1.CSIStorageCapacity.
  • Versions below k8s v1.22 that does not support v1beta1.CSIStorageCapacity or v1.CSIStorageCapacity, the users need to enable the feature-gate DisableCSIStorageCapacityInformer to disable sync the objects via informer.
  • The k8s v1.22 version that support v1beta1.CSIStorageCapacity, enable the feature-gate CompatibleCSIStorageCapacity to sync v1beta1.CSIStorageCapacity and tranform the objects to v1.CSIStorageCapacity.

Ⅱ. Does this pull request fix one issue?

Ⅲ. Describe how to verify it

Ⅳ. Special notes for reviews

V. Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests
  • All checks passed in make test

@codecov
Copy link

codecov bot commented Jun 19, 2023

Codecov Report

Patch coverage: 49.09% and project coverage change: -0.17 ⚠️

Comparison is base (5ad6e81) 64.75% compared to head (9bc2722) 64.58%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1398      +/-   ##
==========================================
- Coverage   64.75%   64.58%   -0.17%     
==========================================
  Files         333      334       +1     
  Lines       34297    34265      -32     
==========================================
- Hits        22209    22131      -78     
- Misses      10453    10521      +68     
+ Partials     1635     1613      -22     
Flag Coverage Δ
unittests 64.58% <49.09%> (-0.17%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
pkg/koordlet/util/system/system_file.go 66.66% <ø> (ø)
pkg/scheduler/frameworkext/scheduler_adapter.go 0.00% <0.00%> (ø)
.../plugins/compatibledefaultpreemption/preemption.go 77.50% <ø> (-7.87%) ⬇️
pkg/scheduler/plugins/coscheduling/coscheduling.go 70.90% <0.00%> (ø)
pkg/scheduler/plugins/elasticquota/preempt.go 1.91% <0.00%> (-38.82%) ⬇️
pkg/scheduler/plugins/elasticquota/plugin.go 57.14% <14.28%> (-9.07%) ⬇️
pkg/scheduler/frameworkext/framework_extender.go 63.85% <42.85%> (ø)
pkg/scheduler/frameworkext/informers.go 70.58% <70.58%> (ø)
pkg/scheduler/plugins/nodenumaresource/plugin.go 62.16% <75.00%> (ø)
pkg/scheduler/plugins/deviceshare/plugin.go 64.80% <100.00%> (ø)
... and 1 more

... and 4 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@eahydra eahydra force-pushed the update_k8s_to_124 branch 3 times, most recently from 905404d to 898b50b Compare June 20, 2023 04:08
@eahydra
Copy link
Member Author

eahydra commented Jun 20, 2023

Several methods to solve v1.CSIStorageCapacity's incompatibility with k8s v1.22 version:

  1. Intercept the Informer construction method of v1.CSIStorageCapacity and forge a custom Informer. This Informer will be responsible for converting v1beta1's CSIStorageCapacity to v1.CSIStorageCapacity.
  • Pros. is that it can be seamlessly compatible with versions 1.22 and earlier;
  • Cons. is that the implementation is tricky
  1. Fork k8s v1.24 code to koordinator-sh and modify Scheduler Framework related code. And use the fork version by replacing in the go.mod of koordinator.
  • Pros. is that the changes are clearer;
  • Cons. is that a separate Fork branch is maintained, and the maintenance cost is relatively high.

WDYT @hormes @FillZpp

@eahydra eahydra requested a review from ZiMengSheng June 20, 2023 08:29
@eahydra eahydra changed the title all: upgrade to k8s 0.24.15 all: upgrade to k8s v1.24.15 Jun 20, 2023
@FillZpp
Copy link
Member

FillZpp commented Jun 20, 2023

Several methods to solve v1.CSIStorageCapacity's incompatibility with k8s v1.22 version:

  1. Intercept the Informer construction method of v1.CSIStorageCapacity and forge a custom Informer. This Informer will be responsible for converting v1beta1's CSIStorageCapacity to v1.CSIStorageCapacity.
  • Pros. is that it can be seamlessly compatible with versions 1.22 and earlier;
  • Cons. is that the implementation is tricky
  1. Fork k8s v1.24 code to koordinator-sh and modify Scheduler Framework related code. And use the fork version by replacing in the go.mod of koordinator.
  • Pros. is that the changes are clearer;
  • Cons. is that a separate Fork branch is maintained, and the maintenance cost is relatively high.

WDYT @hormes @FillZpp

@eahydra Prefer #1. We could have a flag/featuregate or automatically discovery to decide whether v1 or v1 wrapper should be used.

@eahydra
Copy link
Member Author

eahydra commented Jun 20, 2023

Several methods to solve v1.CSIStorageCapacity's incompatibility with k8s v1.22 version:

  1. Intercept the Informer construction method of v1.CSIStorageCapacity and forge a custom Informer. This Informer will be responsible for converting v1beta1's CSIStorageCapacity to v1.CSIStorageCapacity.
  • Pros. is that it can be seamlessly compatible with versions 1.22 and earlier;
  • Cons. is that the implementation is tricky
  1. Fork k8s v1.24 code to koordinator-sh and modify Scheduler Framework related code. And use the fork version by replacing in the go.mod of koordinator.
  • Pros. is that the changes are clearer;
  • Cons. is that a separate Fork branch is maintained, and the maintenance cost is relatively high.

WDYT @hormes @FillZpp

@eahydra Prefer #1. We could have a flag/featuregate or automatically discovery to decide whether v1 or v1 wrapper should be used.

OK. According to the first method, it is controlled by featureGate.

@eahydra
Copy link
Member Author

eahydra commented Jun 20, 2023

cmd/koordlet/main.go Outdated Show resolved Hide resolved
@eahydra eahydra force-pushed the update_k8s_to_124 branch 2 times, most recently from c9b7b53 to 7f41524 Compare June 21, 2023 02:25
@eahydra eahydra requested a review from saintube June 21, 2023 02:25
Signed-off-by: Joseph <joseph.t.lee@outlook.com>
@ZiMengSheng
Copy link
Contributor

/lgtm

@koordinator-bot koordinator-bot bot added the lgtm label Jun 25, 2023
@eahydra eahydra added this to the v1.3 milestone Jun 25, 2023
Copy link
Member

@saintube saintube left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@hormes
Copy link
Member

hormes commented Jun 25, 2023

  • d earlier;

Prefer 1, the tricky part is for compatibility and is worth the cost.

@koordinator-bot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hormes

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@koordinator-bot koordinator-bot bot merged commit a4b2ed0 into koordinator-sh:main Jun 25, 2023
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants