fix(scheduler): prevents scheduling new replicas to terminating clusters #263
Conversation
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #263      +/-   ##
==========================================
+ Coverage   28.59%   28.71%   +0.11%
==========================================
  Files         113      113
  Lines       13549    13570      +21
==========================================
+ Hits         3875     3896      +21
  Misses       9292     9292
  Partials      382      382
```
Force-pushed from 2219b67 to 026c6d6.
```go
	ObjectMeta: metav1.ObjectMeta{
		Name: clusterName,
	},
}
if terminating {
	cluster.DeletionTimestamp = &metav1.Time{Time: time.Now()}
}
```
nit (no need to change it for this PR): could simplify this to `metav1.Now`
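For reference, the suggested simplification might look like the sketch below; since `metav1.Now()` returns a `metav1.Time` value rather than a pointer, a local variable is still needed to take its address.

```go
// Equivalent to &metav1.Time{Time: time.Now()}, using the apimachinery helper.
if terminating {
	now := metav1.Now()
	cluster.DeletionTimestamp = &now
}
```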
```go
	su *framework.SchedulingUnit,
	clusters []*fedcorev1a1.FederatedCluster,
) {
	for _, cluster := range clusters {
```
Should we run this regardless of scheduling mode?
The current scheduler does not filter out clusters that are being deleted, because it may be necessary to retain resources in member clusters. As a result, terminating clusters still participate in `filter` and `score`, and the scheduler may schedule new replicas to a terminating cluster. The sync controller, however, will not dispatch resources to terminating clusters, so the workload fails to scale up in the federation.
This PR ensures that a terminating cluster will not scale up by setting `MaxReplicas` for that cluster. When `MaxClusters` is set at the same time, the scheduler will also give priority to non-terminating clusters when scheduling is triggered. A sketch of the capping idea follows.
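To make the mechanism concrete, here is a minimal, self-contained sketch under stated assumptions: `Cluster`, `SchedulingUnit`, `CurrentReplicas`, and `MaxReplicas` below are simplified stand-ins, not the real `fedcorev1a1.FederatedCluster` or `framework.SchedulingUnit` types, and this is not the PR's exact code.

```go
package main

import (
	"fmt"
	"time"
)

// Simplified stand-ins for the scheduler's real types.
type Cluster struct {
	Name              string
	DeletionTimestamp *time.Time // non-nil means the cluster is terminating
}

type SchedulingUnit struct {
	CurrentReplicas map[string]int64 // replicas currently placed per cluster
	MaxReplicas     map[string]int64 // per-cluster ceiling honored by the scheduler
}

// capTerminatingClusters pins MaxReplicas of every terminating cluster to its
// current replica count, so the scheduler can scale it down but never up.
func capTerminatingClusters(su *SchedulingUnit, clusters []*Cluster) {
	for _, cluster := range clusters {
		if cluster.DeletionTimestamp == nil {
			continue // healthy cluster: leave it uncapped
		}
		if su.MaxReplicas == nil {
			su.MaxReplicas = make(map[string]int64)
		}
		su.MaxReplicas[cluster.Name] = su.CurrentReplicas[cluster.Name]
	}
}

func main() {
	now := time.Now()
	su := &SchedulingUnit{CurrentReplicas: map[string]int64{"cluster-a": 3, "cluster-b": 2}}
	clusters := []*Cluster{
		{Name: "cluster-a"},
		{Name: "cluster-b", DeletionTimestamp: &now},
	}
	capTerminatingClusters(su, clusters)
	fmt.Println(su.MaxReplicas) // map[cluster-b:2]: cluster-b may no longer scale up
}
```

Capping instead of filtering preserves the replicas already running on the terminating cluster (resources can still be retained in member clusters), while guaranteeing that no new replicas land there.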