Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Improvement] Enable set ServerLoadProtection fot Master/Worker #15439

Merged
merged 1 commit into from
Jan 14, 2024

Conversation

ruanwenjun
Copy link
Member

Purpose of the pull request

  • Refactor the master/ worker server overload logic.
  • Enable closing the overload protection
  • Support set jvmMemory/SystemMemory/cpuUsage/diskUsage protection.
  • Get the metrics from actuator.

Brief change log

Verify this pull request

This pull request is code cleanup without any test coverage.

(or)

This pull request is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(or)

If your pull request contain incompatible change, you should also add it to docs/docs/en/guide/upgrede/incompatible.md

@ruanwenjun ruanwenjun force-pushed the dev_wenjun_addServerProtection branch 4 times, most recently from a2f0c99 to 9f46edb Compare January 8, 2024 02:11
@codecov-commenter
Copy link

codecov-commenter commented Jan 8, 2024

Codecov Report

Attention: 119 lines in your changes are missing coverage. Please review.

Comparison is base (a405abe) 38.11% compared to head (246a274) 38.00%.

❗ Current head 246a274 differs from pull request most recent head 704e6fc. Consider uploading reports for the commit 704e6fc to get more accurate results

Files Patch % Lines
...rver/master/config/MasterServerLoadProtection.java 30.00% 13 Missing and 1 partial ⚠️
...ler/server/master/metrics/MasterServerMetrics.java 0.00% 12 Missing ⚠️
...rver/worker/config/WorkerServerLoadProtection.java 40.00% 9 Missing and 3 partials ⚠️
...eduler/server/master/task/MasterHeartBeatTask.java 0.00% 11 Missing ⚠️
...e/dolphinscheduler/server/master/MasterServer.java 0.00% 10 Missing ⚠️
...e/dolphinscheduler/server/worker/WorkerServer.java 0.00% 9 Missing ⚠️
...lphinscheduler/common/model/BaseHeartBeatTask.java 0.00% 8 Missing ⚠️
...inscheduler/alert/registry/AlertHeartbeatTask.java 0.00% 7 Missing ⚠️
...r/server/master/registry/MasterRegistryClient.java 14.28% 5 Missing and 1 partial ⚠️
...r/master/dispatch/host/LowerWeightHostManager.java 0.00% 5 Missing ⚠️
... and 12 more
Additional details and impacted files
@@             Coverage Diff              @@
##                dev   #15439      +/-   ##
============================================
- Coverage     38.11%   38.00%   -0.11%     
+ Complexity     4697     4686      -11     
============================================
  Files          1299     1304       +5     
  Lines         44777    44769       -8     
  Branches       4797     4798       +1     
============================================
- Hits          17067    17015      -52     
- Misses        25863    25904      +41     
- Partials       1847     1850       +3     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ruanwenjun ruanwenjun force-pushed the dev_wenjun_addServerProtection branch 3 times, most recently from d11ef47 to d18c963 Compare January 8, 2024 05:48
deploy/kubernetes/dolphinscheduler/values.yaml Outdated Show resolved Hide resolved
deploy/kubernetes/dolphinscheduler/values.yaml Outdated Show resolved Hide resolved
deploy/kubernetes/dolphinscheduler/values.yaml Outdated Show resolved Hide resolved
@ruanwenjun ruanwenjun force-pushed the dev_wenjun_addServerProtection branch from d18c963 to d6d30cc Compare January 8, 2024 06:53
@ruanwenjun ruanwenjun force-pushed the dev_wenjun_addServerProtection branch 3 times, most recently from 6408e09 to 0b51840 Compare January 8, 2024 08:58
@NonNull RegistryClient registryClient) {
super("MasterHeartBeatTask", masterConfig.getHeartbeatInterval().toMillis());
super("MasterHeartBeatTask", masterConfig.getMaxHeartbeatInterval().toMillis());

Check warning

Code scanning / CodeQL

Dereferenced variable may be null Warning

Variable
masterConfig
may be null at this access as suggested by
this
null guard.
@NonNull Supplier<Integer> workerWaitingTaskCount) {
super("WorkerHeartBeatTask", workerConfig.getHeartbeatInterval().toMillis());
@NonNull WorkerTaskExecutorThreadPool workerTaskExecutorThreadPool) {
super("WorkerHeartBeatTask", workerConfig.getMaxHeartbeatInterval().toMillis());

Check warning

Code scanning / CodeQL

Dereferenced variable may be null Warning

Variable
workerConfig
may be null at this access as suggested by
this
null guard.
@ruanwenjun ruanwenjun force-pushed the dev_wenjun_addServerProtection branch 3 times, most recently from 8f66358 to 91b0eca Compare January 8, 2024 10:50
@github-actions github-actions bot added the test label Jan 8, 2024
Copy link

sonarcloud bot commented Jan 8, 2024

Please retry analysis of this Pull-Request directly on SonarCloud

@ruanwenjun ruanwenjun force-pushed the dev_wenjun_addServerProtection branch 4 times, most recently from 364fae9 to 306f4ca Compare January 9, 2024 02:25
@ruanwenjun ruanwenjun force-pushed the dev_wenjun_addServerProtection branch from 306f4ca to 73270ee Compare January 9, 2024 06:11
}
if (systemMetrics.getSystemMemoryUsedPercentage() > maxSystemMemoryUsagePercentageThresholds) {
log.info(
"Worker OverLoad: the SystemMemoryUsedPercentage: {} is over then the MaxSystemMemoryUsagePercentageThresholds {}",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, typo.

@ruanwenjun ruanwenjun added the improvement make more easy to user or prompt friendly label Jan 10, 2024
@ruanwenjun ruanwenjun changed the title [Feature] Enable set ServerLoadProtection fot Master/Worker [Improvement] Enable set ServerLoadProtection fot Master/Worker Jan 10, 2024
import org.junit.jupiter.api.Assertions;
import org.junit.jupiter.api.Test;

class MasterServerLoadProtectionTest {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Comment on lines +31 to +34
.jvmMemoryUsedPercentage(0.71)
.systemMemoryUsedPercentage(0.71)
.totalCpuUsedPercentage(0.71)
.diskUsedPercentage(0.71)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
.jvmMemoryUsedPercentage(0.71)
.systemMemoryUsedPercentage(0.71)
.totalCpuUsedPercentage(0.71)
.diskUsedPercentage(0.71)
.jvmMemoryUsedPercentage(0.81)
.systemMemoryUsedPercentage(0.81)
.totalCpuUsedPercentage(0.81)
.diskUsedPercentage(0.81)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I change the default configuration to 0.7, so this test case doesn't need to change.

Copy link
Contributor

@caishunfeng caishunfeng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@caishunfeng caishunfeng added this to the 3.3.0 milestone Jan 12, 2024
@ruanwenjun ruanwenjun force-pushed the dev_wenjun_addServerProtection branch from bf49597 to 704e6fc Compare January 13, 2024 03:40
Copy link

sonarcloud bot commented Jan 13, 2024

Quality Gate Failed Quality Gate failed

Failed conditions

29.7% Coverage on New Code (required ≥ 60%)
C Reliability Rating on New Code (required ≥ A)

See analysis details on SonarCloud

idea Catch issues before they fail your Quality Gate with our IDE extension SonarLint SonarLint

@ruanwenjun ruanwenjun merged commit 7b9c9e0 into apache:dev Jan 14, 2024
58 of 87 checks passed
@ruanwenjun ruanwenjun deleted the dev_wenjun_addServerProtection branch January 14, 2024 14:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend document improvement make more easy to user or prompt friendly ready-to-merge test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants