[Feature-7024] Add waiting strategy to support master/worker can recover from registry lost #11368

Merged: 4 commits, Aug 13, 2022

ruanwenjun (Member) commented Aug 9, 2022

Purpose of the pull request

Close #7024

Brief change log

  • Add ConnectStrategy to handle registry connection disconnect and reconnect events
  • Add stop and waiting strategies
  • Replace Stopper with ServerLifeCycleManager
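
For readers who have not opened the diff, here is a rough sketch of how a stop strategy and a waiting strategy could differ when the registry connection drops. It is not the code merged in this PR. Only the name ConnectStrategy comes from the change log above; the method names, `SketchStatus`, `SketchLifeCycle`, and the pause/resume callbacks are assumptions made up for illustration.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical status values, assumed for this sketch only.
enum SketchStatus { RUNNING, WAITING, STOPPED }

/** Hypothetical holder of the server status that guards state transitions. */
final class SketchLifeCycle {
    private final AtomicReference<SketchStatus> status =
            new AtomicReference<>(SketchStatus.RUNNING);

    SketchStatus status() {
        return status.get();
    }

    /** RUNNING -> WAITING; returns false if the server is not running. */
    boolean toWaiting() {
        return status.compareAndSet(SketchStatus.RUNNING, SketchStatus.WAITING);
    }

    /** WAITING -> RUNNING; returns false if the server is not waiting. */
    boolean recoverFromWaiting() {
        return status.compareAndSet(SketchStatus.WAITING, SketchStatus.RUNNING);
    }

    /** Any state -> STOPPED; no transition in this class leaves STOPPED. */
    void toStopped() {
        status.set(SketchStatus.STOPPED);
    }
}

/** Callbacks for registry connection loss and recovery (method names assumed). */
interface ConnectStrategy {
    void disconnect();

    void reconnect();
}

/** Stop strategy: give up as soon as the registry connection is lost. */
final class StopStrategy implements ConnectStrategy {
    private final SketchLifeCycle lifeCycle;

    StopStrategy(SketchLifeCycle lifeCycle) {
        this.lifeCycle = lifeCycle;
    }

    @Override
    public void disconnect() {
        lifeCycle.toStopped();
    }

    @Override
    public void reconnect() {
        // Nothing to do: the server has already stopped.
    }
}

/** Waiting strategy: pause work on disconnect, resume it on reconnect. */
final class WaitingStrategy implements ConnectStrategy {
    private final SketchLifeCycle lifeCycle;
    private final Runnable pauseWork;   // hypothetical hook, e.g. stop accepting tasks
    private final Runnable resumeWork;  // hypothetical hook, e.g. re-register and resume

    WaitingStrategy(SketchLifeCycle lifeCycle, Runnable pauseWork, Runnable resumeWork) {
        this.lifeCycle = lifeCycle;
        this.pauseWork = pauseWork;
        this.resumeWork = resumeWork;
    }

    @Override
    public void disconnect() {
        // Pause only on the first RUNNING -> WAITING transition.
        if (lifeCycle.toWaiting()) {
            pauseWork.run();
        }
    }

    @Override
    public void reconnect() {
        // Resume only if the server was actually waiting.
        if (lifeCycle.recoverFromWaiting()) {
            resumeWork.run();
        }
    }
}
```

In this sketch the compare-and-set transitions make repeated disconnect events harmless, and a server that has already stopped is never moved back to running by a late reconnect event.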

Verify this pull request

This pull request is code cleanup without any test coverage.

(or)

This pull request is already covered by existing tests, such as (please describe tests).

(or)

This change added tests and can be verified as follows:

(or)

If your pull request contains an incompatible change, you should also add it to docs/docs/en/guide/upgrade/incompatible.md

@SbloodyS SbloodyS added the feature new feature label Aug 9, 2022
@SbloodyS SbloodyS added this to the 3.1.0 milestone Aug 9, 2022
@ruanwenjun ruanwenjun force-pushed the dev_wenjun_addWaitingStrategy branch from e6bca04 to 2782747 Compare August 9, 2022 07:26
@ruanwenjun ruanwenjun force-pushed the dev_wenjun_addWaitingStrategy branch 4 times, most recently from 4f60974 to 7c8c4b9 Compare August 10, 2022 03:37
codecov-commenter commented Aug 10, 2022

Codecov Report

Merging #11368 (7bb2337) into dev (496c2d4) will decrease coverage by 0.21%.
The diff coverage is 22.69%.

❗ Current head 7bb2337 differs from pull request most recent head cf93ac0. Consider uploading reports for the commit cf93ac0 to get more accurate results

@@             Coverage Diff              @@
##                dev   #11368      +/-   ##
============================================
- Coverage     39.43%   39.22%   -0.22%     
  Complexity     4622     4622              
============================================
  Files           980      987       +7     
  Lines         37260    37543     +283     
  Branches       4176     4178       +2     
============================================
+ Hits          14695    14727      +32     
- Misses        21031    21279     +248     
- Partials       1534     1537       +3     
Impacted Files Coverage Δ
...che/dolphinscheduler/alert/AlertSenderService.java 49.64% <0.00%> (ø)
...org/apache/dolphinscheduler/alert/AlertServer.java 51.28% <0.00%> (ø)
...ler/common/lifecycle/ServerLifeCycleException.java 0.00% <0.00%> (ø)
...duler/common/lifecycle/ServerLifeCycleManager.java 0.00% <0.00%> (ø)
...olphinscheduler/common/lifecycle/ServerStatus.java 0.00% <0.00%> (ø)
...e/dolphinscheduler/server/master/MasterServer.java 0.00% <0.00%> (ø)
...inscheduler/server/master/config/MasterConfig.java 0.00% <0.00%> (ø)
...ver/master/consumer/TaskPriorityQueueConsumer.java 0.00% <0.00%> (ø)
...master/dispatch/executor/NettyExecutorManager.java 0.00% <ø> (ø)
...eduler/server/master/event/WorkflowEventQueue.java 0.00% <0.00%> (ø)
... and 30 more


@ruanwenjun ruanwenjun force-pushed the dev_wenjun_addWaitingStrategy branch 7 times, most recently from ffdc43f to cf93ac0 Compare August 11, 2022 05:46
caishunfeng (Contributor) left a comment

@ruanwenjun Please add/update the docs.

@ruanwenjun ruanwenjun force-pushed the dev_wenjun_addWaitingStrategy branch from 0ddea11 to 9cf8392 Compare August 12, 2022 02:04
@github-actions github-actions bot removed the Python label Aug 12, 2022
sonarcloud bot commented Aug 12, 2022

SonarCloud Quality Gate failed.

Bugs: 0 (rating A)
Vulnerabilities: 0 (rating A)
Security Hotspots: 0 (rating A)
Code Smells: 19 (rating A)

Coverage: 17.5%
Duplication: 4.9%

ruanwenjun (Member, Author) commented Aug 12, 2022

@zhongjiajie This PR includes some doc changes, please take a look.

caishunfeng (Contributor) left a comment

LGTM

@ruanwenjun ruanwenjun merged commit 7ff34c3 into apache:dev Aug 13, 2022
@ruanwenjun ruanwenjun deleted the dev_wenjun_addWaitingStrategy branch August 13, 2022 01:52
ruanwenjun added a commit to ruanwenjun/dolphinscheduler that referenced this pull request Sep 7, 2022
…ver from registry lost (apache#11368)

* Add waiting strategy to support master/worker can recover from registry lost

* throw exception when zookeeper registry start failed due to interrupted

(cherry picked from commit 7ff34c3)
xdu-chenrj pushed a commit to xdu-chenrj/dolphinscheduler that referenced this pull request Oct 13, 2022
…ver from registry lost (apache#11368)

* Add waiting strategy to support master/worker can recover from registry lost

* throw exception when zookeeper registry start failed due to interrupted
Successfully merging this pull request may close these issues.

[Improvement][MasterWorker] Self-recovery when master or worker lost connection from registry center