Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use ExecutorService instead of event loop for Netty connection #13904

Merged
merged 15 commits into from
Apr 7, 2024

Conversation

hanpen24
Copy link
Contributor

This change addresses the issue of the event loop being blocked for an extended period, improving overall performance and responsiveness.

What is the purpose of the change

This PR addresses the issue #13853 where connectionClient.doConnect() was blocking the event loop for up to 3 seconds.
To resolve this, the connection process has been moved to a separate thread using an ExecutorService.

Brief changelog

Changes:

  • Removed the scheduling of connectionClient.doConnect() on the event loop.
  • Added submission of connectionClient.doConnect() to an ExecutorService to execute in a separate thread.

Verifying this change

Checklist

  • Make sure there is a GitHub_issue field for the change (usually before you start working on it). Trivial changes like typos do not require a GitHub issue. Your pull request should address just this issue, without pulling in other changes - one PR resolves one issue.
  • Each commit in the pull request should have a meaningful subject line and body.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Check if is necessary to patch to Dubbo 3 if you are work on Dubbo 2.7
  • Write necessary unit-test to verify your logic correction, more mock a little better when cross module dependency exist. If the new feature or significant change is committed, please remember to add sample in dubbo samples project.
  • Add some description to dubbo-website project if you are requesting to add a feature.
  • GitHub Actions works fine on your own branch.
  • If this contribution is large, please follow the Software Donation Guide.

@AlbumenJ AlbumenJ requested a review from icodening March 12, 2024 09:40
@AlbumenJ
Copy link
Member

image
Please fix failed case

@hanpen24 hanpen24 marked this pull request as draft March 12, 2024 10:00
@hanpen24
Copy link
Contributor Author

Ok. After fixing failed case, I'll make this ready.

},
1L,
TimeUnit.SECONDS);
executor.submit(() -> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, please create a new isolated executor to reconnect

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much.
I resolved the error by defining the ExecutorService within the method. However, a new error has occurred, so I will check it.
Considering the current modification, instantiating it inside the method might be costly. Would it be better to have an instance of the ExecutorRepository class as a field in the NettyConnectionClient class?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. We need to maintain this executor inExecutorRepository

@hanpen24 hanpen24 marked this pull request as ready for review March 13, 2024 15:00
@hanpen24 hanpen24 marked this pull request as draft March 13, 2024 15:28
@codecov-commenter
Copy link

codecov-commenter commented Mar 17, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 70.42%. Comparing base (2f6b4c0) to head (1ba41c1).
Report is 25 commits behind head on 3.2.

Additional details and impacted files
@@            Coverage Diff             @@
##              3.2   #13904      +/-   ##
==========================================
+ Coverage   70.39%   70.42%   +0.03%     
==========================================
  Files        1607     1607              
  Lines       70070    70094      +24     
  Branches    10098    10102       +4     
==========================================
+ Hits        49327    49367      +40     
+ Misses      16116    16084      -32     
- Partials     4627     4643      +16     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@@ -90,6 +95,8 @@ protected void initConnectionClient() {
this.closePromise = new DefaultPromise<>(GlobalEventExecutor.INSTANCE);
this.init = new AtomicBoolean(false);
this.increase();
this.isReconnecting = new AtomicBoolean(false);
scheduledExecutorService = Executors.newSingleThreadScheduledExecutor();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This executor should be managed by org.apache.dubbo.common.threadpool.manager.FrameworkExecutorRepository

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the suggestion. I will make the necessary modifications.

@hanpen24 hanpen24 marked this pull request as ready for review March 19, 2024 10:01
@hanpen24
Copy link
Contributor Author

@icodening @AlbumenJ
I have made the revisions. Could you please review them

@@ -374,7 +385,7 @@ public void operationComplete(ChannelFuture future) {
"Failed to connect to server: " + getConnectAddress());
}
},
1L,
1,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please sync with org.apache.dubbo.remoting.exchange.support.header.HeaderExchangeClient#calculateReconnectDuration to get the duration

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlbumenJ
Thank you for your review.
I have modified the code to get the idle time before reconnection. By default, it will now wait for 60000ms.

public AbstractClient(URL url, ChannelHandler handler) throws RemotingException {
super(url, handler);
// set default needReconnect true when channel is not connected
needReconnect = url.getParameter(Constants.SEND_RECONNECT_KEY, true);

applicationModel = url.getOrDefaultApplicationModel();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Directly get Framework Model here would be better

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AlbumenJ
Thank you for your review.
I've made the adjustments based on your suggestions. Could you please check if it aligns with your expectations?

@@ -138,6 +138,7 @@ void connectSyncTest() throws Throwable {

nettyPortUnificationServer.bind();
// auto reconnect
Thread.sleep(60000);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use awaitility to verify

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've made the adjustments that use awaitility.

Copy link

@AlbumenJ
Copy link
Member

@icodening PTAL

Copy link
Contributor

@icodening icodening left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@AlbumenJ
Copy link
Member

@EarthChen PTAL

Copy link
Member

@EarthChen EarthChen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@EarthChen EarthChen merged commit f03823e into apache:3.2 Apr 7, 2024
19 checks passed
Will-well pushed a commit to Will-well/dubbo that referenced this pull request Apr 9, 2024
…e#13904)

* Use ExecutorService instead of event loop for Netty connection

This change addresses the issue of the event loop being blocked for an extended period, improving overall performance and responsiveness.

* fix log getConnectAddress

* applay format

* applay format

* applay format

* applay format

* applay format

* Use an independent ExecutorService

* Use ScheduledExecutorService for scheduling tasks

* delete unnecessary files

* Modify to stop ExecutorService using shutdownNow

* Modify to use ScheduledExecutor managed by FrameworkExecutorRepository

* Synchronize reconnectDuration with HeaderExchangeClient's reconnectDuration

* get framework model directly

use awaitility
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants