Is there heartbeat for motan to check RPC connections? #986
Comments
The heartbeat mechanism in Motan is only used to probe unavailable server nodes, not to keep connections valid. TCP keepalive handles that, so net.ipv4.tcp_keepalive_time=1200 does affect the reconnection behavior. The code is in here. There are two ways to try:
@rayzhang0603 If we restart the service instance that the VIP was previously bound to, and then switch to the other instance, should we use motan-transport-netty4? In Motan version 1.1.10, com.weibo.api.motan.transport.AbstractSharedPoolClient#getChannel() contains the following logic:
My understanding is as follows; please correct me if I am wrong:
Yes, you can use the About AbstractSharedPoolClient, the first two points are correct. But In addition, the heartbeat mechanism is triggered by the unavailability of the client, not the unavailability of the connection. When the continuous failure of requests in the client reaches the
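To illustrate the distinction drawn in this reply (heartbeats start when the client as a whole is marked unavailable, not when a single connection breaks), here is a minimal sketch of a consecutive-failure counter. It is not Motan's actual code: the class `ClientAvailability`, its method names, and the `maxConsecutiveFailures` threshold are all hypothetical and chosen only for illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch, not Motan's implementation: a client is marked
// unavailable after N consecutive request failures, and only then does
// it become a candidate for heartbeat probing.
final class ClientAvailability {
    private final int maxConsecutiveFailures; // assumed threshold, name is illustrative
    private final AtomicInteger consecutiveFailures = new AtomicInteger();
    private volatile boolean available = true;

    ClientAvailability(int maxConsecutiveFailures) {
        this.maxConsecutiveFailures = maxConsecutiveFailures;
    }

    // Called after a failed request; flips the client to unavailable at the threshold.
    void onRequestFailure() {
        if (consecutiveFailures.incrementAndGet() >= maxConsecutiveFailures) {
            available = false;
        }
    }

    // Called after a successful request; a success breaks the failure streak.
    void onRequestSuccess() {
        consecutiveFailures.set(0);
    }

    // Called when a heartbeat probe succeeds; the client becomes usable again.
    void onHeartbeatSuccess() {
        consecutiveFailures.set(0);
        available = true;
    }

    boolean isAvailable() {
        return available;
    }

    // Heartbeats are only needed while the client is unavailable.
    boolean needsHeartbeat() {
        return !available;
    }
}
```

In this sketch a broken connection alone does not start heartbeats; only enough failed requests in a row do, which is why a stale connection can go unnoticed while no requests are being sent.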
@rayzhang0603 more questions again:
Sorry, the This exception log will be printed before the The heartbeat method does not need to be exported explicitly; the HeartMessageHandleWrapper handles it automatically.
@rayzhang0603
@rayzhang0603 Which can lead to the rebuild once? Thanks a lot in advance.
Dear all,
Here is our case:
We have no registry center and run 2 instances of the same service. A VIP is bound to one instance, and the client calls the service through the VIP. When the VIP is switched to the other instance, the client does not check the original RPC connection, so invocations fail. After about 20 minutes the connection recovers (the connections are rebuilt and invocations succeed).
So my questions are as follows:
BTW, we have net.ipv4.tcp_keepalive_time=1200 set. Is the recovery time related to that?
Thanks in advance.
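As a rough check on whether the 20-minute recovery lines up with the keepalive setting, the arithmetic can be sketched as below. Only the value 1200 comes from this thread. The probe interval (75 s) and probe count (9) used in the usage note are the usual Linux defaults for net.ipv4.tcp_keepalive_intvl and net.ipv4.tcp_keepalive_probes and are assumed here, since the actual values on the affected host were not given. The class and method names are illustrative.

```java
// Rough estimate of when Linux TCP keepalive notices a dead peer on an idle connection.
// Parameters mirror the sysctls net.ipv4.tcp_keepalive_time / _intvl / _probes.
final class KeepaliveEstimate {
    private KeepaliveEstimate() {}

    // Idle seconds before the first keepalive probe is sent.
    static long firstProbeSeconds(long keepaliveTime) {
        return keepaliveTime;
    }

    // Worst case: the peer never answers, so every probe has to time out.
    static long silentPeerSeconds(long keepaliveTime, long intvl, long probes) {
        return keepaliveTime + intvl * probes;
    }
}
```

With `firstProbeSeconds(1200)` the first probe goes out after 1200 s, i.e. 20 minutes of idle time. If the instance now holding the VIP answers that probe with a reset, the stale connection would be closed at about that point, which would match the observed recovery. If nothing answers at all, `silentPeerSeconds(1200, 75, 9)` gives 1875 s, a little over 31 minutes, under the assumed defaults.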