Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[GrpcClient] The thread name may be incorrect after reconect to the next server #9954

Closed
lpaz010 opened this issue Feb 15, 2023 · 6 comments · Fixed by #10004
Closed

[GrpcClient] The thread name may be incorrect after reconect to the next server #9954

lpaz010 opened this issue Feb 15, 2023 · 6 comments · Fixed by #10004
Labels
kind/discussion Category issues related to discussion

Comments

@lpaz010
Copy link
Contributor

lpaz010 commented Feb 15, 2023

描述
目前develop分支中GrpcClient#createGrpcExecutor在创建线程池grpcExecutor时,可以传入serverIp当作线程名的一部分;当GRPC客户端发生reconect时选取下一个地址时grpcExecutor不会再次初始化,所以线程名没有变,实际上serverIp可能已经变了。

可能存在问题
由于实际serverIp已经变了,线程名上的serverIp还体现了第一次连接的地址上。

代码片段

    // GrpcClient
    @Override
    public Connection connectToServer(ServerInfo serverInfo) {
        try {
            if (grpcExecutor == null) {
                this.grpcExecutor = createGrpcExecutor(serverInfo.getServerIp());
            }
        ...省略
    }
    // RpcClient
    protected void reconnect(final ServerInfo recommendServerInfo, boolean onRequestFail) {
            ...省略
            // 1.get a new server
            ServerInfo serverInfo = null;
            try {
                serverInfo = recommendServer.get() == null ? nextRpcServer() : recommendServer.get();
                // 2.create a new channel to new server
                Connection connectionNew = connectToServer(serverInfo);
        ...省略
    }
@lpaz010
Copy link
Contributor Author

lpaz010 commented Feb 16, 2023

描述
BaseGrpcServer的配置中,相关默认时间的单位不统一,即存在纳秒,也存在毫秒。

可能存在问题
对Grpc服务端进行相关时间的配置时,运行结果不符合预期。

代码片段
GrpcServerConstants中第96~100行,这些默认值的单位是纳秒

static final long DEFAULT_GRPC_KEEP_ALIVE_TIME = GrpcUtil.DEFAULT_SERVER_KEEPALIVE_TIME_NANOS;
        
static final long DEFAULT_GRPC_KEEP_ALIVE_TIMEOUT = GrpcUtil.DEFAULT_SERVER_KEEPALIVE_TIMEOUT_NANOS;
        
static final long DEFAULT_GRPC_PERMIT_KEEP_ALIVE_TIME = TimeUnit.MINUTES.toNanos(5L);

BaseGrpcServer中第76~78行在初始化时,单位是毫秒

    @Override
    public void startServer() throws Exception {
        final MutableHandlerRegistry handlerRegistry = new MutableHandlerRegistry();
        
        addServices(handlerRegistry, new GrpcConnectionInterceptor());
        
        server = ServerBuilder.forPort(getServicePort()).executor(getRpcExecutor())
                .maxInboundMessageSize(getMaxInboundMessageSize()).fallbackHandlerRegistry(handlerRegistry)
                .compressorRegistry(CompressorRegistry.getDefaultInstance())
                .decompressorRegistry(DecompressorRegistry.getDefaultInstance())
                .addTransportFilter(new AddressTransportFilter(connectionManager))
                .keepAliveTime(getKeepAliveTime(), TimeUnit.MILLISECONDS) // here 
                .keepAliveTimeout(getKeepAliveTimeout(), TimeUnit.MILLISECONDS) // here 
                .permitKeepAliveTime(getPermitKeepAliveTime(), TimeUnit.MILLISECONDS) // here 
                .build();
        
        server.start();
    }

@lpaz010
Copy link
Contributor Author

lpaz010 commented Feb 16, 2023

对于上面的议题,是否可以通过如下来解决:
Q1:线程名字是否去掉serverIp的拼接,保持与原老的版本一致
Q2:默认时间的单位统一,设置默认单位为毫秒

@KomachiSion
Copy link
Collaborator

Q1 可能是为了方便排查,需要讨论下是否修改。
Q2 grpc定义的默认值为纳秒,Nacos的是毫秒, 这里可能需要统一成毫秒。

@KomachiSion KomachiSion added the kind/discussion Category issues related to discussion label Feb 20, 2023
@lpaz010
Copy link
Contributor Author

lpaz010 commented Feb 20, 2023

@pixystone 老师,您的看法呢

@pixystone
Copy link
Contributor

pixystone commented Feb 21, 2023

这里名称主要是为了方便Server节点之间通信的client方便排查问题,因为他们IP不会变。
可以有两种方案:

  1. 单独区分出来,仅Server间使用的client带上serverIp
  2. 所有的client在connect和reconnect时使用的线程单独增加属性serverIp,并感知变更,在toString和取name时都带上serverIp信息

@lpaz010
Copy link
Contributor Author

lpaz010 commented Feb 21, 2023

这里名称主要是为了方便Server节点之间通信的client方便排查问题,因为他们IP不会变。 可以有两种方案:

  1. 单独区分出来,仅Server间使用的client带上serverIp
  2. 所有的client在connect和reconnect时使用的线程单独增加属性serverIp,并感知变更,在toString和取name时都带上serverIp信息
  1. GrpcClusterClient稳定IP不会变
  2. GrpcSdkClient内的grpcExexutor的ThreadFactory实例只初始化一次,异步reconnect不创建新executor好像不行吧

YunWZ added a commit to YunWZ/nacos that referenced this issue Feb 24, 2023
YunWZ added a commit to YunWZ/nacos that referenced this issue Feb 24, 2023
@KomachiSion KomachiSion linked a pull request Feb 27, 2023 that will close this issue
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/discussion Category issues related to discussion
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants