fix(3.2): Triple Reactor OneToMany Handler null pointer fix and DubboFilter support #14125

Merged: 9 commits merged into apache:3.2 on May 11, 2024

caoyanan666 (Contributor) commented Apr 23, 2024

What is the purpose of the change

  1. AbstractTripleReactorSubscriber initializes downstream only upon calling the subscribe method. Consequently, if an onError event happens before subscription, it can result in a null pointer, causing the request to hang. For instance, if there's an exception during parameter validation in a request, this would trigger onError directly without Flux reaching the subscribe step.
  2. The invoke method of OneToManyMethodHandler directly returns a CompletableFuture.completedFuture(null), bypassing DubboFilter. For instance, if I have a DubboAPMFilter, it monitors the response time of CompletableFuture.completedFuture(null) as 0 milliseconds. Hence, in ServerTripleReactorSubscriber, I maintain a CompletableFuture to return the actual execution Future to the Invoker instead of returning a fake CompletableFuture.completedFuture(null).
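The two points above can be sketched in a small, self-contained example. This is only an illustration under stated assumptions, not the code in this PR: `EarlyErrorSketch`, `GuardedSubscriber`, `Downstream`, `RecordingDownstream`, and `executionFuture` are hypothetical names, and Dubbo's real `AbstractTripleReactorSubscriber` and `ServerTripleReactorSubscriber` may be structured differently. The sketch shows an `onError` that arrives before `subscribe` being buffered instead of dereferencing a null downstream, and a future that completes only when the stream actually terminates, so that a filter-style callback attached to it observes the real end of the call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.atomic.AtomicBoolean;

// Illustrative sketch only: these are NOT Dubbo's real classes.
class EarlyErrorSketch {

    /** Hypothetical stand-in for the response stream the server writes to. */
    interface Downstream {
        void onError(Throwable t);

        void onCompleted();
    }

    /** Hypothetical subscriber that tolerates onError before subscribe. */
    static class GuardedSubscriber {
        private final AtomicBoolean subscribed = new AtomicBoolean(false);
        // Completed when the stream really terminates, not at invoke time.
        private final CompletableFuture<Void> executionFuture = new CompletableFuture<>();
        private Downstream downstream; // assigned only in subscribe()
        private Throwable pendingError; // error that arrived before subscribe()

        synchronized void subscribe(Downstream d) {
            if (d == null) {
                throw new NullPointerException();
            }
            if (subscribed.compareAndSet(false, true)) {
                downstream = d;
                if (pendingError != null) {
                    d.onError(pendingError); // replay the early error
                }
            }
        }

        synchronized void onError(Throwable t) {
            executionFuture.completeExceptionally(t);
            if (downstream == null) {
                pendingError = t; // no downstream yet: buffer instead of dereferencing null
            } else {
                downstream.onError(t);
            }
        }

        synchronized void onComplete() {
            executionFuture.complete(null);
            if (downstream != null) {
                downstream.onCompleted();
            }
        }

        CompletableFuture<Void> executionFuture() {
            return executionFuture;
        }
    }

    /** Records the events a Downstream receives, for demonstration. */
    static class RecordingDownstream implements Downstream {
        final List<String> events = new ArrayList<>();

        public void onError(Throwable t) {
            events.add("error:" + t.getMessage());
        }

        public void onCompleted() {
            events.add("completed");
        }
    }

    public static void main(String[] args) {
        GuardedSubscriber subscriber = new GuardedSubscriber();
        // A filter-like observer: runs only when the real work finishes.
        subscriber.executionFuture().whenComplete((v, t) ->
                System.out.println("call finished, failed=" + (t != null)));

        // Parameter validation fails before anything has subscribed.
        subscriber.onError(new IllegalArgumentException("invalid parameter"));

        RecordingDownstream downstream = new RecordingDownstream();
        subscriber.subscribe(downstream);
        System.out.println(downstream.events);
    }
}
```

A monitoring filter that measures elapsed time in a callback on `executionFuture()` would see the actual duration of the stream, whereas a callback on `CompletableFuture.completedFuture(null)` runs immediately. The `synchronized` methods are a simplification for the sketch; a production subscriber would more likely rely on the ordering guarantees of the reactive pipeline.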

Brief changelog

Verifying this change

Checklist

  • Make sure there is a GitHub_issue field for the change (usually before you start working on it). Trivial changes like typos do not require a GitHub issue. Your pull request should address just this issue, without pulling in other changes - one PR resolves one issue.
  • Each commit in the pull request should have a meaningful subject line and body.
  • Write a pull request description that is detailed enough to understand what the pull request does, how, and why.
  • Check whether it is necessary to patch Dubbo 3 if you are working on Dubbo 2.7.
  • Write the necessary unit tests to verify your logic correction; mocking is preferable when cross-module dependencies exist. If a new feature or significant change is committed, please remember to add a sample to the dubbo-samples project.
  • Add some description to dubbo-website project if you are requesting to add a feature.
  • GitHub Actions works fine on your own branch.
  • If this contribution is large, please follow the Software Donation Guide.

@caoyanan666
Contributor Author

@AlbumenJ @icodening @EarthChen PTAL

@AlbumenJ
Member

AlbumenJ commented May 8, 2024

@caoyanan666 Please fix the conflicts

caoyanan666 and others added 3 commits May 9, 2024 10:33
# Conflicts:
#	dubbo-plugin/dubbo-reactive/src/main/java/org/apache/dubbo/reactive/calls/ReactorServerCalls.java
@caoyanan666
Contributor Author

[image]
@AlbumenJ Does the unit test code have a GitHub project link? I want to run it locally to give it a try.

@caoyanan666 caoyanan666 closed this May 9, 2024
@caoyanan666 caoyanan666 reopened this May 9, 2024
@caoyanan666
Contributor Author

caoyanan666 commented May 9, 2024

https://github.com/apache/dubbo/actions/runs/9011837780/job/24760283649?pr=14125
I traced the failure of test 13787 to a timeout:
[12/23] [dubbo-samples-test-13787:1/1] TEST FAILURE: Run tests timeout, version: -Ddubbo.version=3.2.13-SNAPSHOT -Dspring-boot.version=2.7.6 -Djava.version=8, please check logs: /home/runner/work/dubbo/dubbo/99-integration/dubbo-samples-test-13787/target/logs
Digging further, I found that the provider container had exited abnormally:
dependency failed to start: container dubbo-samples-test-13787--dubbo-samples-test-13787-provider-1 exited (0)
Redirect container logs: dubbo-samples-test-13787--dubbo-samples-test-13787-provider-1
The container's inspect output then shows that nothing was listening on port 20880:
"Output": "checking tcp ports: 127.0.0.1:20880;, start at: 0, timeout: 1\nchecking tcp port [127.0.0.1:20880] ...\ntelnet: Unable to connect to remote host: Connection refused\nTrying 127.0.0.1...\ncheck tcp port [127.0.0.1:20880] is timeout: 1 s\ncheck ports failure\n"
The service startup log shows that the provider started normally and then shut down immediately. Why is that?
2024-05-09 03:33:08.630 INFO 16 --- [ main] org.apache.dubbo.metadata.MetadataInfo : [DUBBO] metadata revision changed: null -> cc997f680e96066e3e0d4ecd5d38730f, app: provider, services: 1, dubbo version: 3.2.13-SNAPSHOT, current host: 172.27.0.3
2024-05-09 03:33:08.744 INFO 16 --- [ main] o.a.dubbo.samples.test.DubboProvider : Started DubboProvider in 4.473 seconds (JVM running for 5.559)
dubbo service started
2024-05-09 03:33:08.752 INFO 16 --- [ionShutdownHook] o.a.d.c.deploy.DefaultModuleDeployer : [DUBBO] Dubbo Module[1.1.1] is stopping., dubbo version: 3.2.13-SNAPSHOT, current host: 172.27.0.3
@AlbumenJ Could you suggest how to troubleshoot this?

@caoyanan666
Contributor Author

The likely cause is that the service came up on port 20881 instead. Why it failed to start on 20880 cannot be determined from the logs:
2024-05-09 03:33:08.478 WARN 16 --- [ main] org.apache.dubbo.config.ServiceConfig : [DUBBO] Use random available port(20881) for protocol dubbo, dubbo version: 3.2.13-SNAPSHOT, current host: 172.27.0.3, error code: 5-8. This may be caused by , go to https://dubbo.apache.org/faq/5/8 to find instructions.

@caoyanan666
Contributor Author

It looks like every CI run on 3.2 is stuck at this point, for example https://github.com/apache/dubbo/actions/runs/9011299521/job/24758927418?pr=14166 . Waiting for a fix...

@AlbumenJ
Member

Try to fix it in apache/dubbo-integration-cases#25

@AlbumenJ
Member

You can ignore it

@caoyanan666
Contributor Author

@AlbumenJ PTAL

@@ -53,7 +53,7 @@ public void subscribe(final CallStreamObserver<T> downstream) {
         if (downstream == null) {
             throw new NullPointerException();
         }
-        if (this.downstream == null && SUBSCRIBED.compareAndSet(false, true)) {
+        if (SUBSCRIBED.compareAndSet(false, true)) {
Member

Please help optimize the code style. Thanks.

Contributor Author

I ran spotless:apply again. Do you mean the hard coding of true and false?

/**
 * The Subscriber in server to passing the data produced by user publisher to responseStream.
 */
public class ServerTripleReactorSubscriber<T> extends AbstractTripleReactorSubscriber<T> {

    private final List<T> collectedData = new ArrayList<>();
    private final CompletableFuture<List<T>> completableFuture = new CompletableFuture<>();
Member

  1. Please give completableFuture a more meaningful name.
  2. Why is it typed as a List? Is it a combined future?

@caoyanan666 caoyanan666 requested a review from CrazyHZM May 11, 2024 02:55
Member

@EarthChen EarthChen left a comment

LGTM

@EarthChen EarthChen merged commit 5d1c0ea into apache:3.2 May 11, 2024
19 checks passed
@CrazyHZM CrazyHZM added this to the 3.2.13 milestone May 12, 2024
5 participants