-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Attempt to stabilize the WindowedSubscriber test that uses VTS time advancing to close window and test comments. #39418
Attempt to stabilize the WindowedSubscriber test that uses VTS time advancing to close window and test comments. #39418
Conversation
...s/src/test/java/com/azure/messaging/servicebus/WindowedSubscriberFluxWindowIsolatedTest.java
Outdated
Show resolved
Hide resolved
16459c4
to
b760bde
Compare
/azp run java - eventhubs - tests |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run java - servicebus - tests |
Azure Pipelines successfully started running 1 pipeline(s). |
… and test comments
b760bde
to
360eb36
Compare
/azp run java - servicebus - tests |
Azure Pipelines successfully started running 1 pipeline(s). |
} else { | ||
logger.atError().log("VirtualTimeScheduler unavailable (unexpected close by the test)."); | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Liudmila and I’ve been investigating a strange random CI error with the following stack trace –
[ForkJoinPool-1-worker-1] WARN com.azure.messaging.servicebus.WindowedSubscriber$WindowWork - {"az.sdk.message":"Terminating the work. Error while scheduling or waiting for timeout.","exception":"Scheduler unavailable","demand":1,"workId":0,"pending":1}
reactor.core.Exceptions$ReactorRejectedExecutionException: Scheduler unavailable
at reactor.core.Exceptions.failWithRejected(Exceptions.java:271)
at reactor.core.publisher.Operators.onRejectedExecution(Operators.java:1024)
at reactor.core.publisher.MonoDelay.subscribe(MonoDelay.java:64)
at reactor.core.publisher.Mono.subscribe(Mono.java:4491)
at reactor.core.publisher.Mono.subscribeWith(Mono.java:4606)
at reactor.core.publisher.Mono.subscribe(Mono.java:4458)
at reactor.core.publisher.Mono.subscribe(Mono.java:4394)
at reactor.core.publisher.Mono.subscribe(Mono.java:4366)
at com.azure.messaging.servicebus.WindowedSubscriber$WindowWork.beginTimeoutTimer(WindowedSubscriber.java:859)
at com.azure.messaging.servicebus.WindowedSubscriber$WindowWork.init(WindowedSubscriber.java:700)
at com.azure.messaging.servicebus.WindowedSubscriber$WindowWork.access$1100(WindowedSubscriber.java:576)
at com.azure.messaging.servicebus.WindowedSubscriber.initWorkOnce(WindowedSubscriber.java:386)
at com.azure.messaging.servicebus.WindowedSubscriber.drainLoop(WindowedSubscriber.java:264)
at com.azure.messaging.servicebus.WindowedSubscriber.drain(WindowedSubscriber.java:223)
at com.azure.messaging.servicebus.WindowedSubscriber.enqueueRequestImpl(WindowedSubscriber.java:144)
at com.azure.messaging.servicebus.WindowedSubscriberFluxWindowIsolatedTest.lambda$shouldPickEnqueuedWindowRequestsOnSubscriptionReady$19(WindowedSubscriberFluxWindowIsolatedTest.java:326)
at reactor.test.DefaultStepVerifierBuilder$DefaultStepVerifier.toVerifierAndSubscribe(DefaultStepVerifierBuilder.java:862)
at reactor.test.DefaultStepVerifierBuilder$DefaultStepVerifier.verify(DefaultStepVerifierBuilder.java:831)
at reactor.test.DefaultStepVerifierBuilder$DefaultStepVerifier.verify(DefaultStepVerifierBuilder.java:823)
at reactor.test.DefaultStepVerifierBuilder.verifyComplete(DefaultStepVerifierBuilder.java:690)
at com.azure.messaging.servicebus.WindowedSubscriberFluxWindowIsolatedTest.shouldPickEnqueuedWindowRequestsOnSubscriptionReady(WindowedSubscriberFluxWindowIsolatedTest.java:344)
All the tests in this class are running under VirtualTimeScheduler
(VTS) and once in a while the test "shouldPickEnqueuedWindowRequestsOnSubscriptionReady“ fails with "Scheduler unavailable".
From the stack trace this error comes from the scheduler (should be VTS in case of test) when WindowedSubscriber tries to register the timeout as part of starting the first WindowWork.
While in the real-world scenario the scheduler for timeout will be the global ParallelScheduler
not VTS, both schedulers can throw this error only when it’s shutdown.
In the test we shut down the local VTS (scoped to the test) only when the test is completed, not in the middle of the test run. It’s known that VTS infra tracks the current VTS instance in a static member, so we had seen similar "scheduler unavailable" when two test using VTS infra runs in parallel and completion of one disposes the static current VTS. But as of today, we run all VTS tests in Isolated mode, so this should not happen.
Given this is so random failure and also observed happening only when CI is loaded, adding this logging function to shed some light into what’s going on here. The tests log using this function in case the VTS gets closed unexpectedly, this helps us,
- To see if the problem is local to the test, I.e., test run forward and closes the VTS.
- To see if the problem triggered from outside of the test, i.e., the VTS infra itself tear down the static current VTS holder.
- To see if there is for some reason VTS is not even gets injected.
No description provided.