[BUG] Service Bus high prefetch count getting stuck and locking receiver processor #31356
Comments
@liukun-msft could you please take a look at this issue?
Hi @epomatti, thanks for providing the detailed information. This is a known issue: #30483. The application gets stuck because the default thread pool size (8 * CPU cores) is smaller than maxConcurrentCalls. Can you try setting the VM option "-Dreactor.schedulers.defaultBoundedElasticSize=x", where x is greater than maxConcurrentCalls? For example, add the VM option "-Dreactor.schedulers.defaultBoundedElasticSize=200" to your application when prefetch = 100 and maxConcurrentCalls = 100. Note that this workaround only works when the prefetch count is small. We are currently working on a fix on the code side.
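To illustrate why a pool smaller than the number of concurrent handlers can stall, here is a minimal sketch using only the JDK. It is a simplified, hypothetical model and not the SDK's actual code path: each "handler" occupies a pool thread and then blocks on a follow-up task submitted to the same pool. The class and method names (`PoolStarvationSketch`, `stalledHandlers`) and the 300 ms timeout are assumptions made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical model: handlers and their follow-up work share one bounded pool.
class PoolStarvationSketch {

    /** Returns how many handlers timed out while waiting for their follow-up task. */
    static int stalledHandlers(int poolSize, int callers) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        // Wait until as many handlers as can run at once have actually started.
        CountDownLatch started = new CountDownLatch(Math.min(poolSize, callers));
        List<Future<Boolean>> handlers = new ArrayList<>();
        try {
            for (int i = 0; i < callers; i++) {
                handlers.add(pool.submit(() -> {
                    started.countDown();
                    started.await();
                    // The follow-up needs a free thread from the same pool.
                    Future<?> followUp = pool.submit(() -> { });
                    try {
                        followUp.get(300, TimeUnit.MILLISECONDS);
                        return false; // follow-up ran: handler made progress
                    } catch (TimeoutException e) {
                        return true;  // no free thread: handler stalled
                    }
                }));
            }
            int stalled = 0;
            for (Future<Boolean> handler : handlers) {
                if (handler.get()) {
                    stalled++;
                }
            }
            return stalled;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }
}
```

In this model, `stalledHandlers(2, 2)` leaves every handler waiting because both threads are held by handlers, while `stalledHandlers(4, 2)` has spare threads for the follow-up work. That mirrors the advice above to make the pool larger than maxConcurrentCalls.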
Thanks @liukun-msft, that seems to have worked. I'll follow the main thread for updates. I'll try higher values and scale my nodes horizontally if I need higher throughput. Probably related: when closing the processor client there are still errors:
For anyone using Spring Boot, the property needs to be registered when using Maven during development:
<plugin>
  <groupId>org.springframework.boot</groupId>
  <artifactId>spring-boot-maven-plugin</artifactId>
  <configuration>
    <systemPropertyVariables>
      <reactor.schedulers.defaultBoundedElasticSize>${reactor.schedulers.defaultBoundedElasticSize}</reactor.schedulers.defaultBoundedElasticSize>
    </systemPropertyVariables>
  </configuration>
</plugin>
I found it easier to just pass it on the command line instead:
mvn spring-boot:run -Dspring-boot.run.profiles=dev -Dspring-boot.run.jvmArguments="-Dreactor.schedulers.defaultBoundedElasticSize=200"
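Whichever way the option is passed, it can help to confirm at startup that the JVM actually received it. The following is a minimal sketch using only the JDK. The class and method names (`BoundedElasticCheck`, `configuredSize`, `explicitlyAbove`) are hypothetical; only the property name comes from this thread.

```java
import java.util.OptionalInt;

// Hypothetical startup check for the workaround's system property.
class BoundedElasticCheck {

    static final String PROPERTY = "reactor.schedulers.defaultBoundedElasticSize";

    /** Returns the explicitly configured size, or empty if the property is unset or not a number. */
    static OptionalInt configuredSize() {
        Integer value = Integer.getInteger(PROPERTY);
        return value == null ? OptionalInt.empty() : OptionalInt.of(value);
    }

    /**
     * True only when the property was set explicitly and exceeds maxConcurrentCalls.
     * An unset property yields false because this sketch does not guess the library default.
     */
    static boolean explicitlyAbove(int maxConcurrentCalls) {
        OptionalInt size = configuredSize();
        return size.isPresent() && size.getAsInt() > maxConcurrentCalls;
    }
}
```

Calling `BoundedElasticCheck.explicitlyAbove(100)` early in application startup and logging a warning when it returns false would surface a missing or mistyped JVM argument before the processor starts.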
Describe the bug
My Service Bus receiver gets stuck when I use both prefetch count and max concurrent calls. Settings are like this:
prefetch count: 100
max concurrent calls: 100
When I add 10,000 messages to the queue, the Receiver Processor gets stuck, to the point where no more messages are consumed and they remain in the queue.
Restarting the application while keeping the configuration does NOT solve the issue. Only after I reduced the prefetch count significantly did messages start to be consumed again.
UPDATE: The error also happens with different combinations. For example, with 20 prefetch and 500 max concurrent calls it runs a little longer before failing.
UPDATE 2: Isolated the code to simulate the issue in this repo:
Exception or Stack Trace
I've uploaded the full stack trace here in this gist, and will post excerpts below.
Several informational messages about "adding credits" start to appear:
Then messages for lock renewal:
A new response channel is started:
Warning and errors start to appear:
Then an "Update disposition request timed out" error:
Information log when remote link is closed:
Finally, the error "Cannot perform operations on a disposed receiver." and "Cannot update disposition with no link":
To Reproduce
Steps to reproduce the behavior:
Code Snippet
Application configuration:
Service Bus consumer:
Screenshots
Inserting 10,000 messages:
Messages stuck. Not even restarting the client solves the issue.
Setup (please complete the following information):
Information Checklist
Kindly make sure that you have added all of the following information above and checked off the required fields; otherwise we will treat the issue as an incomplete report.