Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Addressing bulk ingester stuck threads #870

Merged
merged 3 commits into from
Aug 26, 2024
Merged

Conversation

l-trotta
Copy link
Contributor

When the Bulk Ingester is configured in such a way that the number of external threads performing the add operation is much higher than maxConcurrentRequests (usually more than 50 to 1), after completing all of the request but the last one, the Bulk Ingester stops.

If there's a flusher thread, it will keep adding requests one by one, and sending them one by one: this is because ~95% of the external threads will get stuck while adding, and a send operation will only free one of those at a time.

Replacing signal() with signalAll() to free the threads is a possible solution, but it's also expensive and considering this is an extreme case it would affect performances for every other case; so the chosen solution is to add another signal() after a successful add operation, assuring that stuck threads will be released quickly if the condition that got them stuck no longer applies.

Should fix #651

Copy link
Member

@swallez swallez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Added the backport label to 7.17 which is also affected.

@l-trotta l-trotta merged commit aca5752 into main Aug 26, 2024
7 checks passed
@l-trotta l-trotta deleted the bulk-ingester-thread-lock-fix branch August 26, 2024 13:39
github-actions bot pushed a commit that referenced this pull request Aug 26, 2024
* signaling after successfully adding

* stress test unit test

* removed memory calc
github-actions bot pushed a commit that referenced this pull request Aug 26, 2024
* signaling after successfully adding

* stress test unit test

* removed memory calc
l-trotta added a commit that referenced this pull request Aug 26, 2024
* signaling after successfully adding

* stress test unit test

* removed memory calc

Co-authored-by: Laura Trotta <153528055+l-trotta@users.noreply.github.com>
l-trotta added a commit that referenced this pull request Aug 26, 2024
* signaling after successfully adding

* stress test unit test

* removed memory calc

Co-authored-by: Laura Trotta <153528055+l-trotta@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Threads lock scenario at BulkIngester // FnCondition with high concurrency setup
2 participants