[Receive] Fix race condition when adding multiple new tenants at once #7941
So it's not a fix really but a revert. Thanks for the test which reproduces the bug, it will be much easier to fix. I think we should fix this instead of reverting because the cost of creating this slice is not trivial when you have thousands of selects per second.
Hi @GiedriusS, I see your point: this client list should be relatively stable most of the time, so creating new slices on every call wastes memory. I spent some time on an actual fix and would appreciate another review.
I think the e2e test failure is transient, but I don't have permission to rerun it.
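To illustrate the trade-off discussed above, here is a minimal sketch of a memoized client list that is invalidated whenever a tenant is added. It is not the code from this PR: `multiTSDB`, `client`, `addTenant`, and `clients` are hypothetical names, and the real Thanos types are more involved. The assumption is that the tenant map and the cached slice are guarded by the same lock, so a reader can never observe a snapshot that is missing a tenant that has already been registered.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// client is a hypothetical stand-in for a per-tenant store client.
type client struct{ tenant string }

// multiTSDB is a simplified, hypothetical model of a multi-tenant registry.
type multiTSDB struct {
	mtx     sync.RWMutex
	tenants map[string]*client
	cached  []*client // memoized snapshot; nil means "rebuild on next read"
}

func newMultiTSDB() *multiTSDB {
	return &multiTSDB{tenants: map[string]*client{}}
}

// addTenant registers a tenant and drops the snapshot under the same write
// lock, so the map and the cache cannot get out of sync.
func (m *multiTSDB) addTenant(name string) {
	m.mtx.Lock()
	defer m.mtx.Unlock()
	if _, ok := m.tenants[name]; ok {
		return
	}
	m.tenants[name] = &client{tenant: name}
	m.cached = nil
}

// clients returns the memoized snapshot and rebuilds it only after an
// invalidation. Callers must treat the returned slice as read-only.
func (m *multiTSDB) clients() []*client {
	m.mtx.RLock()
	if c := m.cached; c != nil {
		m.mtx.RUnlock()
		return c
	}
	m.mtx.RUnlock()

	m.mtx.Lock()
	defer m.mtx.Unlock()
	if m.cached != nil { // another goroutine rebuilt it in the meantime
		return m.cached
	}
	out := make([]*client, 0, len(m.tenants))
	for _, c := range m.tenants {
		out = append(out, c)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].tenant < out[j].tenant })
	m.cached = out
	return out
}

func main() {
	m := newMultiTSDB()
	var wg sync.WaitGroup
	for _, name := range []string{"tenant-a", "tenant-b", "tenant-c"} {
		wg.Add(1)
		go func(name string) {
			defer wg.Done()
			m.addTenant(name)
			_ = m.clients() // reads interleave with concurrent adds
		}(name)
	}
	wg.Wait()
	for _, c := range m.clients() {
		fmt.Println(c.tenant)
	}
}
```

In the steady state every call to `clients` returns the same slice without allocating, and a rebuild happens only after a tenant is added.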
519b937 to ea3c2a0
Signed-off-by: Yi Jin <yi.jin@databricks.com>
8582bfc to 982408e
🚀 thanks a lot for this and sorry for the problems
Head branch was pushed to by a user without write access
28580fd to 83b09f5
No worries. I've tried to fix the tests, which might help get this merged since all checks pass.
Please don't use force pushes in the future as it makes it hard to follow what changes between reviews, thanks!
…thanos-io#7941)
* [Receive] fix race condition
  Signed-off-by: Yi Jin <yi.jin@databricks.com>
* add a change log
  Signed-off-by: Yi Jin <yi.jin@databricks.com>
* memorize tsdb local clients without race condition
  Signed-off-by: Yi Jin <yi.jin@databricks.com>
* fix data race in testing with some concurrent safe helper functions
  Signed-off-by: Yi Jin <yi.jin@databricks.com>
* address comments
  Signed-off-by: Yi Jin <yi.jin@databricks.com>
---------
Signed-off-by: Yi Jin <yi.jin@databricks.com>
…thanos-io#7941)
* [Receive] fix race condition
  Signed-off-by: Yi Jin <yi.jin@databricks.com>
* add a change log
  Signed-off-by: Yi Jin <yi.jin@databricks.com>
* memorize tsdb local clients without race condition
  Signed-off-by: Yi Jin <yi.jin@databricks.com>
* fix data race in testing with some concurrent safe helper functions
  Signed-off-by: Yi Jin <yi.jin@databricks.com>
* address comments
  Signed-off-by: Yi Jin <yi.jin@databricks.com>
---------
Signed-off-by: Yi Jin <yi.jin@databricks.com>
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
* Merge pull request #7674 from didukh86/query_frontend_tls_redis_fix
  Query-frontend: Fix connection to Redis cluster with TLS.
  Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
* Capnp: Use segment from existing message (#7945)
  * Capnp: Use segment from existing message
    Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
  * Downgrade capnproto
    Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
  ---------
  Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
  Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
* [Receive] Fix race condition when adding multiple new tenants at once (#7941)
  * [Receive] fix race condition
    Signed-off-by: Yi Jin <yi.jin@databricks.com>
  * add a change log
    Signed-off-by: Yi Jin <yi.jin@databricks.com>
  * memorize tsdb local clients without race condition
    Signed-off-by: Yi Jin <yi.jin@databricks.com>
  * fix data race in testing with some concurrent safe helper functions
    Signed-off-by: Yi Jin <yi.jin@databricks.com>
  * address comments
    Signed-off-by: Yi Jin <yi.jin@databricks.com>
  ---------
  Signed-off-by: Yi Jin <yi.jin@databricks.com>
  Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
* Cut patch release v0.37.1
  Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
* Update promql-engine for subquery fix (#7953)
  Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
* Sidecar: Ensure limit param is positive for compatibility with older Prometheus (#7954)
  Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
* Update changelog
  Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
* Fix changelog
  Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
---------
Signed-off-by: Saswata Mukherjee <saswataminsta@yahoo.com>
Signed-off-by: Filip Petkovski <filip.petkovsky@gmail.com>
Signed-off-by: Yi Jin <yi.jin@databricks.com>
Co-authored-by: Filip Petkovski <filip.petkovsky@gmail.com>
Co-authored-by: Yi Jin <96499497+jnyi@users.noreply.github.com>
Changes
Update: this now actually fixes the issue instead of reverting the old change. Memoizing the TSDB client list is valuable because it avoids creating thousands of slices in memory; see a6fbb9f.
This originally reverted PR #7782 and fixes issue #7892.
Verification
The bug is reproducible with the newly added unit test TestMultiTSDBAddNewTenant. After this fix, the unit test passes.
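For readers who want the shape of such a check without opening the diff, the following is a rough sketch of a concurrent add-tenant check. It is not the actual TestMultiTSDBAddNewTenant: `registry`, `add`, `count`, and `addConcurrently` are hypothetical names used for illustration only. It assumes the property under test is that, after many tenants are added at the same time, every one of them is visible.

```go
package main

import (
	"fmt"
	"sync"
)

// registry is a hypothetical, minimal tenant registry guarded by one mutex.
type registry struct {
	mtx     sync.Mutex
	tenants map[string]struct{}
}

func (r *registry) add(name string) {
	r.mtx.Lock()
	defer r.mtx.Unlock()
	r.tenants[name] = struct{}{}
}

func (r *registry) count() int {
	r.mtx.Lock()
	defer r.mtx.Unlock()
	return len(r.tenants)
}

// addConcurrently registers n distinct tenants from n goroutines that are
// released at the same moment, then reports how many tenants are visible.
func addConcurrently(n int) int {
	r := &registry{tenants: map[string]struct{}{}}
	start := make(chan struct{})
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			<-start // wait so that all adds race with each other
			r.add(fmt.Sprintf("tenant-%d", i))
			_ = r.count() // interleave reads with the concurrent writes
		}(i)
	}
	close(start)
	wg.Wait()
	return r.count()
}

func main() {
	// No tenant should be lost, so this prints 100.
	fmt.Println(addConcurrently(100))
}
```

Running a check like this under the Go race detector (`go run -race` or `go test -race`) also surfaces unsynchronized access that a plain count comparison might miss.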