ccl/multiregionccl: TestMultiRegionDataDriven failed #74783
ccl/multiregionccl.TestMultiRegionDataDriven failed with artifacts on master @ 912964e02ddd951c77d4f71981ae18b3894e9084:
Help
See also: How To Investigate a Go Test Failure (internal)
Same failure on other branches
ccl/multiregionccl.TestMultiRegionDataDriven failed with artifacts on master @ 9b0065ca2450205a34d4237be7b317c2d895658c:
ccl/multiregionccl.TestMultiRegionDataDriven failed with artifacts on master @ e1068d77afbd39b162978281c9da7cbea49c1c3a:
(Ignoring the last tracing failure, which Andrei fixed.) These tests have been failing for a while, even before #73876 (which certainly couldn't have helped, given its additional asynchrony). Here's one from 21.2: #73829. +cc #75543 (a bazel dup?). Looking at the test directives, we seem to make use of arbitrary sleeps; perhaps that's why? cockroach/pkg/ccl/multiregionccl/datadriven_test.go Lines 124 to 125 in e49415f
In #75304 we observed the need to sometimes retry the query itself to accommodate stale distsender caches. Our parallel here is this directive, so perhaps that's not as much of a problem here:
For the "wait-for-zone-config-changes" primitive, we can transfer the leaseholder by issuing a one-off transfer request, which I don't think guarantees that the lease stays pinned to the destination. Likely a red herring for these failures, but worth noting, I think. cockroach/pkg/ccl/multiregionccl/datadriven_test.go Lines 72 to 88 in e49415f
I'm not planning on looking at these tests further unless pushed. +cc @arulajmani / @ajstorm / @nvanbenschoten: should we skip these tests on master? Do we think we'll have the time to polish them up? While they do provide good coverage, I think it'd take time to make them more robust, and in relation to #73876 at least, I don't expect any work here to catch bugs so much as to ensure this test is flake-proof.
hasn't triggered for over a month
They have on release-22.1, which I guess Arul is currently assigned to (#77908). I suspect this will just fail again, but I'm happy to connect the history back to this issue when it happens. This is a pretty long-running test, 45s+ at times; if it's not flaking in CI, that may just be because it's not getting stressed enough. I hope we can either rewrite this test to be faster or rip it out entirely.
ccl/multiregionccl.TestMultiRegionDataDriven failed with artifacts on master @ c3d71ac887844bef174abb6dab2a4e1ce9270ab7:
Parameters in this failure:
This test on roachdash | Improve this report!
Jira issue: CRDB-12258