roachtest: clearrange/checks=true failed #38720
timed out importing |
Same cause as #38772, but let's leave this open to avoid re-triaging. |
SHA: https://github.com/cockroachdb/cockroach/commits/1ca35fc4a0e2665e7f6efd945e65a0db97984fa7 Parameters: To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1396096&tab=buildLog
|
SHA: https://github.com/cockroachdb/cockroach/commits/7dab0dcfd37c389af357c302c073b9611b5ada25 Parameters: To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1398203&tab=buildLog
|
SHA: https://github.com/cockroachdb/cockroach/commits/1ad0ecc8cbddf82c9fedb5a5c5e533e72a657ff7 Parameters: To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1399000&tab=buildLog
|
SHA: https://github.com/cockroachdb/cockroach/commits/7111a67b2ea3a19c2f312f8d214b8823f431cac0 Parameters: To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1400942&tab=buildLog
|
SHA: https://github.com/cockroachdb/cockroach/commits/26edea51118a0e16b61748c08068bfa6f76543ca Parameters: To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1404886&tab=buildLog
|
SHA: https://github.com/cockroachdb/cockroach/commits/ff04012ed2d2c0c8e30e4de106ca0a350bca8c3e Parameters: To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1404856&tab=buildLog
|
SHA: https://github.com/cockroachdb/cockroach/commits/cfdaadc3514e7e8660f6c009ba159fdfd604f0a8 Parameters: To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1409070&tab=buildLog
|
SHA: https://github.com/cockroachdb/cockroach/commits/65055d6c16bf9386d8c4f4f9cd23e0a848814dc9 Parameters: To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1411157&tab=buildLog
|
SHA: https://github.com/cockroachdb/cockroach/commits/da56c792e968574b8f1d9ef3fdb45d56a530221a Parameters: To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1415578&tab=buildLog
|
SHA: https://github.com/cockroachdb/cockroach/commits/5bd37e8eb58ca66b9293c234bc572411057fec3a Parameters: To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1417287&tab=buildLog
|
SHA: https://github.com/cockroachdb/cockroach/commits/51a6fdedf0ce1d1329d40d801a7deaf8206b6b07 Parameters: To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1420118&tab=buildLog
|
SHA: https://github.com/cockroachdb/cockroach/commits/51a6fdedf0ce1d1329d40d801a7deaf8206b6b07 Parameters: To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1436116&tab=buildLog
|
SHA: https://github.com/cockroachdb/cockroach/commits/01ee0704865391599abef3bbc89f462117f8007a Parameters: To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1445527&tab=buildLog
|
SHA: https://github.com/cockroachdb/cockroach/commits/93860e69f96aa3a86bd8bb42f310fb2629d53f39 Parameters: To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1447036&tab=buildLog
|
SHA: https://github.com/cockroachdb/cockroach/commits/9a982e902638e116ed6a76f4fa635a0a1445d88a Parameters: To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1447054&tab=buildLog
|
SHA: https://github.com/cockroachdb/cockroach/commits/7ca0a86b8595c097fd8f27581b1509c47f17e8a3 Parameters: To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1450654&tab=buildLog
|
SHA: https://github.com/cockroachdb/cockroach/commits/40f8f0eb00f4b3bf5bac11fb5ae132e33a492713 Parameters: To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1452154&tab=buildLog
|
SHA: https://github.com/cockroachdb/cockroach/commits/497167b1c596eda2b70bed91c51ebf39b4356c33 Parameters: To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1453099&tab=buildLog
|
This test is our only one that sets |
The clearrange test is the only one running with this option, and it fired. Increase our coverage of stats mismatches to hopefully find a better repro target. See cockroachdb#38720 (comment). Release note: None
SHA: https://github.com/cockroachdb/cockroach/commits/239513342a2d23f683bbc1d386f87ff59cc78d10 Parameters: To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1539668&tab=artifacts#/clearrange/checks=true
|
41432: roachprod: fatal nodes on stats mismatch r=bdarnell a=tbg The clearrange test is the only one running with this option, and it fired. Increase our coverage of stats mismatches to hopefully find a better repro target. See #38720 (comment). Release note: None Co-authored-by: Tobias Schottdorf <tobias.schottdorf@gmail.com>
In light of the stats inconsistency seen in [clearrange], we want to be stricter about verifying the stats in nightly testing. This commit makes sure `./cockroach debug check-store` is fast enough to do so: On a ~71GB fully compacted store directory it reliably takes well below two minutes (on GCE local SSD). [clearrange]: cockroachdb#38720 (comment) Release note (performance improvement): The `./cockroach debug check-store` command is now faster.
We've got a repro of the stats inconsistency on #37815 (comment). I'm stressing that test overnight to get my hands on a data dir. |
SHA: https://github.com/cockroachdb/cockroach/commits/262e6f2499e34eb4373d0450fa9f6a820a609b2c Parameters: To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1565222&tab=artifacts#/clearrange/checks=true
|
In the clearrange logs, I'm seeing this message all over the place:
This is from cockroach/pkg/storage/replica_range_lease.go Lines 913 to 918 in 371e289
and we're not supposed to be able to hit it in this test. |
Here's what we see in the logs and the range status report:
At this point it seems like the replicate queue runs on n5, the new leaseholder, which detects the joint configuration and attempts to move out of it:
We then immediately see
Which continues until the test times out. Below find the lease history with a translation to timestamps:
|
In theory, the check below should prevent the command that removes n5 from being applied, but it doesn't seem to be doing the trick: cockroach/pkg/storage/replica_raft.go Lines 293 to 306 in 0f47384
|
The bug I'm pretty sure is that the command to See the logic exercised by this test: cockroach/pkg/roachpb/data_test.go Lines 1666 to 1674 in 0f47384
Typing up a patch now. |
I guess there are actually two ways we can prevent this specific problem from occurring: the first is easier but leaves me with more questions about how we get out of a bad scenario, and the second has a less obvious implementation.
|
I thought 2) was true: cockroach/pkg/storage/batcheval/cmd_lease.go Lines 50 to 70 in 9376c8a
Is this code just broken because it looks up the replica for the store evaluating the command, rather than the transfer target? Seems like it... |
That's pretty 🤦‍♂️ but is at least an easy fix. |
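The suspected bug in the lease-transfer check can be illustrated with a minimal Go sketch. This is not the code from `cmd_lease.go`: the types, the replica kinds, the function names, and the rule that only a full voter may receive the lease are all simplified assumptions made for illustration. The sketch contrasts a check that looks up the replica on the store evaluating the command with one that looks up the replica on the transfer target.

```go
package main

import "fmt"

// ReplicaType is a simplified, hypothetical stand-in for the replica
// types found in a range descriptor.
type ReplicaType int

const (
	VoterFull     ReplicaType = iota // regular voting replica
	VoterOutgoing                    // voter being removed in a joint configuration
)

// ReplicaDescriptor and RangeDescriptor are hypothetical, pared-down types.
type ReplicaDescriptor struct {
	StoreID int
	Type    ReplicaType
}

type RangeDescriptor struct {
	Replicas []ReplicaDescriptor
}

// replicaOnStore returns the descriptor of the replica on the given store.
func (d RangeDescriptor) replicaOnStore(storeID int) (ReplicaDescriptor, bool) {
	for _, r := range d.Replicas {
		if r.StoreID == storeID {
			return r, true
		}
	}
	return ReplicaDescriptor{}, false
}

// checkEligible rejects stores whose replica is missing or is not a full voter.
// Assumption for this sketch: only full voters may receive the lease.
func checkEligible(d RangeDescriptor, storeID int) error {
	r, ok := d.replicaOnStore(storeID)
	if !ok {
		return fmt.Errorf("store %d has no replica", storeID)
	}
	if r.Type != VoterFull {
		return fmt.Errorf("replica on store %d cannot receive the lease", storeID)
	}
	return nil
}

// checkTransferBuggy validates the store evaluating the command.
// The transfer target is never inspected, which mirrors the suspected bug.
func checkTransferBuggy(d RangeDescriptor, evalStoreID, targetStoreID int) error {
	_ = targetStoreID
	return checkEligible(d, evalStoreID)
}

// checkTransferFixed validates the transfer target instead.
func checkTransferFixed(d RangeDescriptor, evalStoreID, targetStoreID int) error {
	_ = evalStoreID
	return checkEligible(d, targetStoreID)
}

func main() {
	// Store 1 evaluates the transfer; store 5 is on its way out of the range.
	desc := RangeDescriptor{Replicas: []ReplicaDescriptor{
		{StoreID: 1, Type: VoterFull},
		{StoreID: 5, Type: VoterOutgoing},
	}}
	fmt.Println("buggy:", checkTransferBuggy(desc, 1, 5))
	fmt.Println("fixed:", checkTransferFixed(desc, 1, 5))
}
```

In this sketch the buggy variant lets the transfer to store 5 through because it only inspects store 1, while the fixed variant returns an error for the outgoing replica.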
SHA: https://github.com/cockroachdb/cockroach/commits/9322e07476de447799c5d3011eb2874930ee2993
Parameters:
To repro, try:
Failed test: https://teamcity.cockroachdb.com/viewLog.html?buildId=1375546&tab=buildLog