-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sql: fix TestCreateStatisticsCanBeCancelled txn retry hang #109029
Conversation
3541485
to
3668aa4
Compare
3668aa4
to
ccfad8a
Compare
#109007 is an issue for |
@rafiss This was a Stupid mistake on my part. I accidentally moved the code around before pushing this. Let me validate one more time, turning to draft temporarily. |
RFAL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @fqazi)
pkg/sql/stats/create_stats_job_test.go
line 148 at r2 (raw file):
sqlDB.Exec(t, `CREATE DATABASE d`) sqlDB.Exec(t, `CREATE TABLE d.t (x INT PRIMARY KEY)`)
Let's create this table with WITH (sql_stats_automatic_collection_enabled = false)
. That will prevent auto stats from running.
pkg/sql/stats/create_stats_job_test.go
line 164 at r2 (raw file):
// If we end up ever retrying the txn, we should not block on // the allow request channel again. setTableID(descpb.InvalidID)
Which txn is retrying? I'm not sure it's possible for the statistics collection txn to retry if we're waiting in the store testing knob. But maybe there's a race between automatic stats collection for s1 and manual stats collection?
Looking at the stack again, it is definitely auto stats collection fighting the manual one. Oh good point, yeah the txn probably will be AOST in that case too. So, that makes a lot more sense here (disabling the knob is probably a good idea to be paranoid still). |
f8b9d82
to
4cef84b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r3, all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @fqazi)
@michae2 TFTR! |
hm seems like CI failed on the test again |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained (waiting on @fqazi)
pkg/sql/stats/create_stats_job_test.go
line 131 at r3 (raw file):
defer leaktest.AfterTest(t)() defer log.Scope(t).Close(t) skip.UnderStress(t, "test can be slow to quiesce because of filter")
Let's unskip this and make sure it passes under stress.
pkg/sql/stats/create_stats_job_test.go
line 163 at r3 (raw file):
// If we end up ever retrying the txn, we should not block on // the allow request channel again. setTableID(descpb.InvalidID)
Statistics collection runs in parallel. Maybe multiple scans hit the knob at once?
pkg/sql/stats/create_stats_job_test.go
line 174 at r3 (raw file):
}) err := <-errCh allowRequest <- struct{}{}
Instead of just waiting for cancellation at this point, maybe we need another SucceedsSoon
loop that is repeatedly sending to allowRequest
and checking errCh
in a select
.
4cef84b
to
9071d57
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @michae2)
pkg/sql/stats/create_stats_job_test.go
line 131 at r3 (raw file):
Previously, michae2 (Michael Erickson) wrote…
Let's unskip this and make sure it passes under stress.
@michae2 I did that and switched to post that channel in a repeatedly succeeds soon. Let's see if CI is happy under stress.
9071d57
to
a8bb526
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yay, it passed! Some comments but
Reviewed all commit messages.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @fqazi)
pkg/sql/stats/create_stats_job_test.go
line 152 at r5 (raw file):
setTableID(tID) // Run CREATE STATISTICS and wait for to create the job.
nit: maybe "Run CREATE STATISTICS and wait for it to create the job."
pkg/sql/stats/create_stats_job_test.go
line 170 at r5 (raw file):
return err }) // Allow the filter to pass everything an error is received.
nit: maybe "Allow the filter to pass everything until an error is received."
pkg/sql/stats/create_stats_job_test.go
line 180 at r5 (raw file):
return nil case allowRequest <- struct{}{}: }
Seems like there should be a default
case so that we return control to the SucceedsSoon
if neither channel is ready?
Previously, this test could hang if there was an automatic stats came in concurrently with a manual stats collection, where the request filter would end up hanging and being called twice. To address this patch will disable automatic stats collections on the table. Fixes: cockroachdb#109007 Release note: None
a8bb526
to
0d7ad67
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r6, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @michae2)
@michae2 TFTR! bors r+ |
Build succeeded: |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error setting reviewers, but backport branch blathers/backport-release-23.1-109029 is ready: POST https://api.github.com/repos/cockroachdb/cockroach/pulls/109187/requested_reviewers: 422 Reviews may only be requested from collaborators. One or more of the teams you specified is not a collaborator of the cockroachdb/cockroach repository. [] Backport to branch 23.1.x failed. See errors above. error creating merge commit from 0d7ad67 to blathers/backport-release-23.1.9-rc-109029: POST https://api.github.com/repos/cockroachdb/cockroach/merges: 409 Merge conflict [] you may need to manually resolve merge conflicts with the backport tool. Backport to branch 23.1.9-rc failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
blathers backport release-23.1.9-rc |
Encountered an error creating backports. Some common things that can go wrong:
You might need to create your backport manually using the backport tool. error setting reviewers, but backport branch blathers/backport-release-23.1.9-rc-109029 is ready: POST https://api.github.com/repos/cockroachdb/cockroach/pulls/109231/requested_reviewers: 422 Reviews may only be requested from collaborators. One or more of the teams you specified is not a collaborator of the cockroachdb/cockroach repository. [] Backport to branch release-23.1.9-rc failed. See errors above. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf. |
Previously, this test could hang if there was an automatic
stats came in concurrently with a manual stats collection,
where the request filter would end up hanging and being called twice.
To address this patch will disable automatic stats collections
on the table.
Fixes: #109007
Release note: None