-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
logictest: support mixed binary version tests with upgrades #91881
Conversation
ba973ea
to
5886ffd
Compare
5886ffd
to
06fcb6c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rafiss)
pkg/sql/logictest/logic.go
line 1236 at r1 (raw file):
} func (t *logicTest) newTestServerCluster(bootstrapBinaryPath string, upgradeBinaryPath string) {
nit: maybe add a comment that "upgradeBinaryPath is set according to cfg.CockroachGoUpgradeVersion. If the latter is unset, upgradeBinaryPath is of the same value as bootstrapBinaryPath"
pkg/sql/logictest/logic.go
line 1251 at r1 (raw file):
testserver.StoreOnDiskOpt(), testserver.CockroachBinaryPathOpt(bootstrapBinaryPath), testserver.UpgradeCockroachBinaryPathOpt(upgradeBinaryPath),
This may be a bit random, but just curious if we would ever want a mix of 3 versions in a test?
pkg/sql/logictest/logic.go
line 3018 at r1 (raw file):
return errors.Errorf(`could not perform "upgrade", not a cockroach-go/testserver cluster`) } nodeNum, err := strconv.Atoi(fields[1])
nit: nodeIdx
maybe better?
pkg/sql/logictest/logic.go
line 3045 at r1 (raw file):
} if err := t.testserverCluster.WaitForInitFinishForNode(nodeNum); err != nil {
Do we still need this WaitForInitFinishForNode? Look like it's doing the same thing as the for loop above.
pkg/sql/logictest/logic.go
line 3871 at r1 (raw file):
// Ensure that all of the created descriptors can round-trip through json. { // If `useCockroachGoTestserver` is true and we do an upgrade,
Not requesting for changes, but wanted to know whether this check for descriptors in necessary for all tests? When using cockroach-go test server, will we want to wait till migrations finish then run this check?
pkg/sql/logictest/testdata/logic_test/testserver_upgrade_node
line 1 at r1 (raw file):
# LogicTest: cockroach-go-testserver-22.2-master
nit: Would this name be a bit confusing once the master moves forward to a later version? Would it be better if it's something like testserver-22.2-single-version
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ZhouXing19)
pkg/sql/logictest/logic.go
line 1236 at r1 (raw file):
Previously, ZhouXing19 (Jane Xing) wrote…
nit: maybe add a comment that "upgradeBinaryPath is set according to cfg.CockroachGoUpgradeVersion. If the latter is unset, upgradeBinaryPath is of the same value as bootstrapBinaryPath"
done - with a small correction
pkg/sql/logictest/logic.go
line 1251 at r1 (raw file):
Previously, ZhouXing19 (Jane Xing) wrote…
This may be a bit random, but just curious if we would ever want a mix of 3 versions in a test?
no, we shouldn't need that, since CRDB only supports upgrading from one version to the next
pkg/sql/logictest/logic.go
line 3018 at r1 (raw file):
Previously, ZhouXing19 (Jane Xing) wrote…
nit:
nodeIdx
maybe better?
done
pkg/sql/logictest/logic.go
line 3045 at r1 (raw file):
Previously, ZhouXing19 (Jane Xing) wrote…
Do we still need this WaitForInitFinishForNode? Look like it's doing the same thing as the for loop above.
yeah, i'll remove this one, nice catch
pkg/sql/logictest/logic.go
line 3871 at r1 (raw file):
Previously, ZhouXing19 (Jane Xing) wrote…
Not requesting for changes, but wanted to know whether this check for descriptors in necessary for all tests? When using cockroach-go test server, will we want to wait till migrations finish then run this check?
i don't think it's necessary for all tests - it's kind of like a defensive check, and doesn't seem worth it to make it work during version upgrades
pkg/sql/logictest/testdata/logic_test/testserver_upgrade_node
line 1 at r1 (raw file):
Previously, ZhouXing19 (Jane Xing) wrote…
nit: Would this name be a bit confusing once the master moves forward to a later version? Would it be better if it's something like
testserver-22.2-single-version
?
when master moves forward to a new version, this test should start failing, so then the test configs will be update or removed as needed
efce051
to
3cee6b6
Compare
3cee6b6
to
01bece1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @rafiss)
pkg/sql/logictest/logic.go
line 3045 at r5 (raw file):
t.Fatal(err) } for i := 0; i < 3; i++ {
Hmm, so now with the ci green, seems that though we upgrade a single node (by killing the process and restart a new one), we still need to wait till all nodes in the cluster are ping-able. 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ZhouXing19)
pkg/sql/logictest/logic.go
line 3045 at r5 (raw file):
Previously, ZhouXing19 (Jane Xing) wrote…
Hmm, so now with the ci green, seems that though we upgrade a single node (by killing the process and restart a new one), we still need to wait till all nodes in the cluster are ping-able. 🤔
yeah, i wanted to experiment with this to see if that would fix it. i believe this is correct, since the testserver uses Kill
when upgrading a node, which means that there could be temporary cluster unavailability (for example, if the leaseholder for the system database is killed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ZhouXing19)
pkg/sql/logictest/logic.go
line 3045 at r5 (raw file):
Previously, rafiss (Rafi Shamim) wrote…
yeah, i wanted to experiment with this to see if that would fix it. i believe this is correct, since the testserver uses
Kill
when upgrading a node, which means that there could be temporary cluster unavailability (for example, if the leaseholder for the system database is killed)
i've also made cockroachdb/cockroach-go#154 so that this logic is part of the testserver
Do we want to replace here with for i := 0; i < 3; i ++ {
if err := ts.WaitForInitFinishForNode(i); err != nil {
return err
}
} as well? |
This makes the next commit for mixed-version testing easier. Release note: None
The logictest framework can now use the cockroach-go testserver to test a multi-node cluster in which each node is running a different binary version. Previously, only mixed logical versions could be tested. Release note: None
This prevents flakiness that is caused by temporary cluster unavailability after a node is killed and restarted. Release note: None
05480d0
to
acdc9ba
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ZhouXing19)
pkg/sql/logictest/logic.go
line 1272 at r5 (raw file):
Previously, ZhouXing19 (Jane Xing) wrote…
Do we want to replace here with
for i := 0; i < 3; i ++ { if err := ts.WaitForInitFinishForNode(i); err != nil { return err } }as well?
i think this one is fine as-is, since during the initial startup there is no non-graceful killing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!! Thanks for answering my questions
Reviewed 5 of 15 files at r6.
Reviewable status: complete! 1 of 0 LGTMs obtained
tftr! bors r=ZhouXing19 |
Build failed: |
bors r=ZhouXing19 |
Build succeeded: |
fixes #72451
The logictest framework can now use the cockroach-go testserver to test a multi-node cluster in which each node is running a different binary version.
Previously, only mixed logical versions could be tested.
Release note: None