Fix flaky tests that use vtcombo #16178
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #16178 +/- ##
==========================================
- Coverage 68.57% 68.55% -0.02%
==========================================
Files 1544 1544
Lines 197863 197873 +10
==========================================
- Hits 135677 135651 -26
- Misses 62186 62222 +36
Signed-off-by: Manan Gupta <manan@planetscale.com>
I'm not super familiar with this code, but couldn't we use port
Yeah, actually binding and then retrying or using But maybe something for a follow up PR? This change likely makes it significantly less flaky so it's at least an improvement but still there's a race possible.
I just checked, and apparently specifying port
I agree this is a good first step though! 👍
* [release-19.0] Fix flaky tests that use vtcombo (vitessio#16178) (vitessio#16212) Signed-off-by: Manan Gupta <manan@planetscale.com>
Description
Some tests like `TestCanGetKeyspaces` have been flaky. They occasionally end up panicking.
The problem was twofold. First, when we set up a new cluster we only assert that there is no error; we don't require it. This lets the test keep running even when the cluster setup fails. We eventually end up using the cluster, which is nil (because of the failure), and this panics the test.
So, the first thing we do is change all `assert.Error` calls that set up a cluster using vtcombo to `require.Error` calls. Once we do this, we see the actual underlying error that fails these tests.
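To illustrate the first half of the problem, here is a minimal sketch in plain Go rather than testify. It relies on the fact that testify's `assert` functions record a failure and let the test continue, while `require` functions stop the test immediately. The `cluster` type, `setupCluster`, `softCheck`, and `hardCheck` are hypothetical names made up for this example and are not taken from the Vitess code.

```go
package main

import (
	"errors"
	"fmt"
)

// cluster is a hypothetical stand-in for the vtcombo-backed test cluster.
type cluster struct{ keyspaces []string }

// setupCluster is a hypothetical helper; with fail=true it returns a nil
// cluster and an error, the way a failed cluster start would.
func setupCluster(fail bool) (*cluster, error) {
	if fail {
		return nil, errors.New("port already in use")
	}
	return &cluster{keyspaces: []string{"ks"}}, nil
}

// softCheck mimics assert-style checking: the error is recorded, but
// execution continues and the nil cluster is used anyway.
func softCheck(fail bool) (result string) {
	defer func() {
		// Recover only so this demo can report the panic instead of crashing.
		if r := recover(); r != nil {
			result = fmt.Sprintf("panic: %v", r)
		}
	}()
	c, err := setupCluster(fail)
	if err != nil {
		fmt.Println("recorded failure:", err)
	}
	// Nil pointer dereference when setup failed.
	return fmt.Sprintf("%d keyspaces", len(c.keyspaces))
}

// hardCheck mimics require-style checking: stop at the first setup error,
// so the real cause is the only thing reported.
func hardCheck(fail bool) string {
	c, err := setupCluster(fail)
	if err != nil {
		return "stopped: " + err.Error()
	}
	return fmt.Sprintf("%d keyspaces", len(c.keyspaces))
}

func main() {
	fmt.Println("assert-style:", softCheck(true))
	fmt.Println("require-style:", hardCheck(true))
}
```

In the assert-style path the setup error is followed by a nil pointer panic, which is what made the flaky runs hard to read. The require-style path surfaces only the setup error.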
The problem is that we are choosing a random port for starting a cluster, and that port might already be in use by some other process. This PR fixes the issue by changing the port-selection logic to also verify that the chosen port is available for a TCP connection. If it is not, we retry with a different random port.
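The retry logic described above could look roughly like the following sketch. It is not the code from this PR: `portIsFree` and `randomFreePort`, the port range, and the attempt limit are all assumptions chosen for illustration, and the availability check here is done by briefly opening a listener with `net.Listen`.

```go
package main

import (
	"fmt"
	"math/rand"
	"net"
)

// portIsFree reports whether a TCP listener can currently be opened on the
// given port on the loopback interface. (Hypothetical helper.)
func portIsFree(port int) bool {
	ln, err := net.Listen("tcp", fmt.Sprintf("127.0.0.1:%d", port))
	if err != nil {
		return false
	}
	ln.Close()
	return true
}

// randomFreePort picks random ports in [base, base+span) and returns the
// first one that passes portIsFree, giving up after the given number of
// attempts. (Hypothetical helper; range and limit are assumptions.)
func randomFreePort(base, span, attempts int) (int, error) {
	for i := 0; i < attempts; i++ {
		port := base + rand.Intn(span)
		if portIsFree(port) {
			return port, nil
		}
	}
	return 0, fmt.Errorf("no free port found after %d attempts", attempts)
}

func main() {
	port, err := randomFreePort(20000, 10000, 10)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("starting cluster on port", port)
}
```

Because the probe listener is closed before the cluster binds the port, another process can still take it in between. As noted in the review discussion, this narrows the race but does not remove it.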
I have verified that this fix works by (1) running the test multiple times and (2) ensuring that we correctly move away from ports that are already in use.
Related Issue(s)
Checklist
Deployment Notes