Fix RemoveTablet during TabletExternallyReparented causing connection issues #16371

Merged
arthurschreiber merged 9 commits into main from arthur/fix-delete-tablet-inconsistency
Aug 9, 2024

Conversation

arthurschreiber
Contributor

@arthurschreiber arthurschreiber commented Jul 11, 2024

Description

This pull request fixes an issue in the health checking logic where an external primary failover in a keyspace/shard can cause two primaries to be marked as healthy. This can occur when a tablet is removed during an external reparenting operation.

This bug can cause queries to be sent to the old primary. If the demoted primary's vttablet process is restarted, it instead produces vttablet: Connection closed errors (which are mostly invisible because queries are retried automatically on the "actual" primary).

When RemoveTablet was called, the list of healthy tablets was recomputed, but the logic did not account for the fact that there can only be a single entry in the list of healthy primary tablets. This PR changes the logic in deleteTablet to mirror the special handling for PRIMARY tablets that also exists in updateHealth. This ensures that we can never end up in a situation where there is more than one healthy primary tablet for a given shard.
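
The difference between the blind recomputation and the guarded deletion can be sketched with a small model. This is not the Vitess code: the `healthCheck` struct, the `isPrimary` parameter, the key string, and the tablet aliases are all hypothetical simplifications, and the sketch assumes that stale PRIMARY records for the shard still exist while only the new primary is in the healthy list.

```go
package main

import "fmt"

// TabletHealth is a hypothetical, pared-down stand-in for the real struct.
type TabletHealth struct {
	Alias   string
	Serving bool
}

// healthCheck is a simplified model, not the real HealthCheckImpl.
type healthCheck struct {
	healthData map[string]map[string]*TabletHealth // all known tablets per key
	healthy    map[string][]*TabletHealth          // routable tablets per key
}

// recomputeHealthy rebuilds the list from every serving tablet.
// That is fine for replicas, but it can yield several primaries.
func (hc *healthCheck) recomputeHealthy(key string) {
	list := make([]*TabletHealth, 0, len(hc.healthData[key]))
	for _, th := range hc.healthData[key] {
		if th.Serving {
			list = append(list, th)
		}
	}
	hc.healthy[key] = list
}

// deleteTablet removes a tablet and repairs the healthy list for its key.
func (hc *healthCheck) deleteTablet(key, alias string, isPrimary bool) {
	delete(hc.healthData[key], alias)
	healthy := hc.healthy[key]
	if len(healthy) == 0 {
		return
	}
	if !isPrimary {
		hc.recomputeHealthy(key)
		return
	}
	// For a primary key, only clear the list when the removed tablet is
	// the one currently considered the healthy primary.
	if healthy[0].Alias == alias {
		hc.healthy[key] = []*TabletHealth{}
	}
}

// newExample models a shard mid-failover: stale primary records still exist,
// but only the new primary is in the healthy list.
func newExample() (*healthCheck, string) {
	key := "ks.0.primary" // assumed key format, for illustration only
	oldP := &TabletHealth{Alias: "zone1-100", Serving: true}
	newP := &TabletHealth{Alias: "zone1-101", Serving: true}
	third := &TabletHealth{Alias: "zone1-102", Serving: true}
	hc := &healthCheck{
		healthData: map[string]map[string]*TabletHealth{
			key: {oldP.Alias: oldP, newP.Alias: newP, third.Alias: third},
		},
		healthy: map[string][]*TabletHealth{key: {newP}},
	}
	return hc, key
}

func main() {
	naive, key := newExample()
	delete(naive.healthData[key], "zone1-102")
	naive.recomputeHealthy(key) // blind recomputation, as before the fix
	fmt.Println("naive healthy primaries:", len(naive.healthy[key]))

	fixed, key := newExample()
	fixed.deleteTablet(key, "zone1-102", true)
	fmt.Println("fixed healthy primaries:", len(fixed.healthy[key]))
}
```

In this model the blind recomputation leaves two healthy primaries, while the guarded deletion leaves the single new primary in place.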

Related Issue(s)

Fixes: #16373

Checklist

  • "Backport to:" labels have been added if this change should be back-ported to release branches
  • If this change is to be back-ported to previous releases, a justification is included in the PR description
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on CI?
  • Documentation was added or is not required

Deployment Notes

Contributor

vitess-bot bot commented Jul 11, 2024

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • Ensure there is a link to an issue (except for internal cleanup and flaky test fixes), new features should have an RFC that documents use cases and test cases.

Tests

  • Bug fixes should have at least one unit or end-to-end test, enhancement and new features should have a sufficient number of tests.

Documentation

  • Apply the release notes (needs details) label if users need to know about this change.
  • New features should be documented.
  • There should be some code comments as to why things are implemented the way they are.
  • There should be a comment at the top of each new or modified test to explain what the test does.

New flags

  • Is this flag really necessary?
  • Flag names must be clear and intuitive, use dashes (-), and have a clear help text.

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow needs to be marked as required, the maintainer team must be notified.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from vitess-operator and arewefastyet, if used there.
  • vtctl command output order should be stable and awk-able.

@vitess-bot vitess-bot bot added the NeedsBackportReason, NeedsDescriptionUpdate, NeedsIssue, and NeedsWebsiteDocsUpdate labels Jul 11, 2024
@github-actions github-actions bot added this to the v21.0.0 milestone Jul 11, 2024
@arthurschreiber arthurschreiber force-pushed the arthur/fix-delete-tablet-inconsistency branch from 3d37750 to 9bbed96 Compare July 11, 2024 19:17
…o tablets are marked as primary.

Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
This ensures that we can never end up in a situation where there is more
than one healthy primary tablet for a given shard.

Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
@arthurschreiber arthurschreiber marked this pull request as ready for review July 11, 2024 21:48
@arthurschreiber arthurschreiber added the Backport to: release-18.0, Backport to: release-19.0, and Backport to: release-20.0 labels and removed the NeedsDescriptionUpdate, NeedsBackportReason, and NeedsWebsiteDocsUpdate labels Jul 11, 2024

codecov bot commented Jul 11, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 69.05%. Comparing base (eb29999) to head (4052306).
Report is 76 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main   #16371      +/-   ##
==========================================
+ Coverage   68.69%   69.05%   +0.35%     
==========================================
  Files        1547     1556       +9     
  Lines      198297   202409    +4112     
==========================================
+ Hits       136228   139781    +3553     
- Misses      62069    62628     +559     


@arthurschreiber
Contributor Author

Added the code that fixes this issue, and updated the PR body description accordingly.

Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
go/vt/discovery/healthcheck_test.go
<-resultChan
<-resultChan

hc.RemoveTablet(thirdTablet)
Member

Why does removing an unrelated tablet trigger the bug?

Member

Oh, I see. This part of the description explains it:

When RemoveTablet was called, the list of healthy tablets was recomputed, but the logic did not account for the fact that there can only be a single entry in the list of healthy primary tablets.


hc.RemoveTablet(thirdTablet)

// tablet 1 is the primary now
Contributor Author

This comment is wrong, it should be

Suggested change
// tablet 1 is the primary now
// `secondTablet` should be the primary now

hc.recomputeHealthy(key)
if tabletType == topodata.TabletType_PRIMARY {
// If tablet type is primary, we should only have one tablet in the healthy list.
hc.recomputeHealthyPrimary(key)
Member

This is an interesting bug. We have safeguards in updateHealth which make sure that there is exactly one element in healthy for the PRIMARY tablet type, but the recomputation here had no such checks.
While the fix seems correct, what do you think about pushing it into recomputeHealthy instead of creating a new function call here?
Right now this is the only call site where recomputeHealthy is called for the PRIMARY tablet type, but it seems more compact and future proof to change recomputeHealthy to maintain the invariant.

Contributor Author

@arthurschreiber arthurschreiber Jul 12, 2024

While the fix seems correct, what do you think about pushing it into recomputeHealthy instead of creating a new function call here?

I feel like having two separate methods helps highlight the difference between the recomputation logic for primaries vs. any other tablet type more clearly.

I was thinking that maybe having a recomputeHealthy method that then calls out to either recomputeHealthyReplicas or recomputeHealthyPrimaries would probably be the best for clarity (and the Go compiler should just inline all of these calls anyway, so it would not be a performance issue either).

But then I noticed that pushing the tablet type check down into recomputeHealthy would require passing both key and tabletType as arguments, which seems redundant because key already contains the tablet type, just as a string. Not passing key would require passing other arguments to build the key, and then things start to become messy.

As I mentioned in Slack, I'm very interested in giving the healthcheck code a polishing pass separately from this fix, so I'd suggest we punt on this for now. 🙇‍♂️ What do you think?

Member

@deepthi deepthi Jul 15, 2024

I can see that instead of passing the "key", you would now have to pass the target. When this was written, it just seemed convenient to use the key because it has already been computed. So we have

			hc.recomputeHealthy(targetKey)

However, later on, we are doing a key computation for the second call

			oldTargetKey := KeyFromTarget(prevTarget)
			hc.recomputeHealthy(oldTargetKey)

It does not seem too invasive to change recomputeHealthy to be

func (hc *HealthCheckImpl) recomputeHealthy(target *query.Target) {
			key := KeyFromTarget(target)
			if target.TabletType == ..._PRIMARY {
			...
}
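
A fuller sketch of that shape, for illustration only: the `Target`, `TabletType`, and `healthCheck` types below are hypothetical stand-ins for the real ones, the key format produced by `keyFromTarget` is assumed, and the primary branch uses an assumed policy (keep the serving tablet with the most recent term start) that this thread does not specify.

```go
package main

import "fmt"

// TabletType and Target are hypothetical stand-ins for the topodata and
// query types referenced above.
type TabletType int

const (
	Replica TabletType = iota
	Primary
)

type Target struct {
	Keyspace, Shard string
	TabletType      TabletType
}

type TabletHealth struct {
	Alias                string
	Serving              bool
	PrimaryTermStartTime int64
}

type healthCheck struct {
	healthData map[string]map[string]*TabletHealth
	healthy    map[string][]*TabletHealth
}

// keyFromTarget mimics the idea of KeyFromTarget; the format is assumed.
func keyFromTarget(t Target) string {
	return fmt.Sprintf("%s.%s.%d", t.Keyspace, t.Shard, t.TabletType)
}

// recomputeHealthy takes the target, so the tablet type is available
// without parsing it back out of the key string.
func (hc *healthCheck) recomputeHealthy(t Target) {
	key := keyFromTarget(t)
	if t.TabletType == Primary {
		// Assumed policy: keep at most one primary, the serving tablet
		// with the most recent term start.
		var best *TabletHealth
		for _, th := range hc.healthData[key] {
			if th.Serving && (best == nil || th.PrimaryTermStartTime > best.PrimaryTermStartTime) {
				best = th
			}
		}
		hc.healthy[key] = []*TabletHealth{}
		if best != nil {
			hc.healthy[key] = append(hc.healthy[key], best)
		}
		return
	}
	// Every other tablet type keeps all serving tablets.
	list := []*TabletHealth{}
	for _, th := range hc.healthData[key] {
		if th.Serving {
			list = append(list, th)
		}
	}
	hc.healthy[key] = list
}

func main() {
	target := Target{Keyspace: "ks", Shard: "0", TabletType: Primary}
	key := keyFromTarget(target)
	hc := &healthCheck{
		healthData: map[string]map[string]*TabletHealth{
			key: {
				"zone1-100": {Alias: "zone1-100", Serving: true, PrimaryTermStartTime: 10},
				"zone1-101": {Alias: "zone1-101", Serving: true, PrimaryTermStartTime: 20},
			},
		},
		healthy: map[string][]*TabletHealth{},
	}
	hc.recomputeHealthy(target)
	fmt.Println(len(hc.healthy[key]), hc.healthy[key][0].Alias)
}
```

The point of the sketch is only that a single entry point can maintain the one-primary invariant once it receives the target rather than the precomputed key.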

…municate the intent.

Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
@shlomi-noach shlomi-noach mentioned this pull request Jul 23, 2024
24 tasks
@systay systay mentioned this pull request Jul 23, 2024
23 tasks
@shlomi-noach shlomi-noach mentioned this pull request Jul 23, 2024
28 tasks
Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
…/fix-delete-tablet-inconsistency

Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
Member

@deepthi deepthi left a comment

Love it. Nice targeted fix to a long-standing issue.

@deepthi deepthi requested a review from mattlord August 6, 2024 15:30
@deepthi
Member

deepthi commented Aug 6, 2024

@arthurschreiber can you edit the description to match the final implementation?

@deepthi deepthi added the NeedsDescriptionUpdate label Aug 6, 2024
Contributor

@mattlord mattlord left a comment

Thanks, @arthurschreiber! I only had a few minor questions and a nit.

I'll let you adjust the title and merge.

go/vt/discovery/healthcheck.go (outdated)
// cell or cell alias. It also performs filtering of tablets based on replication lag,
// if configured to do so.
//
// This should not be called for primary tablets.
Contributor

Should we not allow it?

Contributor Author

Unfortunately, key is a string that concatenates several values, so checking whether the type is PRIMARY is not straightforward.

@deepthi and I discussed various options for refactoring this, but I feel like that shouldn't be part of this PR.

// clear the healthy list for the primary.
//
// See the logic in `updateHealth` for more details.
alias := tabletAliasString(topoproto.TabletAliasString(healthy[0].Tablet.Alias))
Contributor

Should we check and handle cases where healthy has a length > 1 just to be safe? I'm not sure, but the thought came up.

Contributor Author

@arthurschreiber arthurschreiber Aug 8, 2024

I think maintaining the invariant that healthy only ever contains a single entry for PRIMARY (and verifying this in the tests) seems "good enough" to me? I'm also not sure what the correct behaviour would be if we detect more than one entry in the list, because that means the invariant is broken and all guarantees are invalid.

// See the logic in `updateHealth` for more details.
alias := tabletAliasString(topoproto.TabletAliasString(healthy[0].Tablet.Alias))
if alias == tabletAlias {
hc.healthy[key] = []*TabletHealth{}
Contributor

I'm curious why we do this rather than removing the key? Not implying it's somehow wrong. 🙂

Contributor Author

Just mirroring what happens in updateHealth. 🤷
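
For readers weighing the two options, here is a minimal sketch of how they differ in plain Go. The map, its string values, and the `lookup` helper are hypothetical and only illustrate language semantics; whether any caller in the healthcheck code depends on the difference is not established in this thread.

```go
package main

import "fmt"

// lookup reports the length of the stored slice, whether the key is
// present in the map, and whether the returned slice is nil.
func lookup(m map[string][]string, key string) (n int, present bool, isNil bool) {
	list, ok := m[key]
	return len(list), ok, list == nil
}

func main() {
	key := "ks.0.primary" // assumed key format, for illustration only
	healthy := map[string][]string{key: {"zone1-101"}}

	// Option 1: keep the key and store an empty, non-nil slice.
	healthy[key] = []string{}
	fmt.Println(lookup(healthy, key))

	// Option 2: remove the key entirely.
	delete(healthy, key)
	fmt.Println(lookup(healthy, key))
}
```

Both options give a length of zero, so code that only ranges over the list or checks its length behaves the same. They differ for code that uses the two-value map lookup or compares the slice against nil.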

@arthurschreiber arthurschreiber removed the NeedsDescriptionUpdate label Aug 8, 2024
Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
@arthurschreiber arthurschreiber force-pushed the arthur/fix-delete-tablet-inconsistency branch from 557a3d6 to 4052306 Compare August 8, 2024 18:57
@arthurschreiber arthurschreiber merged commit bf0c5f8 into main Aug 9, 2024
221 checks passed
@arthurschreiber arthurschreiber deleted the arthur/fix-delete-tablet-inconsistency branch August 9, 2024 07:45
vitess-bot pushed a commit that referenced this pull request Aug 9, 2024
…tion issues (#16371)

Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
vitess-bot pushed a commit that referenced this pull request Aug 9, 2024
…tion issues (#16371)

Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
vitess-bot pushed a commit that referenced this pull request Aug 9, 2024
…tion issues (#16371)

Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
arthurschreiber pushed a commit that referenced this pull request Aug 9, 2024
… causing connection issues (#16371) (#16568)

Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
arthurschreiber pushed a commit that referenced this pull request Aug 9, 2024
… causing connection issues (#16371) (#16567)

Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
arthurschreiber pushed a commit that referenced this pull request Aug 9, 2024
… causing connection issues (#16371) (#16566)

Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
venkatraju pushed a commit to slackhq/vitess that referenced this pull request Aug 29, 2024
…tion issues (vitessio#16371)

Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
timvaillancourt added a commit to slackhq/vitess that referenced this pull request Nov 7, 2024
* [release-19.0] Bump to `v19.0.5-SNAPSHOT` after the `v19.0.4` release (vitessio#15889)

Signed-off-by: Andres Taylor <andres@planetscale.com>

* [release-19.0] fix: handle info_schema routing (vitessio#15899) (vitessio#15906)

Signed-off-by: Andres Taylor <andres@planetscale.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* [release-19.0] Update VTAdmin build script (vitessio#15839) (vitessio#15850)

Signed-off-by: notfelineit <notfelineit@gmail.com>
Signed-off-by: <>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: Frances Thai <francesthai@Francess-MacBook-Pro.local>

* [release-19.0] Update env.sh so that is does not error when running on Mac (vitessio#15835) (vitessio#15915)

Signed-off-by: bddicken <bddicken@gmail.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* [release-19.0] fix: derived table join column expression to be part of add join predicate on rewrite (vitessio#15956) (vitessio#15960)

Signed-off-by: Harshit Gangal <harshit@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Co-authored-by: Harshit Gangal <harshit@planetscale.com>
Co-authored-by: Andres Taylor <andres@planetscale.com>

* [release-19.0] fix: insert on duplicate update to add list argument in the bind variables map (vitessio#15961) (vitessio#15967)

Signed-off-by: Harshit Gangal <harshit@planetscale.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: Harshit Gangal <harshit@planetscale.com>

* [release-19.0] test: Cleaner plan tests output (vitessio#15922) (vitessio#15964)

Signed-off-by: Andres Taylor <andres@planetscale.com>

* [release-19.0] connpool: Allow time out during shutdown (vitessio#15979) (vitessio#16003)

Signed-off-by: Vicent Marti <vmg@strn.cat>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* [release-19.0] fix: remove keyspace when merging subqueries (vitessio#16019) (vitessio#16027)

Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* [release-19.0] Add DCO workflow (vitessio#16052) (vitessio#16056)

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* [release-19.0] Upgrade the Golang version to `go1.22.4` (vitessio#16061)

Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Co-authored-by: frouioui <frouioui@users.noreply.github.com>
Co-authored-by: Florent Poinsard <florent.poinsard@outlook.fr>

* [release-19.0] Remove DCO workaround (vitessio#16087) (vitessio#16091)

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* [release-19.0] Do not load table stats when booting `vttablet`. (vitessio#15715) (vitessio#16100)

Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
Co-authored-by: Arthur Schreiber <arthurschreiber@github.com>

* [release-19.0] Add timeout to all the contexts used for RPC calls in vtorc (vitessio#15991) (vitessio#16103)

Signed-off-by: Manan Gupta <manan@planetscale.com>

* [release-19.0] Update braces package (vitessio#16115) (vitessio#16118)

Signed-off-by: Frances Thai <notfelineit@gmail.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* [release-19.0] fix: order by subquery planning (vitessio#16049) (vitessio#16132)

Co-authored-by: Harshit Gangal <harshit@planetscale.com>
Co-authored-by: Andres Taylor <andres@planetscale.com>
Co-authored-by: Florent Poinsard <florent.poinsard@outlook.fr>

* [release-19.0] Fix `vtexplain` not handling `UNION` queries with `weight_string` results correctly. (vitessio#16129) (vitessio#16157)

Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: Arthur Schreiber <arthurschreiber@github.com>

* Run more test on release-19 branch (vitessio#16152)

Signed-off-by: Harshit Gangal <harshit@planetscale.com>

* [release-19.0] Fix flakiness in `vtexplain` unit test case. (vitessio#16159) (vitessio#16167)

Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
Co-authored-by: Arthur Schreiber <arthurschreiber@github.com>

* [release-19.0] Online DDL shadow table: rename referenced table name in self referencing FK (vitessio#16205) (vitessio#16207)

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* [release-19.0] Fix flaky tests that use vtcombo (vitessio#16178) (vitessio#16212)

Signed-off-by: Manan Gupta <manan@planetscale.com>
Co-authored-by: Manan Gupta <35839558+GuptaManan100@users.noreply.github.com>
Co-authored-by: Manan Gupta <manan@planetscale.com>

* [release-19.0] Handle Nullability for Columns from Outer Tables (vitessio#16174) (vitessio#16185)

Co-authored-by: Andrés Taylor <andres@planetscale.com>

* [release-19.0] VDiff CLI: Fix VDiff `show` bug (vitessio#16177) (vitessio#16198)

Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* [release-19.0] VReplication Workflow: set state correctly when restarting workflow streams in the copy phase (vitessio#16217) (vitessio#16222)

Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Co-authored-by: Rohit Nayak <rohit@planetscale.com>

* [release-19.0] vtctldclient: Apply (Shard | Keyspace| Table) Routing Rules commands don't work (vitessio#16096) (vitessio#16124)

Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Co-authored-by: Rohit Nayak <rohit@planetscale.com>

* [release-19.0] Fix vtgate crash in group concat (vitessio#16254)

Signed-off-by: Manan Gupta <manan@planetscale.com>

* [release-19.0] Fix Incorrect Optimization with LIMIT and GROUP BY (vitessio#16263) (vitessio#16267)

Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Florent Poinsard <35779988+frouioui@users.noreply.github.com>
Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: Florent Poinsard <35779988+frouioui@users.noreply.github.com>
Co-authored-by: Andres Taylor <andres@planetscale.com>

* [release-19.0] Fix the `v19.0.0` release notes and use the `vitess/lite` image for the MySQL container (vitessio#16282) (vitessio#16285)

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Co-authored-by: Florent Poinsard <35779988+frouioui@users.noreply.github.com>

* [release-19.0] VReplication: Properly handle target shards w/o a primary in Reshard (vitessio#16283) (vitessio#16291)

Signed-off-by: Matt Lord <mattalord@gmail.com>
Co-authored-by: Matt Lord <mattalord@gmail.com>

* [release-19.0] CI: Fix for xtrabackup install failures (vitessio#16329) (vitessio#16332)

Signed-off-by: Matt Lord <mattalord@gmail.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: Matt Lord <mattalord@gmail.com>

* [release-19.0] Upgrade the Golang version to `go1.22.5` (vitessio#16322)

Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Co-authored-by: frouioui <frouioui@users.noreply.github.com>
Co-authored-by: Florent Poinsard <florent.poinsard@outlook.fr>

* [release-19.0] Fix the install dependencies script in Docker (vitessio#16340) (vitessio#16346)

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* [release-19.0] planner: Handle ORDER BY inside derived tables (vitessio#16353) (vitessio#16359)

Signed-off-by: Andres Taylor <andres@planetscale.com>
Co-authored-by: Andres Taylor <andres@planetscale.com>

* [release-19.0] Fix Join Predicate Cleanup Bug in Route Merging (vitessio#16386) (vitessio#16389)

Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Co-authored-by: Andres Taylor <andres@planetscale.com>
Co-authored-by: Florent Poinsard <florent.poinsard@outlook.fr>

* [release-19.0] fix issue with aggregation inside of derived tables (vitessio#16366) (vitessio#16384)

Signed-off-by: Andres Taylor <andres@planetscale.com>
Co-authored-by: Andrés Taylor <andres@planetscale.com>

* [release-19.0] Use default schema reload config values when config file is empty (vitessio#16393) (vitessio#16410)

Signed-off-by: Matt Lord <mattalord@gmail.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* [release-19.0] Fix subquery planning having an aggregation that is used in order by as long as we can merge it all into a single route (vitessio#16402) (vitessio#16407)

Signed-off-by: Manan Gupta <manan@planetscale.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* [release-19.0] Fix panic in schema tracker in presence of keyspace routing rules (vitessio#16383) (vitessio#16406)

Signed-off-by: Manan Gupta <manan@planetscale.com>

* [release-19] Vitess tester workflow (vitessio#16127) (vitessio#16418)

Signed-off-by: Manan Gupta <manan@planetscale.com>
Signed-off-by: Florent Poinsard <35779988+frouioui@users.noreply.github.com>
Co-authored-by: Florent Poinsard <35779988+frouioui@users.noreply.github.com>

* [release-19.0] feat: add a LIMIT 1 on EXISTS subqueries to limit network overhead (vitessio#16153) (vitessio#16191)

Signed-off-by: Andres Taylor <andres@planetscale.com>
Co-authored-by: Andrés Taylor <andres@planetscale.com>

* [release-19.0] Code Freeze for `v19.0.5` (vitessio#16448)

Signed-off-by: Andres Taylor <andres@planetscale.com>

* [release-19.0] Release of `v19.0.5` (vitessio#16450)

Signed-off-by: Andres Taylor <andres@planetscale.com>

* [release-19.0] Bump to `v19.0.6-SNAPSHOT` after the `v19.0.5` release (vitessio#16456)

Signed-off-by: Andres Taylor <andres@planetscale.com>

* [release-19.0] fix: reference table join merge (vitessio#16488) (vitessio#16496)

Signed-off-by: Harshit Gangal <harshit@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Co-authored-by: Harshit Gangal <harshit@planetscale.com>
Co-authored-by: Andres Taylor <andres@planetscale.com>

* [release-19.0] Improve the queries upgrade/downgrade CI workflow by using same test code version as binary (vitessio#16494) (vitessio#16501)

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Co-authored-by: Florent Poinsard <35779988+frouioui@users.noreply.github.com>
Co-authored-by: Florent Poinsard <florent.poinsard@outlook.fr>

* [release-19.0] bugfix: don't treat join predicates as filter predicates (vitessio#16472) (vitessio#16474)

Signed-off-by: Andres Taylor <andres@planetscale.com>
Co-authored-by: Andrés Taylor <andres@planetscale.com>

* [release-19.0] VTAdmin: Upgrade websockets js package (vitessio#16504) (vitessio#16512)

Signed-off-by: Matt Lord <mattalord@gmail.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: Matt Lord <mattalord@gmail.com>

* [release-19.0] bugfix: Allow cross-keyspace joins (vitessio#16520) (vitessio#16523)

Signed-off-by: Andres Taylor <andres@planetscale.com>
Co-authored-by: Andrés Taylor <andres@planetscale.com>

* [release-19.0] simplify merging logic (vitessio#16525) (vitessio#16532)

Signed-off-by: Andres Taylor <andres@planetscale.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* [release-19.0] Fix: Offset planning in hash joins (vitessio#16540) (vitessio#16551)

Signed-off-by: Manan Gupta <manan@planetscale.com>
Co-authored-by: Manan Gupta <35839558+GuptaManan100@users.noreply.github.com>
Co-authored-by: Manan Gupta <manan@planetscale.com>

* [release-19.0] Fix `RemoveTablet` during `TabletExternallyReparented` causing connection issues (vitessio#16371) (vitessio#16567)

Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* v19 backport: Throttler/vreplication: fix app name used by VPlayer (vitessio#16578) (vitessio#16580)

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* [release-19.0] Upgrade the Golang version to `go1.22.6` (vitessio#16543)

Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Co-authored-by: frouioui <frouioui@users.noreply.github.com>
Co-authored-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Co-authored-by: Florent Poinsard <florent.poinsard@outlook.fr>

* v19 backport: Online DDL: avoid SQL's `CONVERT(...)`, convert programmatically if needed (vitessio#16603)

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* [release-19.0] Remove mysql57/percona57 bootstrap images (vitessio#16620) (vitessio#16622)

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Co-authored-by: Florent Poinsard <35779988+frouioui@users.noreply.github.com>

* [release-19.0] Fix query plan cache misses metric (vitessio#16562) (vitessio#16627)

Signed-off-by: shanth96 <shanth.sathiyaseelan@shopify.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* [release-19.0] VReplication workflows: retry "wrong tablet type" errors (vitessio#16645) (vitessio#16652)

Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Co-authored-by: Rohit Nayak <57520317+rohit-nayak-ps@users.noreply.github.com>
Co-authored-by: Rohit Nayak <rohit@planetscale.com>

* [release-19.0] VStream API: validate that last PK has fields defined (vitessio#16478) (vitessio#16486)

Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: Rohit Nayak <rohit@planetscale.com>

* [release-19.0] Update micromatch to 4.0.8 (vitessio#16660) (vitessio#16666)

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* [release-19.0] Replace ErrorContains checks with Error checks before running upgrade downgrade (vitessio#16700)

Signed-off-by: Manan Gupta <manan@planetscale.com>

* [release-19.0] JSON Encoding: Use Type_RAW for marshalling json (vitessio#16637) (vitessio#16681)

Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: Rohit Nayak <rohit@planetscale.com>

* [release-19.0] FindErrantGTIDs: superset is not an errant GTID situation (vitessio#16725) (vitessio#16728)

Signed-off-by: deepthi <deepthi@planetscale.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* [release-19.0] Move from 4-cores larger runners to `ubuntu-latest` (vitessio#16714) (vitessio#16717)

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Co-authored-by: Florent Poinsard <35779988+frouioui@users.noreply.github.com>
Co-authored-by: Florent Poinsard <florent.poinsard@outlook.fr>

* [release-19.0] Upgrade the Golang version to `go1.22.7` (vitessio#16721)

Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Co-authored-by: frouioui <frouioui@users.noreply.github.com>
Co-authored-by: Florent Poinsard <florent.poinsard@outlook.fr>

* [release-19.0] Code Freeze for `v19.0.6` (vitessio#16745)

Signed-off-by: Rohit Nayak <rohit@planetscale.com>

* [release-19.0] Release of `v19.0.6` (vitessio#16747)

Signed-off-by: Rohit Nayak <rohit@planetscale.com>

* [release-19.0] Bump to `v19.0.7-SNAPSHOT` after the `v19.0.6` release (vitessio#16753)

Signed-off-by: Rohit Nayak <rohit@planetscale.com>

* [release-19.0] Remove mysql57 from docker images (vitessio#16763)

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* [release-19.0] VTAdmin: Address security vuln in path-to-regexp node pkg (vitessio#16770) (vitessio#16772)

Signed-off-by: Matt Lord <mattalord@gmail.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: Matt Lord <mattalord@gmail.com>

* Backport: Fix ACL checks for CTEs (vitessio#16642) (vitessio#16776)

Signed-off-by: Manan Gupta <manan@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Co-authored-by: Manan Gupta <35839558+GuptaManan100@users.noreply.github.com>

* [release-19.0] VTAdmin: Fix serve-handler's path-to-regexp dep and add default schema refresh (vitessio#16778) (vitessio#16783)

Signed-off-by: Matt Lord <mattalord@gmail.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: Matt Lord <mattalord@gmail.com>

* [release-19.0] Bump com.google.protobuf:protobuf-java from 3.24.3 to 3.25.5 in /java (vitessio#16809) (vitessio#16837)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [release-19.0] VTAdmin: Upgrade deps to address security vulns (vitessio#16843) (vitessio#16846)

Signed-off-by: Matt Lord <mattalord@gmail.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: Matt Lord <mattalord@gmail.com>

* [release-19.0] Support passing filters to `discovery.NewHealthCheck(...)` (vitessio#16170) (vitessio#16871)

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* [release-19.0] Fail fast when builtinbackup fails to restore a single file (vitessio#16856) (vitessio#16867)

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Signed-off-by: Florent Poinsard <35779988+frouioui@users.noreply.github.com>
Co-authored-by: Florent Poinsard <35779988+frouioui@users.noreply.github.com>
Co-authored-by: Matt Lord <mattalord@gmail.com>
Co-authored-by: Florent Poinsard <florent.poinsard@outlook.fr>

* [release-19.0] Upgrade Golang to 1.22.8 (vitessio#16895)

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>

* [release-19.0] VTTablet: smartconnpool: notify all expired waiters (vitessio#16897) (vitessio#16901)

Signed-off-by: Brendan Dougherty <brendan.dougherty@shopify.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* [release-19.0] Fix race in `replicationLagModule` of `go/vt/throttle` (vitessio#16078) (vitessio#16899)

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: Tim Vaillancourt <tim@timvaillancourt.com>

* [release-19.0] Bump commons-io:commons-io from 2.7 to 2.14.0 in /java (vitessio#16889) (vitessio#16930)

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* [release-19.0] fixes bugs around expression precedence and LIKE (vitessio#16934 & vitessio#16649) (vitessio#16945)

Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: Manan Gupta <manan@planetscale.com>
Co-authored-by: Andrés Taylor <andres@planetscale.com>
Co-authored-by: Manan Gupta <manan@planetscale.com>
Co-authored-by: Manan Gupta <35839558+GuptaManan100@users.noreply.github.com>

* [release-19.0] Flaky test fixes (vitessio#16940) (vitessio#16958)

Signed-off-by: Manan Gupta <manan@planetscale.com>
Co-authored-by: Manan Gupta <35839558+GuptaManan100@users.noreply.github.com>

* [release-19.0] fix: route engine to handle column truncation for execute after lookup (vitessio#16981) (vitessio#16984)

Signed-off-by: Harshit Gangal <harshit@planetscale.com>
Co-authored-by: Harshit Gangal <harshit@planetscale.com>

* [release-19.0] bugfix: add HAVING columns inside derived tables (vitessio#16976) (vitessio#16978)

Signed-off-by: Andres Taylor <andres@planetscale.com>
Co-authored-by: Andrés Taylor <andres@planetscale.com>

* [release-19.0] Fix deadlock between health check and topology watcher (vitessio#16995) (vitessio#17008)

Signed-off-by: Manan Gupta <manan@planetscale.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* [release-19.0] Add support for `MultiEqual` opcode for lookup vindexes. (vitessio#16975) (vitessio#17039)

Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>

* [release-19.0] bugfix: treat EXPLAIN like SELECT (vitessio#17054) (vitessio#17056)

Signed-off-by: Andres Taylor <andres@planetscale.com>
Co-authored-by: Andrés Taylor <andres@planetscale.com>

* [release-19.0] Delegate Column Availability Checks to MySQL for Single-Route Queries (vitessio#17077) (vitessio#17085)

Signed-off-by: Harshit Gangal <harshit@planetscale.com>
Signed-off-by: Andres Taylor <andres@planetscale.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: Andres Taylor <andres@planetscale.com>
Co-authored-by: Harshit Gangal <harshit@planetscale.com>

* Bugfix for Panic on Joined Queries with Non-Authoritative Tables in Vitess 19.0 (vitessio#17103)

Signed-off-by: Andres Taylor <andres@planetscale.com>

* [release-19.0] Improve Schema Engine's TablesWithSize80 query (vitessio#17066) (vitessio#17089)

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* [release-19.0] Fix unreachable errors when taking a backup (vitessio#17062) (vitessio#17110)

Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Signed-off-by: Florent Poinsard <35779988+frouioui@users.noreply.github.com>
Co-authored-by: Florent Poinsard <35779988+frouioui@users.noreply.github.com>
Co-authored-by: Matt Lord <mattalord@gmail.com>
Co-authored-by: Florent Poinsard <florent.poinsard@outlook.fr>

* [release-19.0] Code Freeze for `v19.0.7` (vitessio#17148)

Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Co-authored-by: Rohit Nayak <rohit@planetscale.com>

* [release-19.0] Release of `v19.0.7` (vitessio#17149)

Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Co-authored-by: Rohit Nayak <rohit@planetscale.com>

* restore test conditional for v18 vttablet

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* restore more test conditional for v18 binaries

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* restore whitespace

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Revert "[release-19.0] Improve the queries upgrade/downgrade CI workflow by using same test code version as binary (vitessio#16494) (vitessio#16501)"

This reverts commit 25a80ac.

* add missing table from cleanup

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

---------

Signed-off-by: Andres Taylor <andres@planetscale.com>
Signed-off-by: notfelineit <notfelineit@gmail.com>
Signed-off-by: <>
Signed-off-by: bddicken <bddicken@gmail.com>
Signed-off-by: Harshit Gangal <harshit@planetscale.com>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Florent Poinsard <florent.poinsard@outlook.fr>
Signed-off-by: GitHub <noreply@github.com>
Signed-off-by: Arthur Schreiber <arthurschreiber@github.com>
Signed-off-by: Manan Gupta <manan@planetscale.com>
Signed-off-by: Frances Thai <notfelineit@gmail.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Rohit Nayak <rohit@planetscale.com>
Signed-off-by: Florent Poinsard <35779988+frouioui@users.noreply.github.com>
Signed-off-by: Matt Lord <mattalord@gmail.com>
Signed-off-by: shanth96 <shanth.sathiyaseelan@shopify.com>
Signed-off-by: deepthi <deepthi@planetscale.com>
Signed-off-by: dependabot[bot] <support@github.com>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Signed-off-by: Brendan Dougherty <brendan.dougherty@shopify.com>
Co-authored-by: Andrés Taylor <andres@planetscale.com>
Co-authored-by: vitess-bot[bot] <108069721+vitess-bot[bot]@users.noreply.github.com>
Co-authored-by: Frances Thai <francesthai@Francess-MacBook-Pro.local>
Co-authored-by: Harshit Gangal <harshit@planetscale.com>
Co-authored-by: vitess-bot <139342327+vitess-bot@users.noreply.github.com>
Co-authored-by: frouioui <frouioui@users.noreply.github.com>
Co-authored-by: Florent Poinsard <florent.poinsard@outlook.fr>
Co-authored-by: Arthur Schreiber <arthurschreiber@github.com>
Co-authored-by: Manan Gupta <35839558+GuptaManan100@users.noreply.github.com>
Co-authored-by: Manan Gupta <manan@planetscale.com>
Co-authored-by: Rohit Nayak <rohit@planetscale.com>
Co-authored-by: Florent Poinsard <35779988+frouioui@users.noreply.github.com>
Co-authored-by: Matt Lord <mattalord@gmail.com>
Co-authored-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Co-authored-by: Rohit Nayak <57520317+rohit-nayak-ps@users.noreply.github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Labels
Backport to: release-18.0
Backport to: release-19.0
Backport to: release-20.0
Component: VTGate
Type: Bug
Development

Successfully merging this pull request may close these issues.

Bug Report: [vtgate] RemoveTablet during TabletExternallyReparented event can lead to healthcheck corruption
3 participants