Tablet throttler: throttler-config-via-topo defaults 'true', deprecation message for old flags #13130

Merged

Conversation

shlomi-noach

@shlomi-noach shlomi-noach commented May 23, 2023

Description

  • v16 introduced a new vttablet flag: --throttler-config-via-topo, see the docs: https://vitess.io/docs/16.0/reference/features/tablet-throttler/
  • In v16 this flag defaults false, and the old per-tablet --enable-lag-throttler configuration is still supported.
  • For v17, the target of this PR, the default for --throttler-config-via-topo becomes true. The old configuration is still supported, but using it produces a deprecation warning.
  • In v18, the old configuration and logic will be completely removed, and --throttler-config-via-topo will be assumed to be true whether specified or not. The flag will issue a deprecation message.
  • In v19 we will remove the flag --throttler-config-via-topo.
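
The v17 behavior in the list above can be sketched with Go's standard flag package. This is only an illustration: Vitess registers its flags through its own machinery, and the function name parseThrottlerFlags and the warning text below are hypothetical. The sketch defaults --throttler-config-via-topo to true and produces a warning only when the old --enable-lag-throttler flag is set explicitly.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseThrottlerFlags is a hypothetical helper, not Vitess code. It parses
// args, returns the effective value of --throttler-config-via-topo, and
// returns one warning per deprecated flag that was explicitly set.
func parseThrottlerFlags(args []string) (viaTopo bool, warnings []string, err error) {
	fs := flag.NewFlagSet("vttablet-sketch", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the example output

	// New-style flag: defaults to true, as described for v17.
	topo := fs.Bool("throttler-config-via-topo", true, "read throttler config from the topo")
	// Old per-tablet flag: still accepted, but flagged as deprecated.
	fs.Bool("enable-lag-throttler", false, "deprecated per-tablet throttler switch")

	if err := fs.Parse(args); err != nil {
		return false, nil, err
	}

	deprecated := map[string]bool{"enable-lag-throttler": true}
	// Visit only walks flags that were set on the command line, so an
	// untouched deprecated flag stays silent.
	fs.Visit(func(f *flag.Flag) {
		if deprecated[f.Name] {
			warnings = append(warnings,
				fmt.Sprintf("flag --%s is deprecated (illustrative message)", f.Name))
		}
	})
	return *topo, warnings, nil
}

func main() {
	// Example invocation using the old flag.
	viaTopo, warnings, err := parseThrottlerFlags([]string{"--enable-lag-throttler"})
	if err != nil {
		fmt.Println("parse error:", err)
		return
	}
	fmt.Println("throttler-config-via-topo:", viaTopo)
	for _, w := range warnings {
		fmt.Println("warning:", w)
	}
}
```

Checking which flags were explicitly set, rather than comparing values against defaults, lets the warning fire even when a user passes the old flag with its default value.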

We remove all references to --enable-lag-throttler in Vitess's own endtoend tests, and use the dynamic throttler config everywhere.

Related Issue(s)

Follow up to:

Checklist

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
@shlomi-noach shlomi-noach added Type: Enhancement Logical improvement (somewhere between a bug and feature) Component: TabletManager labels May 23, 2023
@vitess-bot vitess-bot bot added NeedsDescriptionUpdate The description is not clear or comprehensive enough, and needs work NeedsIssue A linked issue is missing for this Pull Request NeedsWebsiteDocsUpdate What it says labels May 23, 2023
@vitess-bot

vitess-bot bot commented May 23, 2023

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • If this is a change that users need to know about, please apply the release notes (needs details) label so that merging is blocked unless the summary release notes document is included.
  • If a test is added or modified, there should be documentation on top of the test explaining what the expected behavior is and what the test does.

If a new flag is being introduced:

  • Is it really necessary to add this flag?
  • Flag names should be clear and intuitive (as far as possible)
  • Help text should be descriptive.
  • Flag names should use dashes (-) as word separators rather than underscores (_).

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow should be required, the maintainer team should be notified.

Bug fixes

  • There should be at least one unit or end-to-end test.
  • The Pull Request description should include a link to an issue that describes the bug.

Non-trivial changes

  • There should be some code comments as to why things are implemented the way they are.

New/Existing features

  • Should be documented, either by modifying the existing documentation or creating new documentation.
  • New features should have a link to a feature request issue or an RFC that documents the use cases, corner cases and test cases.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • vtctl command output order should be stable and awk-able.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from VTop, if used there.

@github-actions github-actions bot added this to the v17.0.0 milestone May 23, 2023
@shlomi-noach

Various vreplication tests are failing. We enable the throttler in endtoend/vreplication/cluster_test.go and then check all tablets to see that the configuration is applied. On some tablets this test times out (at 1min timeout which is painfully long). Not sure yet what's up, still looking.

@shlomi-noach

Sample failure: https://github.com/vitessio/vitess/actions/runs/5056630919/jobs/9075115936?pr=13130

2023-05-23T12:14:38.2318934Z I0523 12:11:43.411558   17733 cluster_test.go:568] Finished creating shard c0-
2023-05-23T12:14:38.2319318Z I0523 12:11:43.411573   17733 cluster_test.go:571] Applying throttler config for keyspace customer
2023-05-23T12:14:38.2320142Z I0523 12:11:43.413036   17733 vtctldclient_process.go:67] Executing vtctldclient with command: vtctldclient --server 127.0.0.1:15999 UpdateThrottlerConfig --enable --threshold 30.000000 --custom-query  --check-as-check-shard customer (attempt 1 of 10)
2023-05-23T12:14:38.2320755Z I0523 12:11:43.703824   17733 cluster_test.go:574] Waiting for throttler config to be applied on all shards
2023-05-23T12:14:38.2321292Z I0523 12:11:43.704212   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone2-603
2023-05-23T12:14:38.2321826Z I0523 12:11:43.708998   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone1-600
2023-05-23T12:14:38.2322334Z I0523 12:11:43.709716   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone1-601
2023-05-23T12:14:38.2322838Z I0523 12:11:43.710200   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone1-602
2023-05-23T12:14:38.2323334Z I0523 12:11:43.710920   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone1-701
2023-05-23T12:14:38.2323827Z I0523 12:11:43.711373   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone1-702
2023-05-23T12:14:38.2374812Z I0523 12:11:43.711896   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone2-703
2023-05-23T12:14:38.2375439Z I0523 12:11:43.712458   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone1-700
2023-05-23T12:14:38.2376094Z I0523 12:11:43.713035   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone1-200
2023-05-23T12:14:38.2376590Z I0523 12:11:43.714059   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone1-201
2023-05-23T12:14:38.2376907Z === NAME  TestBasicV2Workflows
2023-05-23T12:14:38.2378407Z     util.go:131: timed out waiting for the zone1-201 tablet's throttler status enabled to be true with the correct config after 1m0s; last seen value: {"Keyspace":"customer","Shard":"-80","IsLeader":false,"IsOpen":true,"IsEnabled":false,"IsDormant":true,"Query":"select unix_timestamp(now(6))-max(ts/1000000000) as replication_lag from `__vt_e2e-test`.heartbeat","Threshold":1,"AggregatedMetrics":{},"MetricsHealth":{}}
2023-05-23T12:14:38.2379264Z I0523 12:12:43.719734   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone1-202
2023-05-23T12:14:38.2379778Z I0523 12:12:43.723084   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone2-203
2023-05-23T12:14:38.2381306Z     util.go:131: timed out waiting for the zone2-203 tablet's throttler status enabled to be true with the correct config after 1m0s; last seen value: {"Keyspace":"customer","Shard":"-80","IsLeader":false,"IsOpen":true,"IsEnabled":false,"IsDormant":true,"Query":"select unix_timestamp(now(6))-max(ts/1000000000) as replication_lag from `__vt_e2e-test`.heartbeat","Threshold":1,"AggregatedMetrics":{},"MetricsHealth":{}}
2023-05-23T12:14:38.2382164Z I0523 12:13:43.725539   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone1-300
2023-05-23T12:14:38.2382770Z I0523 12:13:43.730041   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone1-301
2023-05-23T12:14:38.2447747Z I0523 12:13:43.731270   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone1-302
2023-05-23T12:14:38.2448279Z I0523 12:13:43.732833   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone2-303
2023-05-23T12:14:38.2448792Z I0523 12:13:43.733302   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone1-400
2023-05-23T12:14:38.2449300Z I0523 12:13:43.734463   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone1-401
2023-05-23T12:14:38.2449803Z I0523 12:13:43.735122   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone1-402
2023-05-23T12:14:38.2450286Z I0523 12:13:43.736168   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone2-403
2023-05-23T12:14:38.2450768Z I0523 12:13:43.736691   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone1-502
2023-05-23T12:14:38.2451248Z I0523 12:13:43.737214   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone2-503
2023-05-23T12:14:38.2451734Z I0523 12:13:43.738068   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone1-500
2023-05-23T12:14:38.2452227Z I0523 12:13:43.738804   17733 cluster_test.go:581] + Waiting for throttler config to be applied on zone1-501
2023-05-23T12:14:38.2452615Z I0523 12:13:43.739572   17733 cluster_test.go:585] Throttler config applied on all shards
2023-05-23T12:14:38.2453263Z I0523 12:13:43.749920   17733 vtctldclient_process.go:67] Executing vtctldclient with command: vtctldclient --server 127.0.0.1:15999 GetShard customer/-40 (attempt 1 of 10)
2023-05-23T12:14:38.2453892Z I0523 12:13:43.887528   17733 vtgate_process.go:205] Waiting for healthy status of 2 customer.-40.replica tablets in cell zone1
2023-05-23T12:14:38.2454480Z I0523 12:13:43.909585   17733 vtgate_process.go:205] Waiting for healthy status of 1 customer.-40.rdonly tablets in cell zone1
2023-05-23T12:14:38.2455154Z I0523 12:13:43.916846   17733 vtctldclient_process.go:67] Executing vtctldclient with command: vtctldclient --server 127.0.0.1:15999 GetShard customer/40-80 (attempt 1 of 10)
2023-05-23T12:14:38.2455782Z I0523 12:13:43.970841   17733 vtgate_process.go:205] Waiting for healthy status of 2 customer.40-80.replica tablets in cell zone1
2023-05-23T12:14:38.2456504Z I0523 12:13:43.978908   17733 vtgate_process.go:205] Waiting for healthy status of 1 customer.40-80.rdonly tablets in cell zone1
2023-05-23T12:14:38.2457171Z I0523 12:13:43.983193   17733 vtctldclient_process.go:67] Executing vtctldclient with command: vtctldclient --server 127.0.0.1:15999 GetShard customer/80-c0 (attempt 1 of 10)
2023-05-23T12:14:38.2457802Z I0523 12:13:44.014598   17733 vtgate_process.go:205] Waiting for healthy status of 2 customer.80-c0.replica tablets in cell zone1
2023-05-23T12:14:38.2458386Z I0523 12:13:44.016839   17733 vtgate_process.go:205] Waiting for healthy status of 1 customer.80-c0.rdonly tablets in cell zone1
2023-05-23T12:14:38.2459060Z I0523 12:13:44.018391   17733 vtctldclient_process.go:67] Executing vtctldclient with command: vtctldclient --server 127.0.0.1:15999 GetShard customer/c0- (attempt 1 of 10)
2023-05-23T12:14:38.2459689Z I0523 12:13:44.048688   17733 vtgate_process.go:205] Waiting for healthy status of 2 customer.c0-.replica tablets in cell zone1
2023-05-23T12:14:38.2460282Z I0523 12:13:44.050931   17733 vtgate_process.go:205] Waiting for healthy status of 1 customer.c0-.rdonly tablets in cell zone1
2023-05-23T12:14:38.2461276Z I0523 12:13:44.053676   17733 vtctlclient_process.go:210] Executing vtctlclient with command: vtctlclient --server 127.0.0.1:15999 Reshard -- --max_replication_lag_allowed=2542087h --source_shards -80,80- --target_shards -40,40-80,80-c0,c0- --defer-secondary-keys --cells zone1 Create customer.wf1 (attempt 1 of 10)
2023-05-23T12:14:38.2461900Z     resharding_workflows_v2_test.go:65: 
2023-05-23T12:14:38.2462673Z         	Error Trace:	/home/runner/work/vitess/vitess/go/test/endtoend/vreplication/resharding_workflows_v2_test.go:65
2023-05-23T12:14:38.2463860Z         	            				/home/runner/work/vitess/vitess/go/test/endtoend/vreplication/resharding_workflows_v2_test.go:404
2023-05-23T12:14:38.2465025Z         	            				/home/runner/work/vitess/vitess/go/test/endtoend/vreplication/resharding_workflows_v2_test.go:282
2023-05-23T12:14:38.2465482Z         	Error:      	Received unexpected error:
2023-05-23T12:14:38.2466091Z         	            	exit status 1: Waiting for workflow to start:
2023-05-23T12:14:38.2531170Z         	            	0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 0% ... 
2023-05-23T12:14:38.2532227Z         	            	The workflow did not start within 30s. The workflow may simply be slow to start or there may be an issue.
2023-05-23T12:14:38.2533510Z         	            	Check the status using the 'Workflow customer.wf1 show' client command for details.
2023-05-23T12:14:38.2535406Z         	            	Reshard Error: rpc error: code = Unknown desc = timed out waiting for workflow to start
2023-05-23T12:14:38.2536682Z         	            	E0523 12:14:15.417566   40839 main.go:105] remote error: rpc error: code = Unknown desc = timed out waiting for workflow to start
2023-05-23T12:14:38.2537130Z         	Test:       	TestBasicV2Workflows
2023-05-23T12:14:38.2537454Z I0523 12:14:25.479757   17733 cluster_test.go:640] vtgate teardown successful
2023-05-23T12:14:38.2537809Z I0523 12:14:25.479788   17733 cluster_test.go:640] vtgate teardown successful

As you can see above, the test timed out at 1min waiting for 2 tablets to pick up the throttler config changes. Because they didn't get the config changes, the subsequent workflow fails to make progress.

Signed-off-by: Matt Lord <mattalord@gmail.com>
Signed-off-by: Matt Lord <mattalord@gmail.com>
@mattlord

mattlord commented Jun 1, 2023

@shlomi-noach finally all the tests are passing!

Here's a summary of my findings when investigating the vreplication e2e test failures -- please let me know if you think I'm wrong/mistaken on any part(s).

In the end there were 5 main issues:

  1. The way we were updating the throttler in the vreplication e2e tests -- when adding each shard, after adding the tablets for it -- was triggering a race condition that could be encountered while the throttler was still opening and the SrvKeyspace record was updated. This caused the goroutine executing the SrvKeyspaceWatcher callback function to try to update the throttler config concurrently with the retryReadAndApplyThrottlerConfig function in Open(). This was addressed by ensuring that the initMutex is always taken and held when applying throttler config changes in: cd2cc1b and 261b06a and 48e71a5
  2. The http response in our e2e Wait function would at times not contain one or more of the keys we were looking for/at, which does not really indicate a failure, but would cause the test to fail. That was addressed in: 8309b8b
  3. By default vreplication will choose a rdonly or replica tablet when available -- and they are available in the vreplication tests -- to be used as the vstreamer (source) for workflows and vdiff2 runs. This meant that we were hitting this bug: VReplication Workflows and VStream API gets stuck in the copy phase if tablet type is set only as a replica #13175. That was addressed temporarily here: 46a4c17 (and subsequently rolled back when we merged the fix: Tablet throttler: be explicit about client app name, exempt some apps from checks and heartbeat renewals #13195).
  4. When doing Reshards, the new shards are created with tablets that explicitly have their QueryService disabled. So they are e.g. PRIMARY Not Serving. This means that the throttler is not Open. This is fine, as the throttler should return OK/200 in this case. The problem was that we waited for these tablets to have the throttler enabled and failed the test after the timeout. This was addressed in: 0fc37df (and in follow-ups to also ignore non-primary non-serving types)
  5. We should be considering isOpen in the ready/running/CanServe/CheckResult determination since that’s the only time the throttler can function properly (with open DB conn pools etc). There were a few spots where we were instead treating isEnabled alone as that signal. So we were acting AS IF the throttler was ready/running when it was ONLY enabled -- which is now dynamic with the topo method. I believe that distinction was papered over before because isEnabled was not dynamic and was controlled entirely through Open()/Close() (based on the static tablet flags). This was addressed in: 261b06a and 48e71a5
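
The fix described in item 1 can be sketched as follows. This is a simplified illustration, not the Vitess throttler: only the name initMutex comes from the discussion above, while the throttler struct, the config type, and the applyConfig, open, and snapshot methods are hypothetical. The point is that every path that changes config or open state takes the same mutex, so a watcher callback and the open-time retry cannot interleave.

```go
package main

import (
	"fmt"
	"sync"
)

// config is a hypothetical stand-in for the topo-provided throttler config.
type config struct {
	Enabled   bool
	Threshold float64
}

// throttler is a toy model with just enough state to show the locking.
type throttler struct {
	initMutex sync.Mutex
	isOpen    bool
	isEnabled bool
	threshold float64
}

// applyConfig is the single entry point for config changes. Both a watcher
// callback and an open-time retry loop would call it, so both serialize on
// initMutex.
func (t *throttler) applyConfig(c config) {
	t.initMutex.Lock()
	defer t.initMutex.Unlock()
	t.threshold = c.Threshold
	t.isEnabled = c.Enabled
}

// open marks the throttler as open under the same mutex.
func (t *throttler) open() {
	t.initMutex.Lock()
	defer t.initMutex.Unlock()
	t.isOpen = true
}

// snapshot reads the state consistently.
func (t *throttler) snapshot() (isOpen, isEnabled bool, threshold float64) {
	t.initMutex.Lock()
	defer t.initMutex.Unlock()
	return t.isOpen, t.isEnabled, t.threshold
}

func main() {
	thr := &throttler{}
	var wg sync.WaitGroup

	// Simulate several watcher callbacks racing with Open().
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			thr.applyConfig(config{Enabled: true, Threshold: 30})
		}()
	}
	wg.Add(1)
	go func() {
		defer wg.Done()
		thr.open()
	}()
	wg.Wait()

	isOpen, isEnabled, threshold := thr.snapshot()
	fmt.Println(isOpen, isEnabled, threshold)
}
```

Running a sketch like this under the Go race detector (go run -race) is a quick way to confirm that no unsynchronized access remains.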

Side Note: I saw that some of the tests that look at log messages became flaky as they would sometimes see this as the most recent log message instead of what they expected: WatchSrvKeyspace clearing cached entry for. It appears that this was likely caused by us at times trying to watch the SrvKeyspace record when it did not (yet) exist. This would then lead to a NoNode error from the topo server and that log message in this scenario. So I elided that useless message when it occurs from that code path (end result being that value is already nil'd out when that code block hits) in bbbb351

mattlord and others added 4 commits June 1, 2023 18:54
Signed-off-by: Matt Lord <mattalord@gmail.com>
This reverts commit 9aef276 as this
change was not correct.

Signed-off-by: Matt Lord <mattalord@gmail.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
@shlomi-noach

Here's a summary of my findings when investigating the vreplication e2e test failures -- please let me know if you think I'm wrong/mistaken on any part(s).

@mattlord thank you for the summary! I have further refinements, pushed in ba7aa05, all relating to the newly introduced IsOpen()/IsRunning() functions:

  1. The various throttler queries (SHOW VITESS_THROTTLER STATUS, SHOW VITESS_THROTTLED_APPS, ALTER VITESS_MIGRATION .. [THROTTLE|UNTHROTTLE]) should all be able to run as long as the throttler IsOpen(), irrespective of whether the throttler IsEnabled(). The user should be allowed to throttle a migration even if the throttler is disabled at that moment. Likewise, you should definitely be able to see throttled apps/status when the throttler is disabled.
  2. There is an old, existing safety mechanism in Online DDL that mitigates the scenario where a streamer reads from a replica. That scenario is now fixed in Tablet throttler: be explicit about client app name, exempt some apps from checks and heartbeat renewals #13195, but for compatibility we have to keep this specific Online DDL mitigation in place in v17. So far so good. However, a slight simplification is that the mitigation is only required when the throttler IsRunning(): ba7aa05#diff-059c9f46e8d270d9c5514ef2b08679035eb0daaa8d95074e34ef43a81d50dc37L3596
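
The two points above amount to choosing the right predicate for each caller, roughly as sketched here. This is an illustration only: the throttlerState type and the two helper methods are hypothetical, and defining IsRunning() as "open and enabled" is my reading of the discussion rather than a quote of the implementation.

```go
package main

import "fmt"

// throttlerState is a hypothetical, lock-free model of the two flags.
type throttlerState struct {
	open    bool
	enabled bool
}

func (s throttlerState) IsOpen() bool    { return s.open }
func (s throttlerState) IsEnabled() bool { return s.enabled }

// IsRunning is assumed here to mean "open AND enabled".
func (s throttlerState) IsRunning() bool { return s.open && s.enabled }

// canServeThrottlerQueries models point 1: status and throttle/unthrottle
// queries only need the throttler to be open.
func (s throttlerState) canServeThrottlerQueries() bool { return s.IsOpen() }

// needsOnlineDDLMitigation models point 2: the legacy mitigation is only
// relevant while the throttler is actually running.
func (s throttlerState) needsOnlineDDLMitigation() bool { return s.IsRunning() }

func main() {
	states := []throttlerState{
		{open: false, enabled: false},
		{open: false, enabled: true},
		{open: true, enabled: false},
		{open: true, enabled: true},
	}
	for _, s := range states {
		fmt.Printf("open=%v enabled=%v queries=%v mitigation=%v\n",
			s.IsOpen(), s.IsEnabled(),
			s.canServeThrottlerQueries(), s.needsOnlineDDLMitigation())
	}
}
```

The interesting row is open but disabled: throttler queries are served, while the Online DDL mitigation is skipped.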

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
@shlomi-noach

Doc updates are in vitessio/website#1489. Removing the NeedsWebsiteDocsUpdate label.

@shlomi-noach shlomi-noach removed NeedsWebsiteDocsUpdate What it says NeedsIssue A linked issue is missing for this Pull Request labels Jun 5, 2023
shlomi-noach and others added 2 commits June 5, 2023 12:52
…t-for functions, and (2) issue different queries per goroutine

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Matt Lord <mattalord@gmail.com>

@mattlord mattlord left a comment

Yay! ❤️

@shlomi-noach shlomi-noach merged commit 8d3a1fb into vitessio:main Jun 5, 2023
@shlomi-noach shlomi-noach deleted the throttler-config-default-enable branch June 5, 2023 16:00
@vitess-bot

vitess-bot bot commented Jun 5, 2023

I was unable to backport this Pull Request to the following branches: release-17.0.

shlomi-noach added a commit to planetscale/vitess that referenced this pull request Jun 5, 2023
…ion message for old flags (vitessio#13130)

* Table throttler: --throttler-config-via-topo now defaults to 'true'

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* add deprecation message

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* endtoend tests: remove '--enable-lag-throttler' and use 'UpdateThrottlerConfig' everywhere

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* always use vtctldclient

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* use cluster.VtctldClientProcess

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* disable --throttler-config-via-topo in old throttler tests

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* Remove --throttler-config-via-topo where used, since it now defaults 'true'

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* fix vreplication cluster setup, waiting for throttler config to apply

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* changelog

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* extend throttler threshold

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* a bit more verbose

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* fixed CLI test

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* remove old '--enable-lag-throttler' flag, introduce '--heartbeat_on_demand_duration'

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* more log info in throttler.Open()

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* more logging

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* Revert to --heartbeat_enable

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* Protect throttler config change application with initMutex

And in e2e test update the throttler config on the keyspace
when it's created. Only wait for the new tablets in a shard
to have the throttler enabled when adding a Shard.

Signed-off-by: Matt Lord <mattalord@gmail.com>

* More CI testing

Signed-off-by: Matt Lord <mattalord@gmail.com>

* CI testing cont

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Yes...

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Somebody doesn't like force pushes so msg here

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Increase on-demand heartbeat duration from 10s to 1m

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Use only on-demand heartbeats everywhere

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Use same throttler config everywhere

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Update all keyspaces and don't fail test on missing JSON keys

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Use constant heartbeats in vrepl e2e tests

Until vitessio#13175 is
fixed.

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Increase workflow command timeout

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Don't wait for throttler on non-serving primaries

Signed-off-by: Matt Lord <mattalord@gmail.com>

* vitessio#13175 is fixed, therefore re-instating on-deman heartbeats

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* Added ToC

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* Tweak comment and kick CI

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Treat isOpen as the ready/running signal.

Also align all initMutex usage.

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Re-adjust comment

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Adjust CheckIsReady() to match OnlineDDL's expectation/usage

This was only using IsReady() before, now it's using
IsOpen() and IsReady().

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Get rid of log messages from SrvKeyspaceWatcher when no node/key

Signed-off-by: Matt Lord <mattalord@gmail.com>

* More corrections/tweaks

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Use more convenient/clear new IsRunning function

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Revert "Use more convenient/clear new IsRunning function"

This reverts commit 9aef276 as this
change was not correct.

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Further fix correct use of IsOpen(), IsRunning(), IsEnabled()

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* throttler.throttledApps cannot be nil

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* minor refactory/beautify for test

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* fix flakiness of tabletmanager_throttler_topo test by: (1) proper wait-for functions, and (2) issue different queries per goroutine

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* Fix typo in release notes

Signed-off-by: Matt Lord <mattalord@gmail.com>

---------

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Matt Lord <mattalord@gmail.com>
Co-authored-by: Matt Lord <mattalord@gmail.com>
mattlord added a commit that referenced this pull request Jun 5, 2023
…ion message for old flags (#13130) (#13237)

arvind-murty pushed a commit to arvind-murty/vitess that referenced this pull request Jun 7, 2023
…ion message for old flags (vitessio#13130)

* Table throttler: --throttler-config-via-topo now defaults to 'true'

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* add deprecation message

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* endtoend tests: remove '--enable-lag-throttler' and use 'UpdateThrottlerConfig' everywhere

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* always use vtctldclient

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* use cluster.VtctldClientProcess

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* disable --throttler-config-via-topo in old throttler tests

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* Remove --throttler-config-via-topo where used, since it now defaults 'true'

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* fix vreplication cluster setup, waiting for throttler config to apply

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* changelog

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* extend throttler threshold

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* a bit more verbose

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* fixed CLI test

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* remove old '--enable-lag-throttler' flag, introduce '--heartbeat_on_demand_duration'

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* more log info in throttler.Open()

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* more logging

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* Revert to --heartbeat_enable

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* Protect throttler config change application with initMutex

And in e2e test update the throttler config on the keyspace
when it's created. Only wait for the new tablets in a shard
to have the throttler enabled when adding a Shard.

Signed-off-by: Matt Lord <mattalord@gmail.com>

* More CI testing

Signed-off-by: Matt Lord <mattalord@gmail.com>

* CI testing cont

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Yes...

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Somebody doesn't like force pushes so msg here

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Increase on-demand heartbeat duration from 10s to 1m

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Use only on-demand heartbeats everywhere

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Use same throttler config everywhere

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Update all keyspaces and don't fail test on missing JSON keys

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Use constant heartbeats in vrepl e2e tests

Until vitessio#13175 is
fixed.

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Increase workflow command timeout

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Don't wait for throttler on non-serving primaries

Signed-off-by: Matt Lord <mattalord@gmail.com>

* vitessio#13175 is fixed, therefore reinstating on-demand heartbeats

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* Added ToC

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* Tweak comment and kick CI

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Treat isOpen as the ready/running signal.

Also align all initMutex usage.

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Re-adjust comment

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Adjust CheckIsReady() to match OnlineDDL's expectation/usage

This was only using IsReady() before, now it's using
IsOpen() and IsReady().

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Get rid of log messages from SrvKeyspaceWatcher when no node/key

Signed-off-by: Matt Lord <mattalord@gmail.com>

* More corrections/tweaks

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Use more convenient/clear new IsRunning function

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Revert "Use more convenient/clear new IsRunning function"

This reverts commit 9aef276 as this
change was not correct.

Signed-off-by: Matt Lord <mattalord@gmail.com>

* Further fixes for correct use of IsOpen(), IsRunning(), IsEnabled()

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* throttler.throttledApps cannot be nil

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* minor refactor/beautify for test

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* fix flakiness of tabletmanager_throttler_topo test by: (1) proper wait-for functions, and (2) issue different queries per goroutine

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

* Fix typo in release notes

Signed-off-by: Matt Lord <mattalord@gmail.com>

---------

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Matt Lord <mattalord@gmail.com>
Co-authored-by: Matt Lord <mattalord@gmail.com>
JiekerTime pushed a commit to JiekerTime/vitess that referenced this pull request Mar 8, 2024
* fix conflicts

* merge origin v18.0.0

* Back to dev mode after v18.0.0 (vitessio#14475)

* Release of v18.0.0 (vitessio#14406)

* [release-18.0] VReplication: Handle multiple streams in UpdateVReplicationWorkflow RPC (vitessio#14447) (vitessio#14468)

* [release-18.0] VDiff: "show all" should only report vdiffs for the specified keyspace and workflow (vitessio#14442) (vitessio#14466)

* [release-18.0] TestStreamMigrateMainflow: fix panic in test (vitessio#14425)

* [release-18.0] vtgate: Allow more errors for the warning check (vitessio#14421) (vitessio#14423)

* [release-18.0] servenv: Remove double close() logic (vitessio#14457) (vitessio#14459)

* [release-18.0] viper: register dynamic config with both disk and live (vitessio#14453) (vitessio#14455)

* [release-18.0] vtgate/engine: Fix race condition in join logic (vitessio#14435) (vitessio#14441)

* tx_throttler: remove topo watchers metric (vitessio#14444)

* [release-18.0] VDiff tablet selection: pick non-serving tablets in Reshard workflows (vitessio#14413) (vitessio#14418)

* Code freeze of release-18.0 (vitessio#14405)

* [release-18.0] Online DDL: lint DDL strategy flags (vitessio#14373) (vitessio#14399)

* [release-18.0] tuple: serialized form (vitessio#14392) (vitessio#14394)

* [release-18.0] schemadiff: fix missing `DROP CONSTRAINT` in duplicate/redundant constraints scenario. (vitessio#14387) (vitessio#14391)

* [release-18.0] Performance Fixes for Vitess 18 (vitessio#14383) (vitessio#14393)

* [release-18.0] Vtctld SwitchReads: fix bug where writes were also being switched as part of switching reads when all traffic was switched using SwitchTraffic (vitessio#14360) (vitessio#14379)

* [release-18.0] VDiff: wait for shard streams of one table diff to complete before starting that of the next table (vitessio#14345) (vitessio#14382)

* [release-18.0] [Docker] Fix VTadmin build  (vitessio#14363) (vitessio#14378)

* [release-18.0] Fix backup on s3 like storage (vitessio#14311) (vitessio#14362)

* [release-18.0] use aggregation engine over distinct engine when overlapping order by (vitessio#14359) (vitessio#14361)

* [release-18.0] Automatic approval of `vitess-bot` clean backports (vitessio#14352) (vitessio#14357)

* [release-18.0] evalengine: Misc bugs (vitessio#14351) (vitessio#14354)

* [release-18.0] OnlineDDL: reduce vrepl_stress workload in forks (vitessio#14302) (vitessio#14349)

* [release-18.0] VReplication: Add --all-cells flag to create sub-commands (vitessio#14341) (vitessio#14343)

* [release-18.0] Incremental backup: fix race condition in reading 'mysqlbinlog' output (vitessio#14330) (vitessio#14335)

* [release-18.0] release notes: edit summary for consistency (vitessio#14319) (vitessio#14320)

* [release-18.0] Bump @babel/traverse from 7.21.4 to 7.23.2 in /web/vtadmin (vitessio#14304) (vitessio#14308)

* [release-18.0] Rename Foreign Key enum values in VSchema  and drop `FK_` prefix (vitessio#14274) (vitessio#14299)

* [release-18.0] VReplication: error on vtctldclient commands w/o tablet types (vitessio#14294) (vitessio#14298)

* [release-18.0] Make vtctldclient mount command more standard (vitessio#14281) (vitessio#14283)

* [release-18.0] VReplication: Add traffic state to vtctldclient workflow status output (vitessio#14280) (vitessio#14282)

* [release-18.0] fix: analyze statement parsing and planning (vitessio#14268) (vitessio#14275)

* [release-18.0] Bypass cobra completion commands so they still function (vitessio#14217) (vitessio#14234)

* [release-18.0] Bump golang.org/x/net from 0.14.0 to 0.17.0 (vitessio#14260) (vitessio#14264)

* Add vtctldclient info to the 18.0 summary (vitessio#14259)

* [release-18.0] Bump postcss from 8.4.21 to 8.4.31 in /web/vtadmin (vitessio#14173) (vitessio#14258)

* [release-18.0] Bump github.com/cyphar/filepath-securejoin from 0.2.3 to 0.2.4 (vitessio#14239) (vitessio#14253)

* [release-18.0] Tablet throttler: fix race condition by removing goroutine call (vitessio#14179) (vitessio#14198)

* [release-18.0] fix:  insert with negative value (vitessio#14244) (vitessio#14247)

* [release-18.0] Fix anonymous paths in cobra code-gen (vitessio#14185) (vitessio#14238)

* [release-18.0] Throttler: set timeouts on gRPC communication and on topo communication (vitessio#14165) (vitessio#14167)

* [release-18.0] Move all examples to vtctldclient (vitessio#14226) (vitessio#14241)

* [release-18.0] VReplication: Add missing info to vtctldclient workflow SHOW output (vitessio#14225) (vitessio#14240)

* [release-18.0] Upgrade the Golang version to `go1.21.3` (vitessio#14230)

* [release-18.0] Optimize the GetWorkflows RPC (vitessio#14212) (vitessio#14233)

* [Release 18.0]: Online DDL: timeouts for all gRPC calls (vitessio#14182) (vitessio#14189)

* [release-18.0] Migrate Materialize command to vtctldclient (vitessio#14184) (vitessio#14214)

* [Release 18.0] Backport  of vitessio#17174 (vitessio#14210)

* [release-18.0] Upgrade the Golang version to `go1.21.2` (vitessio#14195)

* [release-18.0] Migrate CreateLookupVindex and ExternalizeVindex to vtctldclient (vitessio#14086) (vitessio#14183)

* Back to dev mode after `v18.0.0-rc1` release (vitessio#14169)

* Release of v18.0.0-rc1 (vitessio#14136)

* [release-18.0] docker: add dedicated vtorc container (vitessio#14126) (vitessio#14148)

* [release-18.0] gen4: Support explicit column aliases on derived tables (vitessio#14129) (vitessio#14156)

* Code freeze of release-18.0 (vitessio#14131)

* VTGate FK stress tests suite: improvements (vitessio#14098)

* DDL execution to commit open transaction (vitessio#14110)

* Summary changes for foreign keys (vitessio#14112)

* Move subqueries to use the operator model (vitessio#13750)

* servenv: Allow for explicit bind address (vitessio#13188)

* vtorc: add detected_problems counter (vitessio#13967)

* fix bad copy-paste in zkctld docgen (vitessio#14123)

* VDiff: Cleanup the controller for a VDiff before deleting it (vitessio#14107)

* Remove deprecated flags before `v18.0.0` (vitessio#14071)

* Remove FOSSA Test from CI until we can do it in a secure way (vitessio#14119)

* go/cmd/vtbackup: wait for plugins to finish initializing (vitessio#14113)

* bugfix: change column name and type to json (vitessio#14093)

* `vtctld`/`vtorc`: improve reparenting stats (vitessio#13723)

* E2E Fuzzing testing for foreign keys (vitessio#13980)

* Backup/restore: provision and restore a tablet with point-in-time recovery flags (vitessio#13964)

* anonymize homedirs in generated docs (vitessio#14101)

* VDiff: properly split cell values in record when using TabletPicker (vitessio#14099)

* miscellaneous cobras (vitessio#14069)

* Reduce network pressure on multi row insert (vitessio#14064)

* [cli] cobra zookeeper (vitessio#14094)

* switch casing in onlineddl subcommand help text (vitessio#14091)

* Fix Fk verification and update queries to accommodate for bindVariables being NULL (vitessio#14061)

* actually test vtcombo (vitessio#14095)

* VDiff: Migrate client command to vtctldclient (vitessio#13976)

* ci: pool-related test flakiness (vitessio#14076)

* Improve the rewriter to simplify more queries (vitessio#14059)

* update docgen to embed commit ID in autogenerated doc frontmatter (vitessio#14056)

* [CLI] cobra lots of things (vitessio#14007)

* Add VSchema DDL support for dropping sequence and auto increment (vitessio#13882)

* json: Fix quoting JSON keys (vitessio#14066)

* Flakes: Address TestMigrate Failures (vitessio#12866)

* [cli] migrate mysqlctl and mysqlctld to cobra (vitessio#13946)

* evalengine: Mark UUID() function as non-constant (vitessio#14051)

* Fix cascading Delete failure while using Prepared statements (vitessio#14048)

* remove query_analyzer binary and release (vitessio#14055)

* go/vt/mysqlctl: instrument s3 upload time (vitessio#12500)

* Add session flag for stream execute grpc api (vitessio#14046)

* Endtoend: stress tests for VTGate FOREIGN KEY support (vitessio#13799)

* metrics: change vtbackup_duration_by_phase to binary-valued vtbackup_phase (vitessio#12973)

* TableGC: support DROP VIEW (vitessio#14020)

* Support arbitrary ZooKeeper config lines (vitessio#13829)

* Bump protobufjs from 7.2.3 to 7.2.5 in /web/vtadmin (vitessio#13833)

* Bump tough-cookie and @cypress/request in /vitess-mixin/e2e (vitessio#13768)

* Remove excessive logging in transactions (vitessio#14021)

* Fix bug in `fileNameFromPosition` test helper (vitessio#13778)

* MoveTables Cancel: drop denied tables on target when dropping source/target tables (vitessio#14008)

* java: update to latest dependencies for grpc and protobuf (vitessio#13996)

* OnlineDDL: cleanup cancelled migration artifacts; support `--retain-artifacts=<duration>` DDL strategy flag (vitessio#14029)

* moved timeout test to different package (vitessio#14028)

* fix: cost to include subshard opcode (vitessio#14023)

* VReplication VPlayer: set foreign_key_checks on initialization (vitessio#14013)

* go/cmd/vtbackup: report replication status metrics during catch-up phase (vitessio#13995)

* Fix the `SELECT` query we run on the child table to verify that update is allowed on a RESTRICT constraint (vitessio#13991)

* fix data race in join engine primitive olap streaming mode execution (vitessio#14012)

* test: added test to check binlogs to contain the cascade events (vitessio#13970)

* Fix `TestLeftJoinUsingUnsharded` and remove instability when running E2E locally (vitessio#13973)

* Enable failures in `tools/e2e_test_race.sh` and fix races (vitessio#13654)

* Improve release process documentation (vitessio#14000)

* Make `Static Code Checks Etc` fail if the `./changelog` folder is out-of-date (vitessio#14003)

* Fix foreign key plan tests expectation (vitessio#13997)

* Flakes: Add recently added 'select rows_copied' query to ignore list (vitessio#13993)

* Foreign key cascade: retain "for update" lock on select query plans (vitessio#13985)

* Fix `NOT IN` expression used in the SET NULL for a child table on an update  (vitessio#13988)

* consolidate docs (vitessio#13959)

* VDiff: correct handling of default source and target cells (vitessio#13969)

* [cobra] vtgate and vttablet (vitessio#13943)

* handle large number of predicates without timing out (vitessio#13979)

* Fix missing deprecated flags in `vttablet` and `vtgate` (vitessio#13975)

* wrangler,workflow/workflow: materialize from intersecting source shards based on primary vindexes (vitessio#13782)

* Cache v3 (vitessio#13939)

* Implement Reshard in vtctldclient (vitessio#13792)

* Disallow Insert with Duplicate key update and Replace Into queries on foreign key column, set locks on fk queries (vitessio#13953)

* End to end testing suite for foreign keys (vitessio#13870)

* copy over existing vreplication rows copied to local counter if resuming from another tablet (vitessio#13949)

* vtctldclient OnlineDDL: support `throttle`, `unthrottle` (vitessio#13916)

* Change internal vindex type recommendation for integrals to xxhash (vitessio#13956)

* VTOrc converts a tablet to DRAINED type if it detects errant GTIDs on it (vitessio#13873)

* Fix `ApplySchema --batch-size` with ` --allow-zero-in-date` (vitessio#13951)

* Tablet throttler: empty list of probes on non-leader (vitessio#13926)

* VReplication: Handle SQL NULL and JSON 'null' correctly for JSON columns (vitessio#13944)

* [vtctld] more cobra binaries (vitessio#13930)

* Merge branch 'planetscale-fk-verify-update-planning'

* [main] Upgrade the Golang version to `go1.21.1` (vitessio#13933)

* update fk error messages

* cascade with new value should work

* small refactor - removed unused code

* return verify error based on type

* store the verification type

* update plan output test

* small refactor

* feat: add code to verify validity of ON UPDATE RESTRICT foreign keys wherever required

* Rewrite `USING` to `ON` condition for joins (vitessio#13931)

* test: fix tests to reflect the recent changes

* feat: add code to verify update with cascade run with foreign key checks 0 and validate all the foreign keys on vtgate

* add where condition to fk verify query

* changed parent verification query and accordingly change the engine primitive for foreign key constraint verification

* foreign key verify operator and logical

* refactor: add comment explaining map in input

* add support for foreign key constraint verify on update

* refactor: move DML logic to sql_builder.go (vitessio#13920)

* OnlineDDL: fix nil 'completed_timestamp' for cancelled migrations (vitessio#13928)

* Silence 'CheckThrottler' gRPC calls (vitessio#13925)

* migrate vtorc to use cobra commands (vitessio#13917)

* MoveTables:  allow copying all tables in a single atomic copy phase cycle (vitessio#13137)

* proto: Faster clone (vitessio#13914)

* Properly support ignore_nulls in CreateLookupVindex (vitessio#13913)

* [staticcheck] Last few staticchecks! (vitessio#13909)

* icuregex: Update to ICU 73 (vitessio#13912)

* gen4: Fast aggregations (vitessio#13904)

* Use correct syntax in test (vitessio#13907)

* Misc Local Install improvements. (vitessio#13446)

* Consolidate helper functions for working with proto3 time messages (vitessio#13905)

* Add vtsql flags to vtadmin (vitessio#13674)

* MoveTables: add flag to specify that routing rules should not be created when a movetables workflow is created (vitessio#13895)

* [staticcheck] miscellaneous tidying (vitessio#13892)

* [misc] tidy imports (vitessio#13885)

* Remove duplicate ACL check in tabletserver handleHTTPConsolidations (vitessio#13876)

* vtexplain: Fix passing through context for cleanup (vitessio#13900)

* [staticcheck] Cleanup deprecations (vitessio#13898)

* inputs method to return additional information about the input primitive (vitessio#13883)

* vttablet: do not notify `vtgate` about internal tables (vitessio#13897)

* Skip launchable if the Pull Request is marked as a Draft (vitessio#13886)

* vtctldclient: support OnlineDDL `complete`, `launch` commands  (vitessio#13896)

* collations: implement collation dumping as a docker image (vitessio#13879)

* sqlparser: Tablespace option is case sensitive (vitessio#13884)

* Clean up deprecated slice header usage and unused code (vitessio#13880)

* [misc] Delete more unused functions, tidy up dupe imports (vitessio#13878)

* collations: Refactor to separate basic collation information from data (vitessio#13868)

* [wrangler] cleanup unused functions (vitessio#13867)

* Fix setup order to avoid races (vitessio#13871)

* Add Foreign key verify constraint engine primitive (vitessio#13848)

* Foreign key cascade planning for DELETE and UPDATE queries (vitessio#13823)

* Fix merge conflict with new tests (vitessio#13869)

* vtctldclient OnlineDDL CANCEL (vitessio#13860)

* Add leak checking for vtgate tests (vitessio#13835)

* Fix for "text type with an unknown/unsupported collation cannot be hashed" error (vitessio#13852)

* Go 1.21 cleanups (vitessio#13862)

* VTGate Buffering: Use a more accurate heuristic for determining if we're doing a reshard (vitessio#13856)

* docker/bootstrap: remove --no-cache flag (vitessio#13785)

* Flakes: Improve reliability of vreplication_copy_parallel test (vitessio#13857)

* [main] Upgrade the Golang version to `go1.21.0` (vitessio#13853)

* Fix regular expression issue in Golang Upgrade and remove `release-14.0` from target branch (vitessio#13846)

* Migrates most workflows to 4 and 16 cores Large GitHub-Hosted-Runners (vitessio#13845)

* [onlineddl] retry and cleanup (vitessio#13830)

* Bump word-wrap from 1.2.3 to 1.2.4 in /web/vtadmin (vitessio#13569)

* Bump tough-cookie from 4.1.2 to 4.1.3 in /web/vtadmin (vitessio#13767)

* Skip VTAdmin build in Docker tests (vitessio#13836)

* Flakes: Synchronize access to logErrStacks in vterrors (vitessio#13827)

* Flakes: VReplication unit tests: reduce goroutine leakage (vitessio#13824)

* Remove explicit usage of etcd v2 (api and storage) (vitessio#13791)

* Add OnlineDDL show support (vitessio#13738)

* VReplication: Improve MoveTables Create Error Handling (vitessio#13737)

* Add 2 new metrics with tablet type labels (vitessio#13521)

* Add 2 more durability policies that allow RDONLY tablets to send semi-sync ACKs (vitessio#13698)

* CI: Misc test improvements to limit failures with various runners (vitessio#13825)

* Flakes: empty vtdataroot before starting a new vreplication e2e test (vitessio#13803)

* Fix `BackupShard` to get its options from its own flags (vitessio#13813)

* Flakes: skip flaky check that ETA for a VReplication VDiff2 Progress command is in the future. (vitessio#13804)

* Copy release notes for v17.0.2 and v16.0.4 (vitessio#13811)

* More union merging (vitessio#13743)

* Add Foreign key Cascade engine primitive (vitessio#13802)

* Add support for tuple as value type (vitessio#13800)

* Foreign Keys: `UPDATE` planning (vitessio#13762)

* Improving random query generation for endtoend testing (vitessio#13460)

* Flakes: Delete VTDATAROOT files in reparent test teardown within CI (vitessio#13793)

* vtgate: fix race condition iterating tables and views from schema tracker (vitessio#13673)

* Fixing `backup_pitr` flaky tests via wait-for loop on topo reads (vitessio#13781)

* Do not drain tablet in incremental backup (vitessio#13773)

* Address vttablet memory usage with backups to Azure Blob Service (vitessio#13770)

* Run auto golang upgrade only on vitessio/vitess (vitessio#13766)

* Flakes: remove non-determinism from vtctldclient MoveTables unit test (vitessio#13765)

* Minor --initialize-target-sequences followups (vitessio#13758)

* CI: fix onlineddl_scheduler flakiness (vitessio#13754)

* VReplication: Initialize Sequence Tables Used By Tables Being Moved (vitessio#13656)

* Bump docker images to `bullseye` (vitessio#13664)

* Use NodeJS v18 in VTAdmin Dockerfile (vitessio#13751)

* Refactor Expression and Statement Simplifier (vitessio#13636)

* Foreign Keys: `DELETE` planning (vitessio#13746)

* build: Allow passing in custom -ldflags (vitessio#13748)

* Point in time recovery: fix cross-tablet GTID evaluation (vitessio#13555)

* Cache info schema table info (vitessio#13724)

* BackupShard: support incremental backup (vitessio#13522)

* OnlineDDL: support @@migration_context in vtgate session. Use if non-empty (vitessio#13675)

* schemadiff: add time measure test for massive schema load and diff (vitessio#13697)

* Vtgate: pass 'SHOW VITESS_MIGRATIONS' to tablet's query executor (vitessio#13726)

* Move UNION planning to the operators (vitessio#13450)

* Fix vtcombo DBDDL plugin race condition (vitessio#13117)

* Foreign Keys: `INSERT` planning (vitessio#13676)

* go/vt/vitessdriver: implement driver.{Connector,DriverContext} (vitessio#13704)

* Backup: safe compressor/decompressor closure (vitessio#13668)

* sqlparser: Track if original default value is a literal (vitessio#13730)

* Solve RevertMigration.Comment read/write concurrency issue (vitessio#13700)

* `ApplySchema`: support `--batch-size` flag in 'direct' strategy (vitessio#13693)

* Fix closed channel `panic` in Online DDL cutover (vitessio#13729)

* mysql: Refactor dependencies (vitessio#13688)

* vtgate tablet gateway buffering: don't shutdown if not initialized (vitessio#13695)

* [OnlineDDL] add label so break works as intended (vitessio#13691)

* fastparse: Fix bug in overflow detection (vitessio#13702)

* Refactor code to remove `evalengine` as a dependency of `VTOrc` (vitessio#13642)

* `txthrottler`: remove `txThrottlerConfig` struct, rely on `tabletenv` (vitessio#13624)

* [vtctldclient] flags need to be defined to be deprecated (vitessio#13681)

* Add dry-run/monitoring-only mode for TxThrottler (vitessio#13604)

* Enhancing VTGate buffering for MoveTables and Shard by Shard Migration (vitessio#13507)

* Improvements to PRS (vitessio#13623)

* Errant GTID Metrics Refactor (vitessio#13670)

* Improve logging and renaming PrimaryTermStartTimestamp in vttablets (vitessio#13625)

* Fix a couple of logs in VTOrc (vitessio#13667)

* Throttler: exempt apps via `UpdateThrottlerConfig --throttle-app-exempt` (vitessio#13666)

* Tablet throttler: inter-checks via gRPC  (vitessio#13514)

* Reroute 'ALTER VITESS_MIGRATION ... THROTTLE ...' through topo (vitessio#13511)

* Build foreign key definition in schema tracker (vitessio#13657)

* evalengine: Fix JSON weight string computation (vitessio#13669)

* Throttler: verify deprecated flags are still allowed (vitessio#13615)

* Backup & Restore: vtctldclient to support PITR flags (vitessio#13513)

* `UpdateThrottlerConfig --unthrottle-app ...` (vitessio#13494)

* tx throttler: healthcheck all cells if `--tx-throttler-healthcheck-cells` is undefined (vitessio#12477)

* evalengine: Improve weight string support (vitessio#13658)

* mysqlctl: Reduce logging for running commands (vitessio#13659)

* Add v15.0.4, v16.0.3, and v17.0.1 changelogs (vitessio#13661)

* [viper WatchConfig] platform-specific write to ensure callback fires exactly once (vitessio#13627)

* go/mysql: switch to new API for x/exp/slices.SortFunc (vitessio#13644)

* Per workload TxThrottler metrics (vitessio#13526)

* VReplication: Make Source Tablet Selection More Robust (vitessio#13582)

* Update known issues in `v16.x` and `v17.0.0` (vitessio#13618)

* Augment VTOrc to also store the shard records and use it to better judge Primary recoveries (vitessio#13587)

* icuregex: Lazy load ICU data into memory (vitessio#13640)

* Ensure to call `servenv.Init` when needed (vitessio#13638)

* GetSchema: limit concurrent operations (vitessio#13617)

* check keyspace snapshot time if none specified for backup restores (vitessio#13557)

* vtgate buffering logic: remove the deprecated healthcheck based implementation (vitessio#13584)

* Fix type comparisons for Nullsafe* functions (vitessio#13605)

* Reintroduce `TestReadOutdatedInstanceKeys` with debugging information (vitessio#13562)

* mysqlctl: Remove noisy log line (vitessio#13599)

* Vtctldclient MoveTables (vitessio#13015)

* txthrottler: add metrics for topoWatcher and healthCheckStreamer (vitessio#13153)

* Reduce usages of old horizon planning fallback (vitessio#13595)

* Throttler: reintroduce deprecated flags so that deprecation actually works (vitessio#13597)

* Incremental backup & recovery: restore-to-timestamp (vitessio#13270)

* Fix potential panics due to "Fail in goroutine after test completed" (vitessio#13596)

* ignore all error for views in engine reload (vitessio#13590)

* vtgate table schema tracking to use GetSchema rpc (vitessio#13544)

* MoveTables sequence e2e tests: change terminology to use basic vs simple everywhere for partial movetables workflows (vitessio#13435)

* Skip VTAdmin build in more places (vitessio#13588)

* Fix show character set (vitessio#13565)

* Remove unused chromedriver (vitessio#13573)

* fix TestQueryTimeoutWithTables flaky test (vitessio#13579)

* CI: Fix make build related issues (vitessio#13583)

* docker/mini: remove refs to orc configs (vitessio#13495)

* stats: use *time.Ticker instead of time.After() (vitessio#13492)

* VReplication: Ensure ROW events are sent within a transaction (vitessio#13547)

* Add a `keyspace` configuration in the `vschema` for foreign key mode (vitessio#13553)

* vreplication: Move to use collations package (vitessio#13566)

* Flaky tests: Fix race in memory topo (vitessio#13559)

* Fix flaky vtgate test TestInconsistentStateDetectedBuffering (vitessio#13560)

* Flaky tests: Fix wrangler tests (vitessio#13568)

* Optimize `make build` in `test.go` and in CI (vitessio#13567)

* Unset the PREFIX environment variable when building VTAdmin (vitessio#13554)

* vtctldclient: Add missing new backup option (vitessio#13543)

* Skip flaky test `TestReadOutdatedInstanceKeys` (vitessio#13561)

* Fix a number of encoding issues when evaluating expressions with the evalengine (vitessio#13509)

* Deflake `TestPlannedReparentShardPromoteReplicaFail` (vitessio#13548)

* [vipersync] deflake TestWatchConfig (vitessio#13545)

* vtgate v3 planner removal (vitessio#13458)

* Fix flakiness in VTOrc tests (vitessio#13489)

* Replace deprecated `github.com/golang/mock` with `go.uber.org/mock` (vitessio#13512)

* Fix dependencies in docker build script (vitessio#13520)

* fix docgen for subcommands (vitessio#13518)

* Merge pull request vitessio#13515 from planetscale/partial-movetables-traffic-status

* Correct unit test

* docker/k8s: add bookworm builds (vitessio#13436)

* ignore ongoing backfill vindex from routing selection (vitessio#13505)

* Adjust function names for improved clarity

* More fixes related to partial traffic handling

* Ignore unrelated shards in partial movetables workflow status

* Better handling of vreplication setState() failure (vitessio#13488)

* flags: Remove hardcoded runner paths (vitessio#13482)

* viperutil: Remove potential cross site reflecting issue (vitessio#13483)

* skip flaky test (vitessio#13501)

* feat: remove --disable_active_reparents flag in vttablet-up.sh (vitessio#13504)

* fix: error.as method usage to send pointer to the reference type expected. (vitessio#13496)

* Fix `TestGatewayBufferingWhileReparenting` flakiness (vitessio#13469)

* ApplySchema: deprecate '--allow_long_unavailability' flag (vitessio#10717)

* Incremental backup: accept GTID position without 'MySQL56/' flavor prefix (vitessio#13474)

* Deprecating and removing tablet throttler CLI flags and tests (vitessio#13246)

* Online DDL: improved row estimation via ANALYZE TABLE with --analyze-table strategy flag (vitessio#13352)

* Tablet throttler: throttled app configuration via `vtctl UpdateThrottlerConfig` (vitessio#13351)

* added no-commit-collection option to launchable record build command (vitessio#13490)

* Fix remote VersionString API (vitessio#13484)

* Fix logging by omitting the host and port in `SetReadOnly` (vitessio#13470)

* `vtctl OnlineDDL`: complete command set (vitessio#12963)

* Random selection of keyspace based on available tablet (vitessio#13359)

* Improve and Fix Distinct Aggregation planner (vitessio#13466)

* Fix ubi8.arm64.mysql80 build package mirrorserver error (vitessio#13431)

* backup: Allow for upgrade safe backups (vitessio#13449)

* Merge pull request vitessio#13468 from timvaillancourt/examples-compose-fix-consul-tag

* Update a number of dependencies (vitessio#13031)

* Enable Tcp keep alive and provide keep alive period setting (vitessio#13434)

* rm mistaken commit

* Update docker-compose.beginners.yml too

* compose: fix `consul:latest` error

* Add support for kill statement (vitessio#13371)

* feat: remove excessive logging (vitessio#13459)

* schema.Reload(): ignore column reading errors for views only, error for tables (vitessio#13442)

* [CI] deflake viper sync tests (vitessio#13185)

* mysql: introduce icuregex package (vitessio#13391)

* Fix `Fakemysqldaemon` to store the host and port after `SetReplicationSource` call (vitessio#13439)

* Examples: only terminate vtadmin if it was started (vitessio#13433)

* [main] Upgrade-Downgrade Fix: Schema-initialization stuck on semi-sync ACKs while upgrading (vitessio#13411) (vitessio#13440)

* Move more horizon planning to the operators (vitessio#13412)

* Refactor backup_pitr into two distinct CI tests: builtin vs Xtrabackup (vitessio#13395)

* Improve VTOrc logging statements, now that we have alias as a field (vitessio#13428)

* vtorc: Cleanup more unused code (vitessio#13354)

* VReplication Workflows: make sequence tables follow routing rules (vitessio#13238)

* Ignore error while reading table data in Schema.Engine reload (vitessio#13421)

* Improve time taken to run the examples by optimizing `vtadmin` build (vitessio#13262)

* vtctl,vindexes: logs warnings and export stat for unknown vindex params (vitessio#13322)

* Add end-of-life documentation + re-organize internal documentation (vitessio#13401)

* Adding random query generation for endtoend testing of the Gen4 planner (vitessio#13260)

* Aggregation engine refactor (vitessio#13378)

* Deflake `TestQueryTimeoutWithDual` test (vitessio#13405)

* Optimize release notes generation to use GitHub Milestones (vitessio#13398)

* feat: don't run any reparent commands if the host is empty (vitessio#13396)

* `vttestserver`: persist vschema changes in `--persistent_mode` (vitessio#13065)

* Tablet throttler: only start watching SrvKeyspace once it's confirmed to exist (vitessio#13384)

* Support views in BaseShowTablesWithSizes for MySQL 8.0 (vitessio#13394)

* Fix incorrect output in release scripts (vitessio#13385)

* Adds support for ANY_VALUE (vitessio#13342)

* BaseShowTablesWithSizes: optimize MySQL 8.0 query (vitessio#13375)

* Forward port of release notes changes from v17.0.0 GA (vitessio#13370)

* feat: add timestamp to vtorc debug page (vitessio#13379)

* Prevent resetting replication every time we set replication source (vitessio#13377)

* Local example 101: idempotent on existing clusters (vitessio#13373)

* governance: clean up language, link steering doc from first occurrence instead of from a random occurrence (vitessio#13337)

* Add group_concat aggregation support (vitessio#13331)

* Improve VTOrc failure detection to be able to better handle dead primary failures (vitessio#13190)

* Improve lock action string (vitessio#13355)

* Add RestorePosition and RestoredBackupTime as metrics to vttablet (vitessio#13339)

* update link for reparenting guide (vitessio#13350)

* Handle inconsistent state error in query buffering (vitessio#13333)

* mysqlctl: Move more to use built in MySQL client (vitessio#13338)

* VTOrc: Update the primary key for all the tables from `hostname, port` to `alias` (vitessio#13243)

* docker/k8s: Cleanup done TODO (vitessio#13347)

* bug: don't always wrap aggregation in coalesce (vitessio#13348)

* Vttablet schema tracking: Fix _vt.schema_version corruption (vitessio#13045)

* bugfixes: collection of fixes to bugs found while fuzzing (vitessio#13332)

* Remove CI endtoend test for VReplication copy throttling (vitessio#13343)

* Use sqlparser for all dynamic query building in VDiff2 (vitessio#13319)

* txthrottler: verify config at vttablet startup, consolidate funcs (vitessio#13115)

* mysqlctl: Use DBA connection for schema operations (vitessio#13178)

* Bug fix: SQL queries erroring with message `unknown aggregation random` (vitessio#13330)

* Support complex aggregation in Gen4's Operators (vitessio#13326)

* [VTAdmin] Upgrade to use node 18.16.0 (vitessio#13288)

* Cleanup unused Dockerfile entries (vitessio#13327)

* Add metric for showing the errant GTIDs in VTOrc (vitessio#13281)

* vtgate planner: HAVING in the new operator horizon planner (vitessio#13289)

* vtgr: Remove deprecated vtgr (vitessio#13308)

* mysqlctl: Correctly encode database and table names (vitessio#13312)

* sqlparser: Add support for TIMESTAMPADD (vitessio#13314)

* Refactor and add a comment to schema initialisation code (vitessio#13309)

* Release notes for 17.0.0-rc2 (vitessio#13306)

* remove os.Exit (vitessio#13310)

* k8stopo: Remove the deprecated Kubernetes topo (vitessio#13303)

* vindexes: return unknown params (vitessio#12951)

* Operator planner refactor (vitessio#13294)

* Deprecate VTGR (vitessio#13301)

* k8stopo: Include deprecation warning (vitessio#13299)

* [main] Upgrade the Golang version to `go1.20.5` (vitessio#13256)

* evalengine: implement date/time math (vitessio#13274)

* increase size of reparent_journal columns (vitessio#13287)

* VReplication: Fix VDiff2 DeleteByUUID Query (vitessio#13255)

* Miscellaneous code modifications based on observations made while doing a code walkthrough (vitessio#12873)

* Set the number of threads for release notes generation with a flag (vitessio#13273)

* Fix and Make aggregation planner handle aggregation functions better (vitessio#13228)

* fix: ShardedRouting clone to clone slice of reference correctly (vitessio#13265)

* Remove viper warnings from local examples (vitessio#13234)

* Add flag to VTOrc to enable/disable its ability to run ERS (vitessio#13259)

* Remove `out.txt` and add `release-17.0` to go upgrade automation (vitessio#13261)

* build(deps-dev): bump vite from 4.2.1 to 4.2.3 in /web/vtadmin (vitessio#13240)

* Copy v17.0.0-rc changelog to main (vitessio#13248)

* fix: GetField to use existing session for query (vitessio#13219)

* Fix flakiness in `TestDeadPrimaryRecoversImmediately` (vitessio#13232)

* Augmenting the `GetSchema` RPC to also work for `Table` and `All` type of input (vitessio#13197)

* Incremental backup and point in time recovery for XtraBackup (vitessio#13156)

* VReplication: More intelligently manage vschema table entries on unsharded targets (vitessio#13220)

* Use $hostname in vtadmin script as other scripts do (vitessio#13231)

* Tablet throttler: throttler-config-via-topo defaults 'true', deprecation message for old flags (vitessio#13130)

* [ci] add generator for templated flag testdata (vitessio#13150)

* schemadiff: validating case-sensitive view names (vitessio#13208)

* Add security audit report (vitessio#13221)

* gentler warning message on config-not-found (vitessio#13215)

* Bump the vitess version on main (vitessio#13212)

* Handle DISTINCT with the new operators (vitessio#13201)

* Gen4: move insert planner to gen4 (vitessio#12934)
frouioui pushed a commit to planetscale/vitess that referenced this pull request Mar 26, 2024
…po defaults 'true', deprecation message for old flags (vitessio#2333)

* cherry pick of 13130

* resolved conflict

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>

---------

Signed-off-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Co-authored-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Labels
Component: TabletManager Type: Enhancement Logical improvement (somewhere between a bug and feature)