sql: migrate schema object removals and GC to DeleteRange
#70427
Comments
Reflecting on this a bit, the primary challenge will be the migration away from the GC job so that we can delete that code. My instinct is that the way to do it is to:
This is still hand-wavy. The problem with this is that it may take 2x the GC TTL for data to get removed. Seems fine.
Will we use
76203: batcheval: add range tombstone support for `DeleteRange` r=aliher1911,dt a=erikgrinaker

This patch adds a parameter `UseExperimentalRangeTombstone` for `DeleteRange`, which deletes the span using an MVCC range tombstone. This must only be called after checking `storage.CanUseExperimentalMVCCRangeTombstones()`, which ensures the `ExperimentalMVCCRangeTombstones` version gate and `COCKROACH_EXPERIMENTAL_MVCC_RANGE_TOMBSTONES` environment variable are set.

Did I mention that MVCC range tombstones are experimental? They are currently under active development, and are not respected by the KV or MVCC APIs, nor are they persisted. This patch simply sets up the plumbing for it.

Resolves #70415. Touches #70427.

Release note: None

Co-authored-by: Erik Grinaker <grinaker@cockroachlabs.com>
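As an illustration of how a caller might use this plumbing, here is a rough Go sketch. The field name on `DeleteRangeRequest`, the signature of `storage.CanUseExperimentalMVCCRangeTombstones`, and the import paths are assumptions inferred from the PR description above, not copied from the actual code:

```go
package gcsketch

import (
	"context"

	"github.com/cockroachdb/cockroach/pkg/kv"
	"github.com/cockroachdb/cockroach/pkg/roachpb"
	"github.com/cockroachdb/cockroach/pkg/settings/cluster"
	"github.com/cockroachdb/cockroach/pkg/storage"
	"github.com/cockroachdb/errors"
)

// deleteSpanWithRangeTombstone issues a DeleteRange over [start, end) that
// writes a single MVCC range tombstone instead of point tombstones.
// Hypothetical helper; field names and signatures are assumptions.
func deleteSpanWithRangeTombstone(
	ctx context.Context, db *kv.DB, st *cluster.Settings, start, end roachpb.Key,
) error {
	// The feature is gated behind a cluster version and an environment
	// variable, so callers must check the gate before setting the flag.
	if !storage.CanUseExperimentalMVCCRangeTombstones(ctx, st) {
		return errors.New("MVCC range tombstones are not enabled")
	}
	b := &kv.Batch{}
	b.AddRawRequest(&roachpb.DeleteRangeRequest{
		RequestHeader: roachpb.RequestHeader{Key: start, EndKey: end},
		// Delete the whole span with one MVCC range tombstone.
		UseExperimentalRangeTombstone: true,
	})
	return db.Run(ctx, b)
}
```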
Note that this does not change anything about tenant GC.

Fixes cockroachdb#70427

Release note (sql change): The asynchronous garbage collection process has been changed such that very soon after dropping a table, index, or database, or after refreshing a materialized view, the system will issue range deletion tombstones over the dropped data. These tombstones will result in the KV statistics properly counting these bytes as garbage. Before this change, the asynchronous "gc job" would wait out the TTL and then issue a lower-level operation to clear out the data. That meant that while the job was waiting out the TTL, the data would appear in the statistics to still be live. This was confusing.
85853: kv: ensure secondary tenants route follower reads to the closest replica r=arulajmani a=arulajmani

The dist sender uses node locality information to rank replicas of a range by latency. Previously, this node locality information was read off a node descriptor available in Gossip. Unfortunately, secondary tenants do not have access to Gossip, and as such, would end up randomizing this list of replicas. This manifested itself through unpredictable latencies when running follower reads.

We're no longer susceptible to this hazard with this patch. This is done by eschewing the need of a node descriptor from gossip in the DistSender; instead, we now instantiate the DistSender with locality information. However, we do still use Gossip to get the current node's ID when ranking replicas. This is done to ascertain if there is a local replica, and if there is, to always route to it. Unfortunately, because secondary tenants don't have access to Gossip, they can't conform to these semantics. They're susceptible to a hazard where a request may be routed to another replica in the same locality tier as the client even though the client has a local replica as well. This shouldn't be a concern in practice given the diversity heuristic.

Resolves #81000

Release note (bug fix): fix an issue where secondary tenants could route follower reads to a random, far away replica instead of one closer.

85878: gcjob: issue DeleteRange tombstones and then wait for GC r=ajwerner a=ajwerner

Note that this does not change anything about tenant GC.

Fixes #70427

Release note (sql change): The asynchronous garbage collection process has been changed such that very soon after dropping a table, index, or database, or after refreshing a materialized view, the system will issue range deletion tombstones over the dropped data. These tombstones will result in the KV statistics properly counting these bytes as garbage. Before this change, the asynchronous "gc job" would wait out the TTL and then issue a lower-level operation to clear out the data. That meant that while the job was waiting out the TTL, the data would appear in the statistics to still be live. This was confusing.

Co-authored-by: Arul Ajmani <arulajmani@gmail.com>
Co-authored-by: Andrew Werner <awerner32@gmail.com>
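To make the locality-based ranking in 85853 concrete, here is a small, self-contained Go sketch of the general idea: rank replicas by how many leading locality tiers they share with the client. This is an illustrative toy, not the actual `DistSender` code; the type and function names are made up.

```go
package main

import (
	"fmt"
	"sort"
)

// Tier is a single locality key/value pair, e.g. region=us-east1.
type Tier struct{ Key, Value string }

// matchingTiers counts how many leading locality tiers two localities share.
// More shared tiers generally implies lower network latency.
func matchingTiers(a, b []Tier) int {
	n := 0
	for i := 0; i < len(a) && i < len(b); i++ {
		if a[i] != b[i] {
			break
		}
		n++
	}
	return n
}

// rankByLocality sorts replicas so that those sharing the most locality tiers
// with the client come first, which is conceptually what locality-aware
// follower-read routing needs.
func rankByLocality(client []Tier, replicas [][]Tier) {
	sort.SliceStable(replicas, func(i, j int) bool {
		return matchingTiers(client, replicas[i]) > matchingTiers(client, replicas[j])
	})
}

func main() {
	client := []Tier{{"region", "us-east1"}, {"zone", "us-east1-b"}}
	replicas := [][]Tier{
		{{"region", "europe-west1"}, {"zone", "europe-west1-c"}},
		{{"region", "us-east1"}, {"zone", "us-east1-b"}},
		{{"region", "us-east1"}, {"zone", "us-east1-d"}},
	}
	rankByLocality(client, replicas)
	fmt.Println(replicas) // closest-matching locality first
}
```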
Currently, removal of schema objects such as tables, indexes, and tenants is done by changing the descriptor visibility and scheduling a job to remove the object data using `ClearRange` once the GC TTL expires. However, as described in #69380, when `DeleteRange` is changed to use MVCC range tombstones in #70415, we should use `DeleteRange` to remove these objects instead. This includes removing the schema GC jobs entirely and relying on MVCC GC to remove the data when the TTL expires (requires #70414). There is a much cheaper initial option in #76105.
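To summarize the before/after flow this issue is asking for, here is a minimal Go sketch; every name in it is a hypothetical placeholder standing in for the real gcjob, `ClearRange`, and `DeleteRange` machinery, not actual CockroachDB code.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// Span stands in for a dropped table or index key span.
type Span struct{ Start, End string }

// oldGCFlow: the schema GC job waits out the full GC TTL and only then issues
// a low-level ClearRange. Until then the dropped data still looks live in the
// KV statistics.
func oldGCFlow(ctx context.Context, sp Span, ttl time.Duration) {
	time.Sleep(ttl) // wait out the GC TTL before touching the data
	fmt.Printf("ClearRange over %v\n", sp)
}

// newGCFlow: immediately write an MVCC range tombstone over the span via
// DeleteRange, so the data counts as garbage right away, and rely on MVCC GC
// to physically remove it once the TTL expires (no schema GC job needed).
func newGCFlow(ctx context.Context, sp Span) {
	fmt.Printf("DeleteRange (MVCC range tombstone) over %v\n", sp)
	// MVCC GC removes the data underneath the tombstone after the GC TTL.
}

func main() {
	sp := Span{Start: "/Table/104", End: "/Table/105"}
	oldGCFlow(context.Background(), sp, 0)
	newGCFlow(context.Background(), sp)
}
```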
Related to #31563.
Epic CRDB-2624
Jira issue: CRDB-10070