Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify "preserving DELETE performance over time" docs #12230

Open
rmloveland opened this issue Nov 10, 2021 · 6 comments
Open

Clarify "preserving DELETE performance over time" docs #12230

rmloveland opened this issue Nov 10, 2021 · 6 comments

Comments

@rmloveland
Copy link
Contributor

rmloveland commented Nov 10, 2021

Richard Loveland (rmloveland) commented:

In the DELETE docs we have a section on preserving DELETE performance over time which needs to be updated. It provides the user with the option to take

one of the following approaches:

  • At each iteration, update the WHERE clause to filter only the rows that have not yet been marked for deletion. For an example, see Batch-delete on an indexed filter.
  • At each iteration, first use a SELECT statement to return primary key values on rows that are not yet deleted. Rows marked for deletion will not be returned. Then, use a nested DELETE loop over a smaller batch size, filtering on the primary key values. For an example, see Batch delete on a non-indexed column.
  • To iteratively delete rows in constant time, using a simple DELETE loop, you can alter your zone configuration and change gc.ttlseconds to a low value like 5 minutes (i.e., 300), and then run your DELETE statement once per GC interval.

However, we have users running into this problem via a support issue who have apparently tried (one of? all of?) these approaches and not found satisfaction.

Therefore we need to update this page to make it clearer. In particular, given that this is still a known behavior of CockroachDB when scanning over tombstones according to cockroachdb/cockroach#17229, perhaps none of these actions will actually help? Also it's possible that providing the "choose your own adventure" of three choices is less helpful than it could be, since users will end up having to try all three.

Estimated scope of work:

  • Seek out an updated recommendation from Eng folks on what users should do here; ideally, it will be one recommendation, since it's unlikely that users can be reasonably expected to try all three and see what happens, esp in prod
  • Determine if, in fact, this is just a known limitation and we should make weaker claims in this doc about how to "fix it", since maybe it cannot be fixed via something we can say in docs

See also:

Jira Issue: DOC-1128

@rmloveland
Copy link
Contributor Author

@mwang1026 @awoods187 FYI - the behavior is storage, but the docs are SQL. Lemme know which product area this should be in (I added both labels for now)

@mwang1026
Copy link
Contributor

@mwang1026 @awoods187 FYI - the behavior is storage, but the docs are SQL. Lemme know which product area this should be in (I added both labels for now)

Is this question more about IA or about who writes the doc?

@rmloveland
Copy link
Contributor Author

Not IA, less about who writes it and more about who (of you all) will be prioritizing it.

If it's Storage, it might (?) get done sooner.

If it goes on the SQL docs mega-backlog, maybe less soon. (But that's just guessing/speculation)

@awoods187
Copy link
Contributor

This looks like @vy-ton to me (consistent with her TTL/Bulk Deletes research). Vy do you agree?

@vy-ton
Copy link
Contributor

vy-ton commented Nov 11, 2021

+1, let's put this on SQL docs backlog

@github-actions
Copy link

We have marked this issue as stale because it has been inactive for
18 months. If this issue is still relevant, removing the stale label
or adding a comment will keep it active. Otherwise, we'll close it in
10 days to keep the issue queue tidy. Thank you for your contribution
to CockroachDB docs!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants