Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ignore max-db-size limit when deleting data or writing aux informations #2047

Merged
merged 6 commits into from
Jan 30, 2024

Conversation

caipengbo
Copy link
Contributor

When we delete data in kvrocks, we will write delete makers to rocksdb. When our data size reaches max-db-size, we can't delete data.

For operations that delete data, we should ignore the max-db-size limit and allow the deletion to proceed, so we can clean up the data.

In addition, for some aux information(written into propagate column family), we should also ignore max-db-size.

@jihuayu
Copy link
Member

jihuayu commented Jan 24, 2024

Will this logic confuse users?
For example, when the data size reaches the max-db-size, if the user deletes a large amount of data, it will cause the db size to significantly exceed the db-size they have configured.

@caipengbo
Copy link
Contributor Author

caipengbo commented Jan 24, 2024

Will this logic confuse users?

@jihuayu There is also a risk of confusion, which we can explain in the configuration file. In addition, these delete makers will be eliminated during compaction, and the data size will decrease.

If max-db-size is not reached, users deleting large amounts of data will also cause disk explosion, this can also bother users.

@git-hulk
Copy link
Member

git-hulk commented Jan 24, 2024

@caipengbo The better way is to check if it reached the max db size if it's a write command before running. And add an extra flag to identify if the write command is allowed to run even though reached the limit of db size. What you do think about this?

@caipengbo
Copy link
Contributor Author

The better way is to check if it reached the max db size if it's a write command. And add an extra flag to identify if the write command is allowed to run even though reached the limit of db size.

That sounds better, I'll try to modify it.

@jihuayu
Copy link
Member

jihuayu commented Jan 24, 2024

@caipengbo The better way is to check if it reached the max db size if it's a write command before running. And add an extra flag to identify if the write command is allowed to run even though reached the limit of db size. What you do think about this?

I think we can make some reserved db space, such as:

  • When free db space is below 100MB, only delete commands write and system write can work.
  • When free db space is below 10MB, only system write can work.

What do you think? @git-hulk @caipengbo

@caipengbo
Copy link
Contributor Author

caipengbo commented Jan 24, 2024

Your idea is similar to what I did on our personalized version kvrocks. @jihuayu

It's just that we don't want to save disk space(we reserve some disk space), we want to avoid OOM, because if we delete a lot of data, we can create a very large WAL, which caches a lot of data in memory and very easy to cause OOM(we have encountered it), and even worse, we can't restart our program, because we need to replay the WAL at startup. At this point, all unflushed data will be loaded into memory, again leading to an OOM!

So we find all the places in the program where kvrocks can generate a large WAL, check it, and if it's larger than max-write-batch-size, we return an error so the user can't do it.

This is a feature I plan to implement in some subsequent PRs (about memory control).

@caipengbo caipengbo requested review from jihuayu and git-hulk and removed request for git-hulk January 26, 2024 03:35
git-hulk and others added 3 commits January 27, 2024 17:36
@git-hulk git-hulk merged commit fd33e55 into apache:unstable Jan 30, 2024
29 checks passed
Copy link

Quality Gate Passed Quality Gate passed

The SonarCloud Quality Gate passed, but some issues were introduced.

1 New issue
0 Security Hotspots
63.2% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarCloud

JoverZhang pushed a commit to JoverZhang/kvrocks that referenced this pull request Feb 24, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants