storage: More improvements for rebalancing/compacting when disks are nearly full #22235
Conversation
though someone else should put eyes on this as well. The growing number of thresholds is something we should take a look at in 2.1; the interactions are becoming difficult to predict.

pkg/storage/compactor/compactor.go, line 334 at r9:
Not sure if
@bdarnell can you spare a second set of eyes?
Indeed, it's getting ugly in places.

pkg/storage/compactor/compactor.go, line 334 at r9: Previously, petermattis (Peter Mattis) wrote…
That's fair. Changed.
Alright, so the first round of real cluster testing with this looked better than the control group running master, but still wasn't good. My test was to set up two 6-node clusters (one running this change, one running master), put a 5GB ballast file on the 10GB disk of 3 of the nodes in each cluster, and then run
My next step is to try to understand what's going on in these compactions that's making them fill the disk (e.g. by examining the RocksDB manifest), whether it's expected behavior, and what we can do about it. I'm planning to put that off a bit and focus on #19985 today, though, so if anyone has thoughts in the meantime I'm all ears.
Another thing to consider is that you're running on really small disks, so the breathing room provided by the fractional usage thresholds (
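To make the breathing-room point concrete, here is a minimal sketch of what a fractional usage threshold looks like. The function name, the 95% fraction, and the absolute floor are illustrative assumptions, not the compactor's actual thresholds:

```go
package main

import "fmt"

// diskIsNearlyFull reports whether used capacity exceeds the given
// fraction of total capacity, or free space has dropped below an
// absolute floor. Both knobs are illustrative, not the values used
// by the actual compactor.
func diskIsNearlyFull(used, total int64, maxFraction float64, minFreeBytes int64) bool {
	if float64(used) > maxFraction*float64(total) {
		return true
	}
	return total-used < minFreeBytes
}

func main() {
	const gb = int64(1 << 30)
	// On a 10 GB disk, a 95% fractional threshold leaves only ~0.5 GB
	// of breathing room, which is why small disks hit trouble sooner.
	fmt.Println(diskIsNearlyFull(9*gb+gb/2+1, 10*gb, 0.95, gb/4)) // prints true
	// On a 1 TB disk the same fraction leaves ~50 GB free.
	fmt.Println(diskIsNearlyFull(900*gb, 1024*gb, 0.95, gb/4)) // prints false
}
```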
Is this by our Compactor or by RocksDB? If it was the latter then

pkg/storage/compactor/compactor.go, line 100 at r10:
@petermattis is your take that this and #21866 should go in or that they should wait until after we understand/fix the problems I ran into when trying to recover from the full disks?
By our compactor.
I think this can go in as-is given that it's a definite improvement over the current state. It would be nice to fully understand the problem with recovering from full disks, but I'm also a believer in incremental progress.
Yeah, it's very possible that's contributing to the problems here. I'll mention that in the follow-up issue.

pkg/storage/compactor/compactor.go, line 100 at r10: Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
Done.
I'm not sure whether we should also do this for raft snapshots -- is it better for a node to run out of disk or get stuck in a behind state that causes other nodes to keep trying to send it snapshots? Release note: None
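To illustrate the trade-off being weighed above, here is a hypothetical guard that declines an incoming snapshot when applying it would push the store past a capacity threshold. The type, function, and the 95% figure are all assumptions for the sketch, not CockroachDB's actual API or limits:

```go
package main

import "fmt"

// storeCapacity is a stand-in for a store's capacity report.
type storeCapacity struct {
	Capacity  int64 // total disk capacity in bytes
	Available int64 // free bytes remaining
}

// shouldAcceptSnapshot returns false when applying a snapshot of the
// given size would push disk usage past an illustrative 95% threshold.
// Declining keeps the node alive but leaves the replica behind, so
// peers will keep retrying the snapshot -- the dilemma in the text.
func shouldAcceptSnapshot(sc storeCapacity, snapshotBytes int64) bool {
	const maxFractionUsed = 0.95
	usedAfter := sc.Capacity - sc.Available + snapshotBytes
	return float64(usedAfter) <= maxFractionUsed*float64(sc.Capacity)
}

func main() {
	const gb = int64(1 << 30)
	// Plenty of headroom: accept.
	fmt.Println(shouldAcceptSnapshot(storeCapacity{Capacity: 10 * gb, Available: 2 * gb}, gb/2)) // prints true
	// Nearly full: applying even a small snapshot crosses the line.
	fmt.Println(shouldAcceptSnapshot(storeCapacity{Capacity: 10 * gb, Available: gb / 2}, gb/4)) // prints false
}
```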
Touches cockroachdb#21400 Release note: Free up disk space more aggressively when the disk is closer to full.
This helps all nodes' allocators have more up-to-date capacity information sooner after a significant change. Touches cockroachdb#21400 Release note: None
Follow-up to #21866. Only the last three commits are new. This is the full version that I'll start testing on a real cluster with small disks.
Fixes #21400 (pending testing on a real cluster)