PurgeOldVersions+RunValueLogGC does not seem to still work for me immediately #444
Comments
@fingon: Can you please share a reproducible example? |
I will try to come up with one. The problem shows up when using the relatively minimal API in https://github.com/fingon/go-tfhfs/blob/master/storage/badger/badger.go from a number of goroutines in parallel: eventually the flush part simply stops doing anything (RunValueLogGC always returns the error saying that nothing was done). |
I have a similar problem using badger to store an index for some faster data lookup. After some trials,
Results:
It seems to me that the cleaning didn't work and the delete transactions blew up the value log (which makes sense if deletes are not removed). However, the description on the Badger project site says that purging old versions plus GC should clean up deletes, too. Can you confirm that this is unusual and, if so, do you have a bugfix? |
I've experienced this as well and put a repro case in #464 |
@tobiasrm : So, I created this code: https://gist.github.com/manishrjain/647cc6ea41d2c10769a61f8c517dddae and tested the write-delete pairs, where all keys written are deleted. With the latest change #471, the SSTable compaction discards most keys leaving a 212 byte sstable.
Note that the value log is at 20MB. GC would not reclaim the latest value log, which is the read-write log; it only works on value logs that have become "immutable". So, in a long-running process, you'll see the value logs being reclaimed over time. Also, if you want to avoid the storage costs of the value log, you could increase the option referenced at Line 120 in e597fb7.
I think the above-mentioned PR is a resolution to this issue. If you have other concerns, feel free to reopen or create a new issue. |
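To illustrate the point about "immutable" value logs, here is a toy model in Go. It is only a sketch of the rule as described in the comment above, not Badger's actual file-selection code: `gcCandidates` is a hypothetical helper, and the assumption that the file with the highest id is the active read-write log is mine.

```go
package main

import (
	"fmt"
	"sort"
)

// gcCandidates is a hypothetical helper (not part of Badger). It models the
// rule described above: the newest value log is still being written to, so
// only the older, immutable files are eligible for GC.
// Assumption: the highest file id is the active read-write log.
func gcCandidates(fids []uint32) []uint32 {
	if len(fids) < 2 {
		// Zero or one value log: nothing is immutable yet.
		return nil
	}
	// Copy before sorting so the caller's slice is left untouched.
	sorted := append([]uint32(nil), fids...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	// Everything except the last (active) file is a candidate.
	return sorted[:len(sorted)-1]
}

func main() {
	fmt.Println(gcCandidates([]uint32{0}))       // single active log: no candidates
	fmt.Println(gcCandidates([]uint32{2, 0, 1})) // files 0 and 1 are candidates
}
```

Under this model, a database with a single value log never has anything for GC to rewrite, no matter how many of its keys were deleted.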
At least I still encounter the same issue using badger from commit 754278d. Unfortunately, the project I was working on is stalled, so I am not really motivated to tease a minimal test case out of it. What I am doing:
Re-opening database cleans it. |
It sounds like you'd have one or two value logs, so value log GC won't touch them. Also, 10K keys would barely fill even one memtable. In other words, there's not much data to clean up. |
To test that with a smaller dataset, I set somewhat smaller config values ( opts.ValueLogFileSize = 1 << 27 ), so in that case there are 6 value logs; still no go, all of them remain, with 0 values visible in the database. This also happened on a larger scale back when I opened the bug (~50GB of data in 400 value log files). Being lazy, I did only the smaller test on this most recent version. Restarting and running the same steps again cleans the value logs. |
If you can share a working code example, I could run it and see if I can reproduce it. Otherwise, it's hard to tell what's going on. My experimentation based on https://gist.github.com/manishrjain/647cc6ea41d2c10769a61f8c517dddae showed things working as expected. |
Given a long-running process (that only uses short-lived db.View and db.Update transactions), calling db.PurgeOldVersions and then db.RunValueLogGC(0.5) does not cause any data to be deleted, even if I set and subsequently delete almost every key. I do have a 'few' (5) vlog files and their associated ssts in this case.
A backup of the database is minimal at this stage (16MB as opposed to 5GB on-disk size; restored size is 37MB).
However, if I start a fresh Badger db instance and do the same (PurgeOldVersions + RunValueLogGC), it behaves as expected. Is there anything other than 'The caller must make sure that there are no long-running read transactions running before this function is called, otherwise they will not work as expected.' to care about?
I might have short-lived read transactions colliding with PurgeOldVersions, but most likely not; the behavior seems repeatable, and at least my reading of that comment is that it does not blow things up altogether..? (The documentation on this could be a bit more precise anyway.)
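For reference, the GC step described above can be sketched as a loop that keeps calling RunValueLogGC until it reports an error. This is a minimal sketch, not the code from the affected project: `valueLogGCer`, `drainValueLogGC`, `fakeDB` and `errNoRewrite` are hypothetical names introduced here so the example runs without a real database. It assumes each successful call rewrites at most one value log file; with a real db, PurgeOldVersions would be called before the loop.

```go
package main

import (
	"errors"
	"fmt"
)

// valueLogGCer is a hypothetical interface for this sketch. A Badger DB
// handle should satisfy it, since it exposes RunValueLogGC(float64) error.
type valueLogGCer interface {
	RunValueLogGC(discardRatio float64) error
}

// drainValueLogGC (hypothetical helper) calls RunValueLogGC until it returns
// an error. It reports how many calls succeeded and the error that ended
// the loop.
func drainValueLogGC(db valueLogGCer, discardRatio float64) (int, error) {
	rewrites := 0
	for {
		if err := db.RunValueLogGC(discardRatio); err != nil {
			return rewrites, err
		}
		rewrites++
	}
}

// errNoRewrite is a stand-in for the "nothing was cleaned up" error that
// the real call returns; the name and message are assumptions.
var errNoRewrite = errors.New("value log GC did not rewrite anything")

// fakeDB simulates a database with a fixed number of reclaimable files.
type fakeDB struct{ reclaimable int }

func (f *fakeDB) RunValueLogGC(discardRatio float64) error {
	if f.reclaimable == 0 {
		return errNoRewrite
	}
	f.reclaimable--
	return nil
}

func main() {
	n, err := drainValueLogGC(&fakeDB{reclaimable: 3}, 0.5)
	fmt.Println(n, err)
}
```

The symptom reported here corresponds to the loop returning 0 rewrites on the very first call, even though almost all keys have been deleted.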