PurgeOldVersions+RunValueLogGC does not seem to still work for me immediately #444

Closed
fingon opened this issue Mar 21, 2018 · 9 comments
@fingon

fingon commented Mar 21, 2018

Given a long-running process (that only uses short-lived db.View and db.Update transactions), calling db.PurgeOlderVersions and then db.RunValueLogGC(0.5) does not cause any data to be deleted, even if I set and subsequently delete almost every key. I do have a 'few' (5) vlog files and their associated SSTs in this case.

Backup of the database is minimal at this stage (16MB as opposed to 5GB on-disk size; restored size is 37MB).

However, if I start a fresh Badger db instance and do the same (PurgeOlderVersions + RunValueLogGC), it behaves as expected. Is there anything to care about other than 'The caller must make sure that there are no long-running read transactions running before this function is called, otherwise they will not work as expected.'?

I might have short-lived read transactions colliding with PurgeOlderVersions, but most likely not; the behavior seems repeatable, and at least my reading of that comment is that it does not blow things up altogether..? (The documentation on this could be a bit more precise anyway.)

@janardhan1993

@fingon: Can you please share a reproducible example?

@fingon
Author

fingon commented Mar 23, 2018

I will try to come up with one; it comes from using the relatively minimal API in https://github.com/fingon/go-tfhfs/blob/master/storage/badger/badger.go from a number of goroutines in parallel, and eventually the flush part simply stops doing anything (RunValueLogGC always returns the error of not having done anything).

@tobiasrm

tobiasrm commented Apr 15, 2018

I have a similar problem using badger to store an index for faster data lookup. After some trials, I tested the cleaning by simply storing 100,000 string keys/values (without overwrites) using the following code and the write/delete example code from the BadgerDB documentation; the cleaning is always done afterwards.

for i := 0; i < 100000; i++ {
	k := strconv.Itoa(i)
	WriteToDB(k, k, db)
	DeleteFromDB(k, db) // toggled between runs: commented out for "puts only"
}
// Cleaning runs once after the loop; the returned errors are ignored here.
db.PurgeOlderVersions()
db.RunValueLogGC(0.3)

Results:

  • Puts only: 8.7 MB
  • Puts and deletes directly afterwards: 20.2 MB

It seems to me that the cleaning didn't work and the delete transactions blew up the value log (makes sense if deletes are not removed). However, the descriptions on the badger db project site mentioned that purging old versions + GC should clean deletes, too.
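
One way to picture why the put-and-delete run ends up larger than the put-only run is a toy append-only log in which a delete is itself a record (a tombstone) appended to the log. The sketch below is only an illustration under that assumption; the `appendLog` and `record` types are hypothetical and do not reflect Badger's on-disk format.

```go
package main

import "fmt"

// record is one entry in a toy append-only log. A delete is stored as a
// tombstone: a record that carries the key and no value.
type record struct {
	key, value string
	tombstone  bool
}

// appendLog is a hypothetical stand-in for a value log; records are only
// ever appended, never removed in place.
type appendLog struct {
	records []record
}

// Put appends a key/value record.
func (l *appendLog) Put(k, v string) {
	l.records = append(l.records, record{key: k, value: v})
}

// Delete appends a tombstone for the key.
func (l *appendLog) Delete(k string) {
	l.records = append(l.records, record{key: k, tombstone: true})
}

// Size returns the payload bytes held by the log (keys plus values).
func (l *appendLog) Size() int {
	n := 0
	for _, r := range l.records {
		n += len(r.key) + len(r.value)
	}
	return n
}

// Live returns how many keys are still visible after replaying the log.
func (l *appendLog) Live() int {
	live := map[string]bool{}
	for _, r := range l.records {
		if r.tombstone {
			delete(live, r.key)
		} else {
			live[r.key] = true
		}
	}
	return len(live)
}

func main() {
	putsOnly, putsAndDeletes := &appendLog{}, &appendLog{}
	for i := 0; i < 1000; i++ {
		k := fmt.Sprint(i)
		putsOnly.Put(k, k)
		putsAndDeletes.Put(k, k)
		putsAndDeletes.Delete(k)
	}
	fmt.Println("puts only:       ", putsOnly.Size(), "bytes,", putsOnly.Live(), "live keys")
	fmt.Println("puts and deletes:", putsAndDeletes.Size(), "bytes,", putsAndDeletes.Live(), "live keys")
}
```

In this model the second log holds no live keys yet is larger than the first, and it only shrinks once something rewrites the log and drops the dead records.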

Can you confirm that this is unusual and, if yes, do you have a bugfix?
My current DB size grows to multiple GB for ~500MB of production data, whereas the uncompressed JSON-serialized data structure is a mere ~200MB ...

@jiminoc
Contributor

jiminoc commented Apr 25, 2018

I've experienced this as well and put a repro case in #464

@manishrjain
Contributor

@tobiasrm : So, I created this code: https://gist.github.com/manishrjain/647cc6ea41d2c10769a61f8c517dddae

and tested the write-delete pairs, where all keys written are deleted. With the latest change #471, the SSTable compaction discards most keys, leaving a 212-byte SSTable.

$ badger info --dir=.                                                             ~/badgertest
Listening for /debug HTTP requests at port: 8080

[2018-05-01T06:11:29-07:00] MANIFEST       48 B MA
[                      now] 000002.sst    212 B L1
[                      now] 000000.vlog   20 MB VL

[Summary]
Level 0 size:          0 B
Level 1 size:        212 B
Total index size:    212 B
Value log size:      20 MB

Abnormalities: None.
0 extra files.
0 missing files.
0 empty files.
0 truncated manifests.
SSTable [L1, 002] [216261646765722168656164, v200001     ->           3939393939, v200000    ]

Note that the value log is at 20MB. GC would not reclaim the latest value log, which is a read-write log. It only works on the value logs which have become "immutable". So, in a long-running process, you'll see the value logs being reclaimed over time.
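
The rule described here (GC skips the latest, still-writable value log and only considers immutable ones) can be sketched as a small selection function. This is a simplified model, not Badger's implementation: the `vlogFile` type, its exact stale-byte counts, and `gcCandidates` are hypothetical, and the discard ratio is assumed to mean "rewrite a file when at least this fraction of it is stale".

```go
package main

import "fmt"

// vlogFile is a hypothetical summary of one value log file.
type vlogFile struct {
	id         int
	totalBytes int64
	staleBytes int64 // bytes belonging to deleted or overwritten entries
}

// gcCandidates returns the ids of files a GC pass could rewrite: every file
// except the newest one (assumed to be still open for writes) whose stale
// fraction is at least discardRatio. files must be ordered oldest to newest.
func gcCandidates(files []vlogFile, discardRatio float64) []int {
	var ids []int
	if len(files) < 2 {
		return ids // only the active log exists, so nothing is eligible
	}
	for _, f := range files[:len(files)-1] {
		if f.totalBytes == 0 {
			continue
		}
		if float64(f.staleBytes)/float64(f.totalBytes) >= discardRatio {
			ids = append(ids, f.id)
		}
	}
	return ids
}

func main() {
	// A single 20 MB log that is entirely stale: still not eligible,
	// because it is the active log.
	single := []vlogFile{{id: 0, totalBytes: 20 << 20, staleBytes: 20 << 20}}
	fmt.Println("single log candidates:", gcCandidates(single, 0.5))

	// Three logs: only the older, mostly stale one qualifies.
	several := []vlogFile{
		{id: 0, totalBytes: 100, staleBytes: 90},
		{id: 1, totalBytes: 100, staleBytes: 10},
		{id: 2, totalBytes: 100, staleBytes: 100},
	}
	fmt.Println("three log candidates: ", gcCandidates(several, 0.5))
}
```

Under this model, the 20 MB `000000.vlog` shown above would never be picked while it is the only value log, no matter how much of it is stale.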

Also, if you want to avoid the storage costs of the value log, you could increase the ValueThreshold option

ValueThreshold: 20,

to always store values along with the LSM tree. That way, whenever GC runs, it would be able to reclaim the entire value log file (minus the latest one).
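
As a rough illustration of what raising ValueThreshold changes, the sketch below models the placement decision as a plain length comparison. The `placeValue` function and `placement` type are hypothetical, the strict less-than comparison is an assumption about how the option is applied, and the value sizes and the threshold of 1024 are made up for the example.

```go
package main

import "fmt"

// placement says where a value would be stored in this toy model.
type placement string

const (
	inLSM      placement = "LSM tree"
	inValueLog placement = "value log"
)

// placeValue keeps values shorter than valueThreshold inline with the key in
// the LSM tree and sends everything else to the value log.
// Assumption: the comparison is strictly less-than.
func placeValue(value []byte, valueThreshold int) placement {
	if len(value) < valueThreshold {
		return inLSM
	}
	return inValueLog
}

func main() {
	small := make([]byte, 10)
	medium := make([]byte, 100)

	for _, threshold := range []int{20, 1024} {
		fmt.Printf("threshold %4d: 10-byte value -> %s, 100-byte value -> %s\n",
			threshold, placeValue(small, threshold), placeValue(medium, threshold))
	}
}
```

With the larger threshold both example values stay in the LSM tree, so in this model nothing in the value log would need to be kept for them.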

I think the above-mentioned PR is a resolution to this issue. If you have other concerns, feel free to reopen or create a new issue.

@fingon
Author

fingon commented May 2, 2018

At least I still encounter the same issue using badger from commit 754278d; unfortunately, the project I was working on is stalled, so I am not really motivated to tease a minimal test case out of it.

What I am doing:

  • Set ~10k key=value pairs with a small key and ~64kb value size each, each in its own sub-Update, semi-parallel (concurrency of ~5)
  • Sequentially call Delete on the keys (again in their own sub-Updates, if it matters)
  • Even after a while, calling PurgeOlderVersions + RunValueLogGC does nothing.

Re-opening the database cleans it.

@manishrjain
Contributor

It sounds like you'd have one or two value logs, so value log GC won't touch them. Also, 10K keys would barely fill even one memtable. In other words, there's not much data to clean up.

@fingon
Author

fingon commented May 2, 2018

To test that with a smaller dataset, I've set somewhat smaller config values ( opts.ValueLogFileSize = 1 << 27 ), so in that case there are 6 value logs; still no go, all of them stay even with 0 values visible in the database. This also happened on a larger scale back when I opened the bug (~50GB of data in 400 value log files).

Being lazy, I did only the smaller test on this most recent version.

Restarting and running the same steps again cleans the value logs.
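
As a back-of-the-envelope check on the number of value logs, the arithmetic can be written out as below. It is a lower-bound estimate under stated assumptions: "64kb" is read as 64 KiB, keys, per-entry overhead and delete records are ignored, and the 1 << 30 file size is just an assumed larger setting for comparison. The `estimateValueLogs` helper is hypothetical.

```go
package main

import "fmt"

// estimateValueLogs returns how many value log files of fileSize bytes are
// needed to hold numValues values of valueSize bytes each (rounded up).
// It ignores keys, per-entry overhead, and delete records.
func estimateValueLogs(numValues, valueSize, fileSize int64) int64 {
	total := numValues * valueSize
	return (total + fileSize - 1) / fileSize
}

func main() {
	const (
		numValues = 10000
		valueSize = 64 << 10 // ~64kb values, read as 64 KiB
	)
	fmt.Println("files at 1<<27:", estimateValueLogs(numValues, valueSize, 1<<27))
	fmt.Println("files at 1<<30:", estimateValueLogs(numValues, valueSize, 1<<30))
}
```

The estimate comes out at 5 files for the 1 << 27 setting, which is in the same range as the 6 observed once overhead and delete records are counted, and at a single file for the assumed 1 << 30 size.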

@manishrjain
Contributor

If you can share a working code example, I could run it and see if I can reproduce it. Otherwise, it's hard to tell what's going on. My experimentation based on https://gist.github.com/manishrjain/647cc6ea41d2c10769a61f8c517dddae showed things working as expected.
