Tracing GC for Backends #24
(loop []
  (when-let [addr (pop-from-work-queue! gc-scratch)]
    (observe-addr! gc-scratch addr)
    (let [node (hh/resolve addr)]
This `resolve` should only be necessary if the current node is an index node. However, we currently can't tell from an address whether we're looking at an index or data node, so we also resolve all data nodes, which is terrible for perf. `IResolve` should have a method `index?` or something, so that we can dramatically reduce the IO by never reading data nodes into memory.
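To make the suggestion concrete, here is a minimal sketch of a trace loop that only resolves index nodes. Everything in it is hypothetical and not part of hitchhiker-tree: the `IAddress` protocol, its `index?` and `resolve-node` methods, the `Addr` record, and the in-memory `store` map are stand-ins, and it assumes the address itself can record which kind of node it points at.

```clojure
;; Counts simulated storage reads so the saving is visible.
(def reads (atom 0))

;; Hypothetical protocol: not the real IResolve from hitchhiker-tree.
(defprotocol IAddress
  (index? [addr] "True if the address points at an index node.")
  (resolve-node [addr store] "Reads the node from storage (the expensive step)."))

;; Assumption: the address carries a flag recording the node kind.
(defrecord Addr [id index-node?]
  IAddress
  (index? [_] index-node?)
  (resolve-node [_ store]
    (swap! reads inc)
    (get store id)))

;; Toy store: index nodes list child addresses, data nodes hold values.
(def store
  {:root {:children [(->Addr :a true) (->Addr :d3 false)]}
   :a    {:children [(->Addr :d1 false) (->Addr :d2 false)]}
   :d1   {:values [1]}
   :d2   {:values [2]}
   :d3   {:values [3]}})

(defn trace
  "Returns the set of ids reachable from root, resolving only index nodes."
  [store root]
  (loop [queue [root]
         seen  #{}]
    (if-let [addr (peek queue)]
      (let [queue (pop queue)]
        (cond
          ;; Already observed: nothing to do.
          (contains? seen (:id addr))
          (recur queue seen)

          ;; Index node: must be read to discover its children.
          (index? addr)
          (let [node (resolve-node addr store)]
            (recur (into queue (:children node)) (conj seen (:id addr))))

          ;; Data node: mark it live without reading it.
          :else
          (recur queue (conj seen (:id addr)))))
      seen)))
```

With this toy store, `(trace store (->Addr :root true))` marks all five nodes as live while only reading the two index nodes.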
The test builds two random b-trees, then uses one of them as a GC root and includes the other in the lazy sequence of "all keys" to be dealt with by the collector. After running a GC, we assert that the `deleted-fn` has been invoked against all the "dead" nodes.
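The shape of that test can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `reachable`, `sweep!`, and the `graph` map (node id to child ids) are made-up names standing in for the real trees, the backend key listing, and the collector.

```clojure
(defn reachable
  "Mark phase: returns the set of ids reachable from root in graph,
  where graph maps a node id to a collection of child ids."
  [graph root]
  (loop [queue [root]
         seen  #{}]
    (if-let [id (peek queue)]
      (if (contains? seen id)
        (recur (pop queue) seen)
        (recur (into (pop queue) (get graph id)) (conj seen id)))
      seen)))

(defn sweep!
  "Sweep phase: calls deleted-fn on every key that is not live and
  returns the set of dead keys."
  [all-keys live deleted-fn]
  (let [dead (remove live all-keys)]
    (run! deleted-fn dead)
    (set dead)))

;; Two disjoint "trees": one is the GC root, the other is garbage.
(def graph
  {:live-root [:l1 :l2]
   :l1        []
   :l2        []
   :dead-root [:x1]
   :x1        []})

;; Records which keys deleted-fn was invoked on.
(def deleted (atom #{}))

(defn run-gc!
  "Marks from :live-root, then sweeps over all keys in graph."
  []
  (sweep! (keys graph)
          (reachable graph :live-root)
          #(swap! deleted conj %)))
```

After `(run-gc!)`, `@deleted` should contain exactly the nodes of the second tree, which is the assertion the test makes.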
I had a stab at writing some tests for this. It shook out what appear to be a couple of bugs in the implementation.
Thank you for the testing! I want to ponder this a bit more, and then merge it so that it can be consumed by other backends :)
This pulls in datacrypt-project#24, since I'd like to test out the GC too.
Based partially on datacrypt-project#24

Rewritten to primarily use core.async.

* src/hitchhiker/tree/konserve.cljc (create-id): new function; prepends the current timestamp as hex to the UUID key.
(KonserveBackend.-write-node): use create-id to generate the storage ID.
* src/hitchhiker/tree/tracing-gc/konserve.cljc: new namespace.
* src/hitchhiker/tree/tracing-gc.cljc: new namespace.
* .gitignore: ignore IntelliJ files.
* project.clj: update konserve to 0.6.0-SNAPSHOT.
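For readers unfamiliar with the idea, a timestamp-prefixed key could look something like the sketch below. It is a guess at the general shape only: the commit does not show the actual `create-id` body, so the 16-digit zero-padded hex width, the `-` separator, the two-arity signature, and the `id-timestamp` helper are all assumptions made for illustration (JVM Clojure only, whereas the real file is `.cljc`).

```clojure
(defn create-id
  "Returns a storage key made of a hex timestamp followed by the UUID.
  The key layout here is assumed, not taken from the commit."
  ([uuid]
   (create-id (System/currentTimeMillis) uuid))
  ([millis uuid]
   ;; Zero-padding keeps keys sortable by creation time as plain strings.
   (str (format "%016x" millis) "-" uuid)))

(defn id-timestamp
  "Hypothetical helper: parses the millisecond timestamp back out of a
  key produced by create-id."
  [id]
  (Long/parseLong (subs id 0 16) 16))
```

For example, `(id-timestamp (create-id (java.util.UUID/randomUUID)))` gives back the time, in milliseconds, at which the key was created.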
This is a sketch of how tracing GC could work for backends