Add hooks for storage backend #4
Comments
As stated in Marcela's email:
My thought on this issue is I would like to implement a db interface (like the |
I think this could work. Then the idea would be that the user would implement the interface for their specific key-value backend? Doesn't this conflict with the notion that we would provide a mechanism for reconstructing the tree from the KV store? The reason being that in order to reconstruct the tree from a KV store you would have to make certain assumptions about how the tree data is stored in it, or am I missing something?
You could consider storing the diffs of trees only to save space and not save redundant copies of data that hasn't changed across epochs. The reconstruction might be a bit trickier then because you have to reassign the right node pointers. Another option that may save more space would be to only store the leaf nodes (and you could even go as far as only storing the changed leaf nodes). But then you have to reconstruct the entire tree from the leaves doing a bunch of insertions. The question then becomes: is it more efficient to do many pointer reassignments or many insert operations? |
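To make the "store only the changed leaves" option concrete, here is a minimal sketch in Go. It assumes a hypothetical `Leaf` type and `DiffStore`, neither of which is taken from the library, and it uses a plain map as a stand-in for the real tree, so each map write below corresponds to one tree insertion during reconstruction.

```go
package main

import "sort"

// Leaf is a hypothetical (index, value) pair; a real leaf node would carry more fields.
type Leaf struct {
	Index string
	Value string
}

// DiffStore keeps, per epoch, only the leaves that changed in that epoch.
type DiffStore struct {
	diffs map[uint64][]Leaf
}

func NewDiffStore() *DiffStore {
	return &DiffStore{diffs: make(map[uint64][]Leaf)}
}

// Record appends the leaves that changed in the given epoch.
func (s *DiffStore) Record(epoch uint64, changed ...Leaf) {
	s.diffs[epoch] = append(s.diffs[epoch], changed...)
}

// Rebuild replays epochs 0..epoch in order and returns the leaf set as of
// that epoch, sorted by index. Values from later epochs overwrite earlier ones.
func (s *DiffStore) Rebuild(epoch uint64) []Leaf {
	state := make(map[string]string)
	for e := uint64(0); e <= epoch; e++ {
		for _, l := range s.diffs[e] {
			state[l.Index] = l.Value // stands in for one tree insertion
		}
	}
	out := make([]Leaf, 0, len(state))
	for idx, val := range state {
		out = append(out, Leaf{Index: idx, Value: val})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Index < out[j].Index })
	return out
}
```

With this layout the cost of reconstruction grows with the total number of recorded changes up to the requested epoch, which is the "many insert operations" side of the trade-off raised above.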
Actually, the data storing mechanism is implemented in the lib. The user would only need to implement the db interface (i.e., implement how to store a (key, value) pair in the db). Is that right?
This approach looks mostly the same as how we currently create the tree [1], and it might also be a bit easier to implement compared to the first one. So I think I will do some benchmarks to compare these two approaches. |
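As a rough illustration of the split described here (the lib decides what is stored, the user decides where), the following Go sketch may help. The `KV` interface, `MemKV`, `StoreLeaf`, `ErrNotFound`, and the `"leaf/"` key prefix are all hypothetical names chosen for this example and are not the library's actual API.

```go
package main

import (
	"errors"
	"sync"
)

// ErrNotFound is returned by Get when the key is absent (hypothetical name).
var ErrNotFound = errors.New("kv: key not found")

// KV is a hypothetical minimal interface that a user would implement
// for their own key-value backend.
type KV interface {
	Put(key, value []byte) error
	Get(key []byte) ([]byte, error)
}

// MemKV is an in-memory stand-in for a real backend such as LevelDB.
type MemKV struct {
	mu   sync.RWMutex
	data map[string][]byte
}

func NewMemKV() *MemKV {
	return &MemKV{data: make(map[string][]byte)}
}

func (m *MemKV) Put(key, value []byte) error {
	m.mu.Lock()
	defer m.mu.Unlock()
	// Copy the value so later changes by the caller do not alter the store.
	m.data[string(key)] = append([]byte(nil), value...)
	return nil
}

func (m *MemKV) Get(key []byte) ([]byte, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	v, ok := m.data[string(key)]
	if !ok {
		return nil, ErrNotFound
	}
	return append([]byte(nil), v...), nil
}

// StoreLeaf shows the library side: it chooses the key layout
// (here an assumed "leaf/" prefix plus the index), while the backend
// only persists the resulting pair.
func StoreLeaf(db KV, index, value []byte) error {
	key := append([]byte("leaf/"), index...)
	return db.Put(key, value)
}
```

A caller would create a backend with `NewMemKV()` (or their own `KV` implementation) and pass it to `StoreLeaf`. Because the key layout lives in the lib, the lib can later read the same keys back to reconstruct the tree regardless of which backend is plugged in.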
I understand that the data storing mechanism would be implemented in the lib. What doesn't make sense to me is that the user needs to implement how to store a kv-pair when the lib provides the storing mechanism. The fact that the lib specifies what/how the data is stored means that we have already made a decision for the user how the data will be stored, so what else does the user still have to do? Since we're already specifying the storing mechanism, wouldn't it make more sense to provide hooks for specific KV stores (i.e. LevelDB) and the user then only needs to pass in their specific db configuration?
I would suggest starting with reconstructing from the leaves (the second method), since, like you said, this should be easier to implement. Once this is working, and if there is time, we can look into reconstructing from tree diffs to see if this method improves the performance of reconstructing from leaves in the db. |
The backend storage is implemented in 15606c5d64fa15683c7ec0e2229ecfdf80128e09.
Developers can now construct the tree from the db or do lookups in the STR that is stored in persistent storage.
|
This is a good start! I'm going to make some specific comments in the commit, but here are some other comments:
This seems fine with me since this lets us reconstruct individual paths from the DB.
I'm assuming the application for this function is when the server has been restarted and we want to initialize the in-memory PAD with an existing tree?
Is the idea that this would be called whenever developers need to do a lookup from the db instead of the in-memory STR? I think it would be nice to abstract this away by calling |
In terms of code organization, I think we should separate the |
That's right. This is what
That's also right. It should be called in
Great! |
@c633 What do you think of d3a2388? In addition to refactoring, I also renamed The other thing is tests: |
Yes. I understand that you separated the kv-related code into other source files so that we could add other db engines (e.g., a relational db) in the future, right? If so, I think it makes sense! |
Yes, that's the main reason why I wanted to separate them. I'm glad you think it makes sense! |
I merged the changes in |
You read my mind! I was just writing to ask if I should handle it. Thanks for taking care of it! |
TODO
|
Right now, trees are kept in memory, and we're assuming that the user will set up her own storage backend and save the tree. To improve scalability, we should provide hooks for a few popular storage backends. We may also want to provide a mechanism for serializing trees into files on disk for this purpose.