Move parents cache to disk and enable it by default #855

Closed
schomatis opened this issue Aug 30, 2019 · 3 comments

Comments

@schomatis
Contributor

Why?

  • Based on the recent benchmarks of storing MTs on disk vs. in memory, encoding time is still by far the dominant cost, which leaves room for (batched) disk operations.

  • The cache is not small. If I remember correctly, it is around 4 sector sizes, roughly:

nodes = sector_size / 32
expansion_parents = 8
total_entries = nodes * expansion_parents
entry_size = 8 bytes (using `u64`)
cache_size = total_entries * entry_size * 2 (two caches: forward and reverse)
----
cache_size = sector_size / 32 * 8 * 8 * 2
cache_size = sector_size * 4
  • The parents cache gives a considerable speed improvement. If it were enabled by default and stored on disk, it would have no major memory impact, and the disk penalty would be more than offset by the cache benefits.

  • Cache access pattern is extremely regular because by definition we encode sequentially (the only difference is the forward/reverse direction), so grouping entries in blocks (similar to MT generation) would minimize I/O.
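
The size estimate and the blocking idea above can be sketched in a few lines of Rust. This is only an illustration of the arithmetic in this issue: the constant and function names are made up for the example, and `nodes_per_block` is a hypothetical tuning parameter, not an existing setting.

```rust
/// Size of one node in bytes (from the estimate above).
const NODE_SIZE: u64 = 32;
/// Expansion parents per node (from the estimate above).
const EXPANSION_PARENTS: u64 = 8;
/// Bytes per cached entry: one `u64` index.
const ENTRY_SIZE: u64 = 8;
/// Two caches: forward and reverse.
const CACHE_COUNT: u64 = 2;

/// Estimated total parents cache size in bytes for a given sector size.
pub fn parents_cache_size(sector_size: u64) -> u64 {
    let nodes = sector_size / NODE_SIZE;
    let total_entries = nodes * EXPANSION_PARENTS;
    total_entries * ENTRY_SIZE * CACHE_COUNT
}

/// Block that holds the parents of `node` when `nodes_per_block`
/// consecutive nodes are grouped together (hypothetical layout).
pub fn block_of(node: u64, nodes_per_block: u64) -> u64 {
    node / nodes_per_block
}

/// Number of block reads for one sequential pass over `nodes` nodes,
/// assuming each block is loaded once and then fully consumed.
pub fn block_reads_for_pass(nodes: u64, nodes_per_block: u64) -> u64 {
    (nodes + nodes_per_block - 1) / nodes_per_block
}
```

With this layout a sequential pass (in either direction) touches each block exactly once, so the number of disk reads scales with the number of blocks rather than the number of nodes.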

How?

  • There could be an option (feature) to still keep it in RAM to optimize for speed in extreme cases if needed.

  • We can leverage the new DiskStore to reduce implementation time.

  • The disk-trees feature (soon to be made default) and whatever feature we end up using here should eventually converge into a simpler, more general profile option exposed to the user, conveying something along the lines of "optimize for memory at the cost of speed and disk usage".
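
One possible shape for the RAM vs. disk choice is a small trait with two backends. The sketch below is an assumption, not the existing code: `ParentsStore`, `MemStore`, and `FileStore` are hypothetical names, and plain `std::fs` stands in for DiskStore, whose API is not reproduced here.

```rust
use std::fs::{File, OpenOptions};
use std::io::{self, Read, Seek, SeekFrom, Write};
use std::path::Path;

/// Parents cached per node (hypothetical; matches the 8 expansion parents above).
pub const PARENTS: usize = 8;

/// Hypothetical abstraction over where the parents cache lives.
pub trait ParentsStore {
    fn read(&mut self, node: u64) -> io::Result<[u64; PARENTS]>;
}

/// In-memory backend: fastest, costs RAM.
pub struct MemStore {
    entries: Vec<[u64; PARENTS]>,
}

impl MemStore {
    pub fn new(entries: Vec<[u64; PARENTS]>) -> Self {
        MemStore { entries }
    }
}

impl ParentsStore for MemStore {
    fn read(&mut self, node: u64) -> io::Result<[u64; PARENTS]> {
        self.entries
            .get(node as usize)
            .copied()
            .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "node out of range"))
    }
}

/// File-backed backend: entries stored as fixed-size little-endian records.
pub struct FileStore {
    file: File,
}

impl FileStore {
    /// Writes all entries to `path` and keeps the file open for reads.
    pub fn create(path: &Path, entries: &[[u64; PARENTS]]) -> io::Result<Self> {
        let mut file = OpenOptions::new()
            .read(true)
            .write(true)
            .create(true)
            .truncate(true)
            .open(path)?;
        for entry in entries {
            for parent in entry {
                file.write_all(&parent.to_le_bytes())?;
            }
        }
        file.flush()?;
        Ok(FileStore { file })
    }
}

impl ParentsStore for FileStore {
    fn read(&mut self, node: u64) -> io::Result<[u64; PARENTS]> {
        // Each record is PARENTS * 8 bytes, so the offset is a simple multiple.
        let mut buf = [0u8; PARENTS * 8];
        self.file.seek(SeekFrom::Start(node * (PARENTS as u64) * 8))?;
        self.file.read_exact(&mut buf)?;
        let mut out = [0u64; PARENTS];
        for (i, chunk) in buf.chunks_exact(8).enumerate() {
            let mut bytes = [0u8; 8];
            bytes.copy_from_slice(chunk);
            out[i] = u64::from_le_bytes(bytes);
        }
        Ok(out)
    }
}
```

Callers would depend only on `ParentsStore`, so a feature flag or profile option could pick the backend without touching the encoding code. A real disk backend would read whole blocks rather than one record per call.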


If this turns out to be useful and the Pedersen cache (#697) also exhibits regular access patterns, something similar could be done there as well.

@dignifiedquire
Contributor

Please consider #847 carefully for this

@schomatis
Contributor Author

Yes, they are pretty much the same; sorry, I missed that issue.

Please coordinate with @DrPeterVanNostrand if you think that one should be implemented instead. The idea was to reduce the memory consumption of the parents to pave the way for #827.

@porcuquine
Collaborator

I think, at least by #1163, the spirit of this issue has been quite fulfilled.
