Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Running out of memory on BeaconStateBellatrix GetTree #119

Closed
claravanstaden opened this issue Jan 12, 2023 · 12 comments
Closed

Running out of memory on BeaconStateBellatrix GetTree #119

claravanstaden opened this issue Jan 12, 2023 · 12 comments

Comments

@claravanstaden
Copy link
Contributor

I am using this lib to get multiproofs for block_roots in the BeaconState. I download the BeaconState from Prysm, unmarshal the ssz bytes into a BeaconStateBellatrix object, then call GetTree on the object in preparation to get the proof. I keep running out of memory when calling GetTree.

I have made a demo repo to illustrate the issue: https://github.com/claravanstaden/beacon-state-scratchpad The beacon state as download from Prysm is included as a file.

Please let me know if I am doing something wrong?

@claravanstaden
Copy link
Contributor Author

It might be worthwhile that the unmarshalling is very quick and completes successfully, but converting the BeaconStateBellatrix object to a tree seems problematic.

@ferranbt
Copy link
Owner

Hi! I have seen this before.

The part of the trees is not part of the high-performance functions of the library. I think the issue is that with max lists it has to allocate nodes for the max number of entries. When max is big, the number of tree nodes gets too big to allocate the memory. Note that this happens even if the list itself has only one item.

I am not sure there is an immediate solution for that. I did not build that part myself (cc @s1na).

What is your use case?

@claravanstaden
Copy link
Contributor Author

I've noticed that (on my machine) it completes about hash tree root operations for about 140k validators about of the 400k. So it actually starts hashing the validator set correctly but doesn't complete. Here's the link to where it happens: https://github.com/ferranbt/fastssz/blob/main/spectests/structs_encoding.go#L6076

I am trying to get ancestry proofs for a light client. The light client gets a finalized beacon header update and verifies it. After that, it receives an ancestor of the finalized block and I need to prove that the ancestor block is in fact a grandparent of the finalized block. I want to do this by checking the block_roots field in the BeaconState. So to get the BeaconState proof, I think I need to call GetTree and then 'ProveMulti(with theblock_root generalized` indices that I need to prove)

@ferranbt
Copy link
Owner

I tried the demo. It did hash on my side with HashTreeRoot without problems.

For GetTree, it does run out of memory here. In this example, the validators field of the BeaconStateBellatrix has as max number of items 1099511627776. Go cannot pre-allocate an array that big even when it is half empty (only 399333 validators in the example).

@claravanstaden
Copy link
Contributor Author

@ferranbt thanks for testing the repo out! 😄

Ok, that makes sense. Maybe I ran into memory issues in a different place because of my local config. I extended Go's max memory limit by changing the GOMEMLIMIT var.

But in any case, I wonder what the solution something like this should be. Maybe there's a different way to get the proof without creating the whole tree. Prysm must have around this issue in some way.

@ferranbt
Copy link
Owner

As far as I know, Prysm does not return the proofs.

I am unsure what the solution is, I do not have extensive experience in the proving part to think about an efficient architecture. Let me try to ping @s1na offline to see if he can bring some feedback.

@claravanstaden
Copy link
Contributor Author

@ferranbt appreciated the help! 😄

Maybe Prysm will support in the future, with beacon API requests like these: ethereum/beacon-APIs#267

@s1na
Copy link
Contributor

s1na commented Jan 19, 2023

Go cannot pre-allocate an array that big even when it is half empty (only 399333 validators in the example).

Is the problem only pre-allocating a large enough block in one piece? we might be able to go around that with some modifications. However if the tree is so big it doesnt fit in memory then basically the tree logic needs to be re-written to use disk.

What I have in mind for constructing the tree is breaking it into smaller subtrees (we can set a limit for the size of a tree that can be constructed in one piece) and then link those subtries together.

@claravanstaden
Copy link
Contributor Author

@s1na maybe the subtree option is the more future-proof solution, either way.

@ferranbt
Copy link
Owner

ferranbt commented Feb 6, 2023

Here is another idea. Use the HashTreeRootWith and the ssz.HashWalker interface to create an implementation that takes as input an ssz object and outputs a fieldsRoot object like the one generated by Prysm here.

@claravanstaden
Copy link
Contributor Author

@ferranbt that would also work! I've been working on a optimised version of the tree.go package and it seems to have resolved the out-of-memory problem. Will open a PR today. 😄

@ferranbt
Copy link
Owner

ferranbt commented Feb 7, 2023

Fixed in #120

@ferranbt ferranbt closed this as completed Feb 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants