-
Notifications
You must be signed in to change notification settings - Fork 231
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Shared validator pubkey #5883
Shared validator pubkey #5883
Conversation
This PR allows sharing the pubkey data between validators by using a thread-local cache for pubkey data, netting about a 400mb mem usage reduction on holesky due to us keeping 3 permanent + several ephemeral state copies in memory at all times and each state copy holding a full validator. The PR also introduces a hash cache for the key which gives ~14% speedup for a full state `hash_tree_root` - the key makes up for a large part of the `Validator` htr time. Finally, the time it takes to copy a state goes down as well from ~80m ms to ~60, for reasons similar to htr. We use a `ptr` even if a `ref` could in theory have been used - there is not much practical benefit to a `ref` (given it's mutable) while a `ptr` is cheaper and easier to copy (when copying temporary states). We could go further and cache a cooked pubkey but it turns out this is quite intrusive - in all the relevant places, we're already using a cooked key from the immutable validator data so there are no immediate performance gains of doing so while managing the compressed -> cooked key mapping would become more difficult - something for a future PR perhaps.
draft until post-deneb |
#5887 might let this build |
# This should never happen but we guard against it in case a | ||
# default-initialized Validator instance makes it through the other safety | ||
# nets |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wondering if in such a hypothetical scenario, a log + exit wouldn't be more suitable. it starts injecting nasty bugs into surrounding logic
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
using a zero-inited pubkey is the current behavior in this scenario (it's used in test cases)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't it then return the correct hash_tree_root
of the zero validator instead of zero hash?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
zero initialized validator comment still applies, but if we can ensure that those are only existing in tests, and don't get triggered when someone actually makes a deposit to a zero validator, seems alright
Co-authored-by: Etan Kissling <etan@status.im>
The current implementation of the validator key cache as introduced in #5883 leads to issues when compiling with `--gc:arc`. Namely, the assert in `injectdestructors.nim` > `destructiveMoveVar` is triggered: ```nim assert n.kind != nkSym or not hasDestructor(c, n.sym.typ) ``` `cached == nkSym`, and `n.sym.typ == ref HashedValidatorPubKeyItem` with `hasDestructor(c, n.sym.typ) == true`. Inlining the `addr ...[]` avoids the problem and allows `--gc:arc` compilation as part of LC wasm demo project. Compilation command: ```sh nim c \ -d:disable_libbacktrace \ -d:disableMarchNative \ -d:disableLTO \ -d:emscripten \ -d:release \ -d:useGcAssert \ -d:useSysAssert \ --debuginfo:off \ --nimcache:nimcache \ --os:linux \ --cpu:wasm32 \ --cc:clang \ --clang.exe:emcc \ --clang.linkerexe:emcc \ --gc:arc \ --exceptions:goto \ --define:noSignalHandler \ --define:danger \ --panics:on \ --passC:-fpic \ --passL:-Os \ --passL:-fpic \ --passC:'-pthread' \ --passL:'-pthread' \ --passC:'-sASSERTIONS' \ --passL:'-sASSERTIONS' \ --passC:'-sINITIAL_MEMORY=256MB' \ --passL:'-sINITIAL_MEMORY=256MB' \ --passC:'-sSTACK_SIZE=128MB' \ --passL:'-sSTACK_SIZE=128MB' \ --passC:'-sUSE_PTHREADS=1' \ --passL:'-sUSE_PTHREADS=1' \ --passC:'-sPTHREAD_POOL_SIZE_STRICT=0' \ --passL:'-sPTHREAD_POOL_SIZE_STRICT=0' \ --passL:'-sEXPORTED_FUNCTIONS="[_free, _malloc, _NimMain, _ETHRandomNumberCreate, _ETHConsensusConfigCreateFromYaml, _ETHConsensusConfigGetConsensusVersionAtEpoch, _ETHBeaconStateCreateFromSsz, _ETHBeaconStateDestroy, _ETHBeaconStateCopyGenesisValidatorsRoot, _ETHRootDestroy, _ETHForkDigestsCreateFromState, _ETHBeaconClockCreateFromState, _ETHBeaconClockGetSlot, _ETHLightClientStoreCreateFromBootstrap, _ETHLightClientStoreDestroy, _kETHLcSyncKind_UpdatesByRange, _kETHLcSyncKind_FinalityUpdate, _kETHLcSyncKind_OptimisticUpdate, _ETHLightClientStoreGetNextSyncTask, _ETHLightClientStoreGetMillisecondsToNextSyncTask, _ETHLightClientStoreProcessUpdatesByRange, _ETHLightClientStoreProcessFinalityUpdate, _ETHLightClientStoreProcessOptimisticUpdate, _ETHLightClientStoreGetFinalizedHeader, _ETHLightClientStoreIsNextSyncCommitteeKnown, _ETHLightClientStoreGetOptimisticHeader, _ETHLightClientStoreGetSafetyThreshold, _ETHLightClientHeaderCreateCopy, _ETHLightClientHeaderDestroy, _ETHLightClientHeaderCopyBeaconRoot, _ETHLightClientHeaderGetBeacon, _ETHBeaconBlockHeaderGetSlot, _ETHBeaconBlockHeaderGetProposerIndex, _ETHBeaconBlockHeaderGetParentRoot, _ETHBeaconBlockHeaderGetStateRoot, _ETHBeaconBlockHeaderGetBodyRoot, _ETHLightClientHeaderCopyExecutionHash, _ETHLightClientHeaderGetExecution, _ETHExecutionPayloadHeaderGetParentHash, _ETHExecutionPayloadHeaderGetFeeRecipient, _ETHExecutionPayloadHeaderGetStateRoot, _ETHExecutionPayloadHeaderGetReceiptsRoot, _ETHExecutionPayloadHeaderGetLogsBloom, _ETHExecutionPayloadHeaderGetPrevRandao, _ETHExecutionPayloadHeaderGetBlockNumber, _ETHExecutionPayloadHeaderGetGasLimit, _ETHExecutionPayloadHeaderGetGasUsed, _ETHExecutionPayloadHeaderGetTimestamp, _ETHExecutionPayloadHeaderGetExtraDataBytes, _ETHExecutionPayloadHeaderGetBaseFeePerGas, _ETHExecutionPayloadHeaderGetBlobGasUsed, _ETHExecutionPayloadHeaderGetExcessBlobGas, _ETHExecutionBlockHeaderCreateFromJson, _ETHExecutionBlockHeaderDestroy, _ETHExecutionBlockHeaderGetTransactionsRoot, _ETHExecutionBlockHeaderGetWithdrawalsRoot, _ETHTransactionsCreateFromJson, _ETHTransactionsDestroy, _ETHTransactionsGetCount, _ETHTransactionsGet, _ETHTransactionGetHash, _ETHTransactionGetFrom, _ETHTransactionGetNonce, _ETHTransactionGetMaxPriorityFeePerGas, _ETHTransactionGetMaxFeePerGas, _ETHTransactionGetGas, _ETHTransactionIsCreatingContract, _ETHTransactionGetTo, _ETHTransactionGetValue, _ETHTransactionGetInputBytes, _ETHTransactionGetBytes, _ETHTransactionGetEip6493Root, _ETHTransactionGetEip6493Bytes, _ETHTransactionGetNumEip6493SnappyBytes, _ETHReceiptsCreateFromJson, _ETHReceiptsDestroy, _ETHReceiptsGet, _ETHReceiptHasStatus, _ETHReceiptGetBytes, _ETHReceiptGetEip6493Bytes, _ETHReceiptGetNumEip6493SnappyBytes]"' \ --passL:'-sEXPORTED_RUNTIME_METHODS="[lengthBytesUTF8, stringToNewUTF8]"' \ --passL:'-Wl,--no-entry' \ --noMain:on \ --passL:'-o libnimbus_lc.js' \ nimbus-eth2/beacon_chain/libnimbus_lc/libnimbus_lc.nim ```
The current implementation of the validator key cache as introduced in #5883 leads to issues when compiling with `--gc:arc`. Namely, the assert in `injectdestructors.nim` > `destructiveMoveVar` is triggered: ```nim assert n.kind != nkSym or not hasDestructor(c, n.sym.typ) ``` `cached == nkSym`, and `n.sym.typ == ref HashedValidatorPubKeyItem` with `hasDestructor(c, n.sym.typ) == true`. Inlining the `addr ...[]` avoids the problem and allows `--gc:arc` compilation as part of LC wasm demo project. Compilation command: ```sh nim c \ -d:disable_libbacktrace \ -d:disableMarchNative \ -d:disableLTO \ -d:emscripten \ -d:release \ -d:useGcAssert \ -d:useSysAssert \ --debuginfo:off \ --nimcache:nimcache \ --os:linux \ --cpu:wasm32 \ --cc:clang \ --clang.exe:emcc \ --clang.linkerexe:emcc \ --gc:arc \ --exceptions:goto \ --define:noSignalHandler \ --define:danger \ --panics:on \ --passC:-fpic \ --passL:-Os \ --passL:-fpic \ --passC:'-pthread' \ --passL:'-pthread' \ --passC:'-sASSERTIONS' \ --passL:'-sASSERTIONS' \ --passC:'-sINITIAL_MEMORY=256MB' \ --passL:'-sINITIAL_MEMORY=256MB' \ --passC:'-sSTACK_SIZE=128MB' \ --passL:'-sSTACK_SIZE=128MB' \ --passC:'-sUSE_PTHREADS=1' \ --passL:'-sUSE_PTHREADS=1' \ --passC:'-sPTHREAD_POOL_SIZE_STRICT=0' \ --passL:'-sPTHREAD_POOL_SIZE_STRICT=0' \ --passL:'-sEXPORTED_FUNCTIONS="[_free, _malloc, _NimMain, _ETHRandomNumberCreate, _ETHConsensusConfigCreateFromYaml, _ETHConsensusConfigGetConsensusVersionAtEpoch, _ETHBeaconStateCreateFromSsz, _ETHBeaconStateDestroy, _ETHBeaconStateCopyGenesisValidatorsRoot, _ETHRootDestroy, _ETHForkDigestsCreateFromState, _ETHBeaconClockCreateFromState, _ETHBeaconClockGetSlot, _ETHLightClientStoreCreateFromBootstrap, _ETHLightClientStoreDestroy, _kETHLcSyncKind_UpdatesByRange, _kETHLcSyncKind_FinalityUpdate, _kETHLcSyncKind_OptimisticUpdate, _ETHLightClientStoreGetNextSyncTask, _ETHLightClientStoreGetMillisecondsToNextSyncTask, _ETHLightClientStoreProcessUpdatesByRange, _ETHLightClientStoreProcessFinalityUpdate, _ETHLightClientStoreProcessOptimisticUpdate, _ETHLightClientStoreGetFinalizedHeader, _ETHLightClientStoreIsNextSyncCommitteeKnown, _ETHLightClientStoreGetOptimisticHeader, _ETHLightClientStoreGetSafetyThreshold, _ETHLightClientHeaderCreateCopy, _ETHLightClientHeaderDestroy, _ETHLightClientHeaderCopyBeaconRoot, _ETHLightClientHeaderGetBeacon, _ETHBeaconBlockHeaderGetSlot, _ETHBeaconBlockHeaderGetProposerIndex, _ETHBeaconBlockHeaderGetParentRoot, _ETHBeaconBlockHeaderGetStateRoot, _ETHBeaconBlockHeaderGetBodyRoot, _ETHLightClientHeaderCopyExecutionHash, _ETHLightClientHeaderGetExecution, _ETHExecutionPayloadHeaderGetParentHash, _ETHExecutionPayloadHeaderGetFeeRecipient, _ETHExecutionPayloadHeaderGetStateRoot, _ETHExecutionPayloadHeaderGetReceiptsRoot, _ETHExecutionPayloadHeaderGetLogsBloom, _ETHExecutionPayloadHeaderGetPrevRandao, _ETHExecutionPayloadHeaderGetBlockNumber, _ETHExecutionPayloadHeaderGetGasLimit, _ETHExecutionPayloadHeaderGetGasUsed, _ETHExecutionPayloadHeaderGetTimestamp, _ETHExecutionPayloadHeaderGetExtraDataBytes, _ETHExecutionPayloadHeaderGetBaseFeePerGas, _ETHExecutionPayloadHeaderGetBlobGasUsed, _ETHExecutionPayloadHeaderGetExcessBlobGas, _ETHExecutionBlockHeaderCreateFromJson, _ETHExecutionBlockHeaderDestroy, _ETHExecutionBlockHeaderGetTransactionsRoot, _ETHExecutionBlockHeaderGetWithdrawalsRoot, _ETHTransactionsCreateFromJson, _ETHTransactionsDestroy, _ETHTransactionsGetCount, _ETHTransactionsGet, _ETHTransactionGetHash, _ETHTransactionGetFrom, _ETHTransactionGetNonce, _ETHTransactionGetMaxPriorityFeePerGas, _ETHTransactionGetMaxFeePerGas, _ETHTransactionGetGas, _ETHTransactionIsCreatingContract, _ETHTransactionGetTo, _ETHTransactionGetValue, _ETHTransactionGetInputBytes, _ETHTransactionGetBytes, _ETHTransactionGetEip6493Root, _ETHTransactionGetEip6493Bytes, _ETHTransactionGetNumEip6493SnappyBytes, _ETHReceiptsCreateFromJson, _ETHReceiptsDestroy, _ETHReceiptsGet, _ETHReceiptHasStatus, _ETHReceiptGetBytes, _ETHReceiptGetEip6493Bytes, _ETHReceiptGetNumEip6493SnappyBytes]"' \ --passL:'-sEXPORTED_RUNTIME_METHODS="[lengthBytesUTF8, stringToNewUTF8]"' \ --passL:'-Wl,--no-entry' \ --noMain:on \ --passL:'-o libnimbus_lc.js' \ nimbus-eth2/beacon_chain/libnimbus_lc/libnimbus_lc.nim ```
This PR allows sharing the pubkey data between validators by using a thread-local cache for pubkey data, netting about a 400mb mem usage reduction on holesky due to us keeping 3 permanent + several ephemeral state copies in memory at all times and each state copy holding a full validator set.
The PR also introduces a hash cache for the key which gives ~14% speedup for a full state
hash_tree_root
- the key makes up for a large part of theValidator
htr time.Finally, the time it takes to copy a state goes down as well from ~80m ms to ~60, for reasons similar to htr.
We use a
ptr
even if aref
could in theory have been used - there is not much practical benefit to aref
(given it's mutable) while aptr
is cheaper and easier to copy (when copying temporary states).We could go further and cache a cooked pubkey but it turns out this is quite intrusive - in all the relevant places, we're already using a cooked key from the immutable validator data so there are no immediate performance gains of doing so while managing the compressed -> cooked key mapping would become more difficult - something for a future PR perhaps.
./ncli --print-times hashTreeRoot capella_state state-8028127-81e89e6c-a37fb323.ssz a37fb32350d0d8ef9ae665e28036ec9843dd9c8a9b520acb6b68c8065bec5191 All time are ms Average, StdDev, Min, Max, Samples, Test Validation is turned off meaning that no BLS operations are performed 118.682, 0.000, 118.682, 118.682, 1, Load file 1000.351, 0.000, 1000.351, 1000.351, 1, Compute
./ncli --print-times hashTreeRoot capella_state state-8028127-81e89e6c-a37fb323.ssz a37fb32350d0d8ef9ae665e28036ec9843dd9c8a9b520acb6b68c8065bec5191 All time are ms Average, StdDev, Min, Max, Samples, Test Validation is turned off meaning that no BLS operations are performed 420.763, 0.000, 420.763, 420.763, 1, Load file 870.934, 0.000, 870.934, 870.934, 1, Compute