Shared validator pubkey #5883

arnetheduck · 2024-02-12T18:04:49Z

This PR allows sharing the pubkey data between validators by using a thread-local cache for pubkey data, netting about a 400 MB reduction in memory usage on Holesky: we keep 3 permanent plus several ephemeral state copies in memory at all times, and each state copy holds a full validator set.

The PR also introduces a hash cache for the key, which gives a ~14% speedup for a full state `hash_tree_root` - the key makes up a large part of the `Validator` htr time.

Finally, the time it takes to copy a state goes down as well, from ~80 ms to ~60 ms, for reasons similar to htr.

We use a `ptr` even if a `ref` could in theory have been used - there is not much practical benefit to a `ref` (given it's mutable), while a `ptr` is cheaper and easier to copy (when copying temporary states).

We could go further and cache a cooked pubkey, but it turns out this is quite intrusive - in all the relevant places, we're already using a cooked key from the immutable validator data, so there are no immediate performance gains from doing so, while managing the compressed -> cooked key mapping would become more difficult - something for a future PR perhaps.
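
To make the idea above concrete, here is a minimal sketch of a thread-local pubkey cache that hands out a `ptr` to a shared item carrying a cached hash. It is an illustration under assumptions, not the code in this PR: `PubKeyItem`, `SharedPubKey`, `share` and `placeholderRoot` are hypothetical names, the "root" is a dependency-free placeholder rather than the SSZ `hash_tree_root`, and nothing is ever evicted from the cache.

```nim
import std/tables

type
  PubKeyBytes = array[48, byte]      # compressed BLS public key
  Root = array[32, byte]

  PubKeyItem = object                # hypothetical cache entry
    key: PubKeyBytes
    root: Root                       # hash of `key`, computed once

  SharedPubKey = object              # hypothetical per-validator handle
    item: ptr PubKeyItem             # one pointer instead of 48 key bytes

# One cache per thread. The `ref`s stored here keep every item alive, so the
# raw `ptr`s handed out below stay valid for as long as the thread runs.
var pubKeyCache {.threadvar.}: Table[PubKeyBytes, ref PubKeyItem]

var rootComputations = 0             # demo-only counter

proc placeholderRoot(key: PubKeyBytes): Root =
  ## Stand-in for the real hash: it only copies the first 32 bytes of the key
  ## so that the sketch needs no hashing library.
  inc rootComputations
  for i in 0 ..< result.len:
    result[i] = key[i]

proc share(key: PubKeyBytes): SharedPubKey =
  ## Returns a handle to the single cached copy of `key` on this thread.
  if key notin pubKeyCache:
    pubKeyCache[key] = (ref PubKeyItem)(key: key, root: placeholderRoot(key))
  SharedPubKey(item: addr pubKeyCache[key][])

when isMainModule:
  var key: PubKeyBytes
  key[0] = 0xab'u8

  let
    a = share(key)   # first use: allocates the item and hashes the key
    b = share(key)   # second use: cache hit, no allocation, no hashing
    c = b            # copying a handle copies a single pointer

  doAssert a.item == b.item and b.item == c.item
  doAssert rootComputations == 1
  doAssert a.item.key == key
  echo sizeof(SharedPubKey), " bytes per handle"
```

In this sketch every state copy that contains the same validator ends up pointing at the same item, which is where both the memory saving and the cheaper state copy would come from; the cached `root` is what lets a repeated hash skip the key entirely.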

 ./ncli --print-times hashTreeRoot capella_state state-8028127-81e89e6c-a37fb323.ssz 
a37fb32350d0d8ef9ae665e28036ec9843dd9c8a9b520acb6b68c8065bec5191
All time are ms
     Average,       StdDev,          Min,          Max,      Samples,         Test
Validation is turned off meaning that no BLS operations are performed
     118.682,        0.000,      118.682,      118.682,            1, Load file
    1000.351,        0.000,     1000.351,     1000.351,            1, Compute

./ncli --print-times hashTreeRoot capella_state state-8028127-81e89e6c-a37fb323.ssz 
a37fb32350d0d8ef9ae665e28036ec9843dd9c8a9b520acb6b68c8065bec5191
All time are ms
     Average,       StdDev,          Min,          Max,      Samples,         Test
Validation is turned off meaning that no BLS operations are performed
     420.763,        0.000,      420.763,      420.763,            1, Load file
     870.934,        0.000,      870.934,      870.934,            1, Compute
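
For reference, the two `Compute` timings above can be turned into a relative figure. This assumes the first run is the baseline and the second is with this PR applied (the output itself does not label them), and each is a single sample:

$$
\frac{1000.351}{870.934} \approx 1.149, \qquad \frac{1000.351 - 870.934}{1000.351} \approx 0.129
$$

That is roughly 15% more hashing throughput, or about 13% less time per `hash_tree_root`, which is in line with the ~14% quoted in the description.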

arnetheduck · 2024-02-12T18:15:59Z

draft until post-deneb

github-actions · 2024-02-12T19:13:03Z

Unit Test Results

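9 files ±0, 1 107 suites ±0, duration 25m 12s ⏱️ (+24s)

| | Total | ✔️ passed | 💤 skipped | ❌ failed |
|---|---|---|---|---|
| tests | 4 233 (+1) | 3 886 (+1) | 347 (±0) | 0 (±0) |
| runs | 16 894 (+3) | 16 496 (+3) | 398 (±0) | 0 (±0) |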

Results for commit 1a980a2. ± Comparison against base commit 74eeb0b.

♻️ This comment has been updated with latest results.

tersec · 2024-02-13T17:39:45Z

#5887 might let this build

etan-status · 2024-02-21T11:00:41Z

beacon_chain/spec/datatypes/base.nim

+    # This should never happen but we guard against it in case a
+    # default-initialized Validator instance makes it through the other safety
+    # nets


Wondering if, in such a hypothetical scenario, a log + exit wouldn't be more suitable. Otherwise it starts injecting nasty bugs into the surrounding logic.

arnetheduck · 2024-02-21

Using a zero-initialized pubkey is the current behavior in this scenario (it's used in test cases).

etan-status · 2024-02-21

Shouldn't it then return the correct `hash_tree_root` of the zero validator instead of the zero hash?

beacon_chain/spec/eth2_merkleization.nim

etan-status

The zero-initialized validator comment still applies, but if we can ensure that those exist only in tests and don't get triggered when someone actually makes a deposit to a zero validator, this seems alright.

From the follow-up PR #6203 (avoid `--gc:arc` issue in validator key caching):

The current implementation of the validator key cache as introduced in #5883 leads to issues when compiling with `--gc:arc`. Namely, the assert in `injectdestructors.nim` > `destructiveMoveVar` is triggered:

```nim
assert n.kind != nkSym or not hasDestructor(c, n.sym.typ)
```

`cached == nkSym`, and `n.sym.typ == ref HashedValidatorPubKeyItem` with `hasDestructor(c, n.sym.typ) == true`.

Inlining the `addr ...[]` avoids the problem and allows `--gc:arc` compilation as part of the LC wasm demo project.

Compilation command:

```sh
nim c \
    -d:disable_libbacktrace \
    -d:disableMarchNative \
    -d:disableLTO \
    -d:emscripten \
    -d:release \
    -d:useGcAssert \
    -d:useSysAssert \
    --debuginfo:off \
    --nimcache:nimcache \
    --os:linux \
    --cpu:wasm32 \
    --cc:clang \
    --clang.exe:emcc \
    --clang.linkerexe:emcc \
    --gc:arc \
    --exceptions:goto \
    --define:noSignalHandler \
    --define:danger \
    --panics:on \
    --passC:-fpic \
    --passL:-Os \
    --passL:-fpic \
    --passC:'-pthread' \
    --passL:'-pthread' \
    --passC:'-sASSERTIONS' \
    --passL:'-sASSERTIONS' \
    --passC:'-sINITIAL_MEMORY=256MB' \
    --passL:'-sINITIAL_MEMORY=256MB' \
    --passC:'-sSTACK_SIZE=128MB' \
    --passL:'-sSTACK_SIZE=128MB' \
    --passC:'-sUSE_PTHREADS=1' \
    --passL:'-sUSE_PTHREADS=1' \
    --passC:'-sPTHREAD_POOL_SIZE_STRICT=0' \
    --passL:'-sPTHREAD_POOL_SIZE_STRICT=0' \
    --passL:'-sEXPORTED_FUNCTIONS="[_free, _malloc, _NimMain, _ETHRandomNumberCreate, _ETHConsensusConfigCreateFromYaml, _ETHConsensusConfigGetConsensusVersionAtEpoch, _ETHBeaconStateCreateFromSsz, _ETHBeaconStateDestroy, _ETHBeaconStateCopyGenesisValidatorsRoot, _ETHRootDestroy, _ETHForkDigestsCreateFromState, _ETHBeaconClockCreateFromState, _ETHBeaconClockGetSlot, _ETHLightClientStoreCreateFromBootstrap, _ETHLightClientStoreDestroy, _kETHLcSyncKind_UpdatesByRange, _kETHLcSyncKind_FinalityUpdate, _kETHLcSyncKind_OptimisticUpdate, _ETHLightClientStoreGetNextSyncTask, _ETHLightClientStoreGetMillisecondsToNextSyncTask, _ETHLightClientStoreProcessUpdatesByRange, _ETHLightClientStoreProcessFinalityUpdate, _ETHLightClientStoreProcessOptimisticUpdate, _ETHLightClientStoreGetFinalizedHeader, _ETHLightClientStoreIsNextSyncCommitteeKnown, _ETHLightClientStoreGetOptimisticHeader, _ETHLightClientStoreGetSafetyThreshold, _ETHLightClientHeaderCreateCopy, _ETHLightClientHeaderDestroy, _ETHLightClientHeaderCopyBeaconRoot, _ETHLightClientHeaderGetBeacon, _ETHBeaconBlockHeaderGetSlot, _ETHBeaconBlockHeaderGetProposerIndex, _ETHBeaconBlockHeaderGetParentRoot, _ETHBeaconBlockHeaderGetStateRoot, _ETHBeaconBlockHeaderGetBodyRoot, _ETHLightClientHeaderCopyExecutionHash, _ETHLightClientHeaderGetExecution, _ETHExecutionPayloadHeaderGetParentHash, _ETHExecutionPayloadHeaderGetFeeRecipient, _ETHExecutionPayloadHeaderGetStateRoot, _ETHExecutionPayloadHeaderGetReceiptsRoot, _ETHExecutionPayloadHeaderGetLogsBloom, _ETHExecutionPayloadHeaderGetPrevRandao, _ETHExecutionPayloadHeaderGetBlockNumber, _ETHExecutionPayloadHeaderGetGasLimit, _ETHExecutionPayloadHeaderGetGasUsed, _ETHExecutionPayloadHeaderGetTimestamp, _ETHExecutionPayloadHeaderGetExtraDataBytes, _ETHExecutionPayloadHeaderGetBaseFeePerGas, _ETHExecutionPayloadHeaderGetBlobGasUsed, _ETHExecutionPayloadHeaderGetExcessBlobGas, _ETHExecutionBlockHeaderCreateFromJson, _ETHExecutionBlockHeaderDestroy, _ETHExecutionBlockHeaderGetTransactionsRoot, _ETHExecutionBlockHeaderGetWithdrawalsRoot, _ETHTransactionsCreateFromJson, _ETHTransactionsDestroy, _ETHTransactionsGetCount, _ETHTransactionsGet, _ETHTransactionGetHash, _ETHTransactionGetFrom, _ETHTransactionGetNonce, _ETHTransactionGetMaxPriorityFeePerGas, _ETHTransactionGetMaxFeePerGas, _ETHTransactionGetGas, _ETHTransactionIsCreatingContract, _ETHTransactionGetTo, _ETHTransactionGetValue, _ETHTransactionGetInputBytes, _ETHTransactionGetBytes, _ETHTransactionGetEip6493Root, _ETHTransactionGetEip6493Bytes, _ETHTransactionGetNumEip6493SnappyBytes, _ETHReceiptsCreateFromJson, _ETHReceiptsDestroy, _ETHReceiptsGet, _ETHReceiptHasStatus, _ETHReceiptGetBytes, _ETHReceiptGetEip6493Bytes, _ETHReceiptGetNumEip6493SnappyBytes]"' \
    --passL:'-sEXPORTED_RUNTIME_METHODS="[lengthBytesUTF8, stringToNewUTF8]"' \
    --passL:'-Wl,--no-entry' \
    --noMain:on \
    --passL:'-o libnimbus_lc.js' \
    nimbus-eth2/beacon_chain/libnimbus_lc/libnimbus_lc.nim
```

arnetheduck marked this pull request as draft February 12, 2024 18:15

arnetheduck added 6 commits February 14, 2024 13:12

- 7b37fd2 Merge branch 'unstable' into shared-pubkey
- 641dede Merge remote-tracking branch 'origin/unstable' into shared-pubkey
- c964803 fix diff / tests
- d92f872 bump
- 504a0ec avoid stew/net utils
- e577428 bump

arnetheduck marked this pull request as ready for review February 21, 2024 10:41

etan-status reviewed Feb 21, 2024

etan-status reviewed Feb 21, 2024

etan-status mentioned this pull request Feb 21, 2024

bump nim-json-serialization to c869dae884336e1bca134ccb8ea1a37517d16a29 #5929

Closed

fix validator serialization

be3ff0d

etan-status approved these changes Feb 21, 2024

arnetheduck and others added 3 commits February 21, 2024 16:20

- f66c288 empty pubkey htr
- f448766 Update beacon_chain/spec/eth2_merkleization.nim (Co-authored-by: Etan Kissling <etan@status.im>)
- 1a980a2 re-generate test report

arnetheduck enabled auto-merge (squash) February 21, 2024 15:57

arnetheduck merged commit 1ef7d23 into unstable Feb 21, 2024
13 checks passed

arnetheduck deleted the shared-pubkey branch February 21, 2024 19:06

etan-status mentioned this pull request Apr 15, 2024

avoid --gc:arc issue in validator key caching #6203

Merged
