Use memory efficient toHex in pubkey2index map #3561

Merged
merged 4 commits into master from dapplion/bytes-hex-memory
Jan 4, 2022

Conversation

dapplion
Contributor

@dapplion dapplion commented Jan 3, 2022

Motivation

The benchmarks in #3446 show that concatenated strings take significantly more memory than strings produced with Buffer.toString(). Node.js's Buffer uses C++ bindings to produce hex strings, which explains the difference.
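As a rough illustration of the two approaches being compared, here is a minimal sketch. The function names `toHexConcat` and `toHexBuffer` are hypothetical and are not taken from the codebase; the first builds the string by repeated concatenation, the second delegates to `Buffer.toString("hex")`. Both return the same characters, so any difference is only in how the engine stores the resulting string.

```typescript
// Hex alphabet lookup used by the concatenation-based variant.
const HEX_CHARS = "0123456789abcdef";

/** Hypothetical helper: builds the hex string by appending two characters per byte. */
function toHexConcat(bytes: Uint8Array): string {
  let hex = "";
  for (let i = 0; i < bytes.length; i++) {
    const byte = bytes[i];
    hex += HEX_CHARS[byte >> 4] + HEX_CHARS[byte & 0x0f];
  }
  return hex;
}

/** Hypothetical helper: lets Node.js produce the hex string in a single native call. */
function toHexBuffer(bytes: Uint8Array): string {
  // Buffer.from(arrayBuffer, byteOffset, length) creates a view over the same memory, without copying.
  return Buffer.from(bytes.buffer, bytes.byteOffset, bytes.byteLength).toString("hex");
}

const sampleBytes = new Uint8Array([0x00, 0x0f, 0xa0, 0xff]);
console.log(toHexConcat(sampleBytes) === toHexBuffer(sampleBytes)); // prints true
```

Neither helper adds a `0x` prefix here; whether the real map keys carry one is not stated in this description.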

For a network with 250_000 validators, the concatenated strings for the pubkeys alone take 1400 * 250_000 = 350 MB.
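The arithmetic above can be checked with a small sketch. It assumes that 1400 is the measured cost in bytes of one concatenated pubkey string and that 1 MB means 10^6 bytes; the function name `estimateKeyMemoryMB` is hypothetical.

```typescript
// Assumption: decimal megabytes (10^6 bytes), which matches the 350 MB figure.
const BYTES_PER_MB = 1_000_000;

/** Hypothetical helper: total memory in MB for `keyCount` keys of `bytesPerKey` bytes each. */
function estimateKeyMemoryMB(bytesPerKey: number, keyCount: number): number {
  return (bytesPerKey * keyCount) / BYTES_PER_MB;
}

console.log(estimateKeyMemoryMB(1400, 250_000)); // prints 350
```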

Description

  • Add memory and performance benchmarks for concatenated strings and ES6 Maps with BLS pubkeys
  • Use Buffer.toString("hex") to create the keys of pubkey2index Map

This PR should be followed by a general replacement of toHexString with this memory-efficient approach. However, this PR makes only this minimal change first, since pubkey2index accounts for the bulk of the memory used by concatenated strings.

@codecov

codecov bot commented Jan 3, 2022

Codecov Report

Merging #3561 (f5b920c) into master (f49aa8e) will not change coverage.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master    #3561   +/-   ##
=======================================
  Coverage   37.47%   37.47%           
=======================================
  Files         311      311           
  Lines        8291     8291           
  Branches     1282     1282           
=======================================
  Hits         3107     3107           
  Misses       5036     5036           
  Partials      148      148           

@codeclimate

codeclimate bot commented Jan 3, 2022

Code Climate has analyzed commit f5b920c and detected 0 issues on this pull request.

View more on Code Climate.

@github-actions
Contributor

github-actions bot commented Jan 3, 2022

Performance Report

✔️ no performance regression detected

Full benchmark results
Benchmark suite Current: 77cc141 Previous: f49aa8e Ratio
BeaconState.hashTreeRoot - No change 548.00 ns/op 479.00 ns/op 1.14
BeaconState.hashTreeRoot - 1 full validator 117.37 us/op 120.04 us/op 0.98
BeaconState.hashTreeRoot - 32 full validator 1.9672 ms/op 1.9227 ms/op 1.02
BeaconState.hashTreeRoot - 512 full validator 25.514 ms/op 26.186 ms/op 0.97
BeaconState.hashTreeRoot - 1 validator.effectiveBalance 116.50 us/op 120.80 us/op 0.96
BeaconState.hashTreeRoot - 32 validator.effectiveBalance 1.9537 ms/op 2.0165 ms/op 0.97
BeaconState.hashTreeRoot - 512 validator.effectiveBalance 26.475 ms/op 26.156 ms/op 1.01
BeaconState.hashTreeRoot - 1 balances 85.356 us/op 85.885 us/op 0.99
BeaconState.hashTreeRoot - 32 balances 679.59 us/op 656.32 us/op 1.04
BeaconState.hashTreeRoot - 512 balances 7.2863 ms/op 7.1242 ms/op 1.02
BeaconState.hashTreeRoot - 250000 balances 136.53 ms/op 146.32 ms/op 0.93
processSlot - 1 slots 55.089 us/op 49.424 us/op 1.11
processSlot - 32 slots 2.5854 ms/op 2.6776 ms/op 0.97
getCommitteeAssignments - req 1 vs - 250000 vc 4.6711 ms/op 4.6508 ms/op 1.00
getCommitteeAssignments - req 100 vs - 250000 vc 6.4813 ms/op 6.4312 ms/op 1.01
getCommitteeAssignments - req 1000 vs - 250000 vc 6.9340 ms/op 6.8904 ms/op 1.01
computeProposers - vc 250000 18.704 ms/op 19.999 ms/op 0.94
computeEpochShuffling - vc 250000 173.80 ms/op 169.72 ms/op 1.02
getNextSyncCommittee - vc 250000 329.09 ms/op 321.38 ms/op 1.02
altair processAttestation - 250000 vs - 7PWei normalcase 42.802 ms/op 46.495 ms/op 0.92
altair processAttestation - 250000 vs - 7PWei worstcase 45.647 ms/op 47.952 ms/op 0.95
altair processAttestation - setStatus - 1/6 committees join 12.818 ms/op 12.865 ms/op 1.00
altair processAttestation - setStatus - 1/3 committees join 27.011 ms/op 26.156 ms/op 1.03
altair processAttestation - setStatus - 1/2 committees join 43.687 ms/op 39.458 ms/op 1.11
altair processAttestation - setStatus - 2/3 committees join 46.949 ms/op 47.865 ms/op 0.98
altair processAttestation - setStatus - 4/5 committees join 58.729 ms/op 61.311 ms/op 0.96
altair processAttestation - setStatus - 100% committees join 75.513 ms/op 72.185 ms/op 1.05
altair processAttestation - updateEpochParticipants - 1/6 committees join 12.923 ms/op 13.819 ms/op 0.94
altair processAttestation - updateEpochParticipants - 1/3 committees join 27.056 ms/op 26.695 ms/op 1.01
altair processAttestation - updateEpochParticipants - 1/2 committees join 31.397 ms/op 26.645 ms/op 1.18
altair processAttestation - updateEpochParticipants - 2/3 committees join 27.770 ms/op 27.232 ms/op 1.02
altair processAttestation - updateEpochParticipants - 4/5 committees join 29.443 ms/op 29.385 ms/op 1.00
altair processAttestation - updateEpochParticipants - 100% committees join 30.129 ms/op 35.243 ms/op 0.85
altair processAttestation - updateAllStatus 26.060 ms/op 19.911 ms/op 1.31
altair processBlock - 250000 vs - 7PWei normalcase 44.617 ms/op 46.073 ms/op 0.97
altair processBlock - 250000 vs - 7PWei worstcase 113.06 ms/op 119.75 ms/op 0.94
altair processEpoch - mainnet_e81889 1.0095 s/op 1.0433 s/op 0.97
mainnet_e81889 - altair beforeProcessEpoch 281.90 ms/op 383.04 ms/op 0.74
mainnet_e81889 - altair processJustificationAndFinalization 49.470 us/op 52.575 us/op 0.94
mainnet_e81889 - altair processInactivityUpdates 17.164 ms/op 19.030 ms/op 0.90
mainnet_e81889 - altair processRewardsAndPenalties 172.65 ms/op 140.16 ms/op 1.23
mainnet_e81889 - altair processRegistryUpdates 6.2870 us/op 6.9320 us/op 0.91
mainnet_e81889 - altair processSlashings 1.2440 us/op 1.7160 us/op 0.72
mainnet_e81889 - altair processEth1DataReset 972.00 ns/op 2.0220 us/op 0.48
mainnet_e81889 - altair processEffectiveBalanceUpdates 11.626 ms/op 10.059 ms/op 1.16
mainnet_e81889 - altair processSlashingsReset 9.2660 us/op 12.238 us/op 0.76
mainnet_e81889 - altair processRandaoMixesReset 11.756 us/op 15.450 us/op 0.76
mainnet_e81889 - altair processHistoricalRootsUpdate 1.3280 us/op 2.8130 us/op 0.47
mainnet_e81889 - altair processParticipationFlagUpdates 98.725 ms/op 183.89 ms/op 0.54
mainnet_e81889 - altair processSyncCommitteeUpdates 1.1180 us/op 1.6180 us/op 0.69
mainnet_e81889 - altair afterProcessEpoch 200.60 ms/op 199.20 ms/op 1.01
altair processInactivityUpdates - 250000 normalcase 71.560 ms/op 77.740 ms/op 0.92
altair processInactivityUpdates - 250000 worstcase 73.441 ms/op 71.748 ms/op 1.02
altair processParticipationFlagUpdates - 250000 anycase 88.532 ms/op 95.934 ms/op 0.92
altair processRewardsAndPenalties - 250000 normalcase 152.76 ms/op 163.87 ms/op 0.93
altair processRewardsAndPenalties - 250000 worstcase 145.03 ms/op 154.00 ms/op 0.94
altair processSyncCommitteeUpdates - 250000 358.74 ms/op 358.85 ms/op 1.00
Tree 40 250000 create 855.86 ms/op 887.47 ms/op 0.96
Tree 40 250000 get(125000) 324.20 ns/op 323.68 ns/op 1.00
Tree 40 250000 set(125000) 2.1034 us/op 2.1192 us/op 0.99
Tree 40 250000 toArray() 46.485 ms/op 48.109 ms/op 0.97
Tree 40 250000 iterate all - toArray() + loop 40.689 ms/op 42.564 ms/op 0.96
Tree 40 250000 iterate all - get(i) 119.55 ms/op 107.46 ms/op 1.11
MutableVector 250000 create 26.408 ms/op 26.133 ms/op 1.01
MutableVector 250000 get(125000) 13.631 ns/op 14.997 ns/op 0.91
MutableVector 250000 set(125000) 607.86 ns/op 551.34 ns/op 1.10
MutableVector 250000 toArray() 9.1534 ms/op 7.8000 ms/op 1.17
MutableVector 250000 iterate all - toArray() + loop 9.4562 ms/op 9.0214 ms/op 1.05
MutableVector 250000 iterate all - get(i) 3.4074 ms/op 3.4492 ms/op 0.99
Array 250000 create 5.9598 ms/op 5.2666 ms/op 1.13
Array 250000 clone - spread 2.4412 ms/op 2.0266 ms/op 1.20
Array 250000 get(125000) 1.1560 ns/op 0.89600 ns/op 1.29
Array 250000 set(125000) 1.1330 ns/op 0.89100 ns/op 1.27
Array 250000 iterate all - loop 167.83 us/op 148.07 us/op 1.13
aggregationBits - 2048 els - readonlyValues 263.07 us/op 203.59 us/op 1.29
aggregationBits - 2048 els - zipIndexesInBitList 53.954 us/op 39.157 us/op 1.38
regular array get 100000 times 67.442 us/op 67.375 us/op 1.00
wrappedArray get 100000 times 67.966 us/op 67.426 us/op 1.01
arrayWithProxy get 100000 times 29.088 ms/op 31.573 ms/op 0.92
ssz.Root.equals 1.2530 us/op 1.1630 us/op 1.08
ssz.Root.equals with valueOf() 1.6550 us/op 1.3750 us/op 1.20
byteArrayEquals with valueOf() 1.6050 us/op 1.3580 us/op 1.18
phase0 processBlock - 250000 vs - 7PWei normalcase 11.053 ms/op 10.787 ms/op 1.02
phase0 processBlock - 250000 vs - 7PWei worstcase 79.461 ms/op 74.908 ms/op 1.06
phase0 afterProcessEpoch - 250000 vs - 7PWei 209.46 ms/op 192.60 ms/op 1.09
phase0 beforeProcessEpoch - 250000 vs - 7PWei 613.61 ms/op 610.56 ms/op 1.00
phase0 processEpoch - mainnet_e58758 804.45 ms/op 781.27 ms/op 1.03
mainnet_e58758 - phase0 beforeProcessEpoch 439.96 ms/op 513.62 ms/op 0.86
mainnet_e58758 - phase0 processJustificationAndFinalization 58.885 us/op 54.755 us/op 1.08
mainnet_e58758 - phase0 processRewardsAndPenalties 101.20 ms/op 102.21 ms/op 0.99
mainnet_e58758 - phase0 processRegistryUpdates 39.267 us/op 38.902 us/op 1.01
mainnet_e58758 - phase0 processSlashings 1.4520 us/op 1.7960 us/op 0.81
mainnet_e58758 - phase0 processEth1DataReset 1.1790 us/op 1.8540 us/op 0.64
mainnet_e58758 - phase0 processEffectiveBalanceUpdates 9.8080 ms/op 8.2693 ms/op 1.19
mainnet_e58758 - phase0 processSlashingsReset 7.6980 us/op 9.1120 us/op 0.84
mainnet_e58758 - phase0 processRandaoMixesReset 11.465 us/op 15.352 us/op 0.75
mainnet_e58758 - phase0 processHistoricalRootsUpdate 1.3310 us/op 2.1790 us/op 0.61
mainnet_e58758 - phase0 processParticipationRecordUpdates 8.2530 us/op 10.975 us/op 0.75
mainnet_e58758 - phase0 afterProcessEpoch 182.77 ms/op 167.80 ms/op 1.09
phase0 processEffectiveBalanceUpdates - 250000 normalcase 10.550 ms/op 10.092 ms/op 1.05
phase0 processEffectiveBalanceUpdates - 250000 worstcase 0.5 1.4184 s/op 1.3392 s/op 1.06
phase0 processRegistryUpdates - 250000 normalcase 38.249 us/op 42.556 us/op 0.90
phase0 processRegistryUpdates - 250000 badcase_full_deposits 3.0318 ms/op 2.7582 ms/op 1.10
phase0 processRegistryUpdates - 250000 worstcase 0.5 1.7728 s/op 1.7547 s/op 1.01
phase0 getAttestationDeltas - 250000 normalcase 35.075 ms/op 41.819 ms/op 0.84
phase0 getAttestationDeltas - 250000 worstcase 35.075 ms/op 36.632 ms/op 0.96
phase0 processSlashings - 250000 worstcase 37.810 ms/op 37.842 ms/op 1.00
shuffle list - 16384 els 12.913 ms/op 12.648 ms/op 1.02
shuffle list - 250000 els 185.17 ms/op 182.73 ms/op 1.01
getEffectiveBalances - 250000 vs - 7PWei 11.445 ms/op 11.014 ms/op 1.04
computeDeltas 4.0884 ms/op 4.1272 ms/op 0.99
getPubkeys - index2pubkey - req 1000 vs - 250000 vc 2.4844 ms/op 2.4591 ms/op 1.01
getPubkeys - validatorsArr - req 1000 vs - 250000 vc 777.08 us/op 678.37 us/op 1.15
BLS verify - blst-native 1.8622 ms/op 1.6384 ms/op 1.14
BLS verifyMultipleSignatures 3 - blst-native 3.8155 ms/op 3.3693 ms/op 1.13
BLS verifyMultipleSignatures 8 - blst-native 8.2196 ms/op 7.2622 ms/op 1.13
BLS verifyMultipleSignatures 32 - blst-native 29.819 ms/op 26.414 ms/op 1.13
BLS aggregatePubkeys 32 - blst-native 39.340 us/op 34.816 us/op 1.13
BLS aggregatePubkeys 128 - blst-native 153.13 us/op 135.50 us/op 1.13
getAttestationsForBlock 82.010 ms/op 103.22 ms/op 0.79
CheckpointStateCache - add get delete 16.045 us/op 14.989 us/op 1.07
validate gossip signedAggregateAndProof - struct 4.5954 ms/op 4.4705 ms/op 1.03
validate gossip signedAggregateAndProof - treeBacked 4.4101 ms/op 4.4104 ms/op 1.00
validate gossip attestation - struct 2.0924 ms/op 2.1073 ms/op 0.99
validate gossip attestation - treeBacked 2.1188 ms/op 2.1068 ms/op 1.01
bytes32 toHexString 1.8140 us/op
bytes32 Buffer.toString(hex) 706.00 ns/op
bytes32 Buffer.toString(hex) from Uint8Array 944.00 ns/op
bytes32 Buffer.toString(hex) + 0x 673.00 ns/op
Object access 1 prop 0.35100 ns/op 0.32600 ns/op 1.08
Map access 1 prop 0.28900 ns/op 0.27800 ns/op 1.04
Object get x1000 18.172 ns/op 16.811 ns/op 1.08
Map get x1000 0.96800 ns/op 0.97300 ns/op 0.99
Object set x1000 121.24 ns/op 111.43 ns/op 1.09
Map set x1000 71.677 ns/op 67.859 ns/op 1.06
Return object 10000 times 0.37210 ns/op 0.36770 ns/op 1.01
Throw Error 10000 times 6.1173 us/op 5.9920 us/op 1.02
enrSubnets - fastDeserialize 64 bits 1.3700 us/op 1.2680 us/op 1.08
enrSubnets - ssz BitVector 64 bits 16.305 us/op 15.621 us/op 1.04
enrSubnets - fastDeserialize 4 bits 487.00 ns/op 439.00 ns/op 1.11
enrSubnets - ssz BitVector 4 bits 3.0370 us/op 2.9130 us/op 1.04
RateTracker 1000000 limit, 1 obj count per request 189.10 ns/op 181.73 ns/op 1.04
RateTracker 1000000 limit, 2 obj count per request 142.86 ns/op 136.29 ns/op 1.05
RateTracker 1000000 limit, 4 obj count per request 119.36 ns/op 112.54 ns/op 1.06
RateTracker 1000000 limit, 8 obj count per request 107.61 ns/op 100.95 ns/op 1.07
RateTracker with prune 4.6330 us/op 4.1510 us/op 1.12

by benchmarkbot/action

@@ -0,0 +1,215 @@
export type TestRunnerMemoryOpts<T> = {
Member

@wemeetagain wemeetagain Jan 3, 2022

When you're ready, maybe this should be pulled into its own package. I see you're using it in ssz too here

Contributor Author

Agreed, it should eventually, though it still needs more testing.

@dapplion dapplion changed the title Use memory efficiency toHex in pubkey2index map Use memory efficient toHex in pubkey2index map Jan 4, 2022
@dapplion dapplion merged commit c384b36 into master Jan 4, 2022
@dapplion dapplion deleted the dapplion/bytes-hex-memory branch January 4, 2022 09:08