-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Optimize memory buffer, simplify set32, use sha256-simd #7060
Conversation
core/vm/memory.go
Outdated
copy(m.store[offset:offset+32], []byte{0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0}) | ||
// Fill in relevant bits | ||
val.WriteToSlice(m.store[offset:]) | ||
var buf [32]byte |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI: we have zeroes
global variable in this file
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. I changed to use that instead.
@@ -304,7 +313,8 @@ func (in *EVMInterpreter) Run(contract *Contract, input []byte, readOnly bool) ( | |||
err = nil // clear stop token error | |||
} | |||
|
|||
return res, err | |||
ret = append(ret, res...) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mabye good. But I need run some experiment with Grafana to see an impact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for taking a look!
For the Celeron N5100 I'm left with github.com/ledgerwatch/erigon/crypto/bn256/cloudflare.gfpMul
dominating 47% of the CPU profiling while before I had memory.go related calls. However I must admit that the SHA/SIMD change made a larger difference than the memory pool. This machine is at 100% CPU and no iowait as it has a good nvme on a slow CPU.
Another machine syncing Ethereum mainnet on Intel 11800h using mechanical HDD with nvme lvm cache just had a minor CPU usage reduction as it is severely limited by the slow HDD IO.
If desired, I can split the changes into smaller PRs.
pls run |
Maybe Reset() or Resize() need set zeroes to re-used region. Or maybe when we get buf from pool: fill it by zeroes. |
It was missing an offset on Set32 after I changed to use the |
core/vm/memory.go
Outdated
@@ -59,7 +59,7 @@ func (m *Memory) Set32(offset uint64, val *uint256.Int) { | |||
} | |||
// Zero the memory area | |||
copy(m.store[offset:offset+32], zeroes) | |||
val.WriteToSlice(m.store) | |||
val.WriteToSlice(m.store[offset:]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not offset:offset+32
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed, thanks. WriteToSlice will write 32 bytes but I agree that specifying explicitly reads better, it was just the way I found it.
I will try run real test tomorrow. Thank you |
@shyba thanks. if you have any idea - how we can avoid copy-on-return - feel free to PR: https://github.com/ledgerwatch/erigon/pull/7060/files#diff-121e1c875cf3bed2f44305ab103a23126e330633fc45011cb5b1041f7b1ca119R316 |
) Hi, I'm syncing Gnosis on a Celeron N5100 to get familiar with the codebase. In the process I managed to optimize some things from profiling. Since I'm not yet on the project Discord, I decided to open this PR as a suggestion. This pass all tests here and gave me a nice boost for that platform, although I didn't have time to benchmark it yet. * reuse VM Memory objects with sync.Pool. It starts with 4k as `evmone` [code suggested](https://github.com/ethereum/evmone/blob/0897edb00179ac3d27f27c82749119003df851e7/lib/evmone/execution_state.hpp#L49) as a good value. * set32 simplification: mostly cosmetic * sha256-simd: Celeron has SHA instructions. We should probably do the same for torrent later, but this already helped as it is very CPU bound on such a low end processor. Maybe that helps ARM as well.
Filter method with changed sequence of arguments add Plato hardfork support Fix chapel sync issues change back docker tag #2 change back docker tag remove entrypoint from Dockerfile exitChan + logging add starter to PATH add docker tag starter with errortrack flag fix Dockerfile luban fork support; light client features; upstream erigon main repo add grpc-health-probe to Dockerfile add planck block number to mainnet config use replace mechanism add nolint annotation add planck contract upgrades and backoff time, fix --txpool.commit.every panic change ledgerwatch => chainstack erigon-lib use replace mechanism for better upstream support soon use replace mechanism add default value to deprecated txpool config add nolint annotation add planck contract upgrades and backoff time, fix --txpool.commit.every panic add ics23 proof type, add tests, key validators etc change ledgerwatch => chainstack erigon-lib close memdb in ./cmd/evm stop using olddb in simulated backend (erigontech#7219) e3: enable simulated backed (erigontech#7218) sqlite version up e3: deadlock fix Skip test in fork_graph_test.go (erigontech#7216) Fork-choice graph oriented implementation (erigontech#7212) e4: step 1 to run tests (erigontech#7209) e3: one more reconst deadlock fix (erigontech#7207) Break dependency of `ethcfg` package to `core`/`consensus`/etc... move genesis struct to 'types' package (erigontech#7206) eip-4844: NewEVMBlockContext now expects excessDataGas (erigontech#7203) Small change in core.NewEVMBlockContext and now it expects excessDataGas. This will be used in state transition to compute data fee for eip-4844 data blobs. The logic that computes it will be added in the next PRs. e4: add tests flag e4: add flag in tests Add maxNumberOfFailedWithdrawalsToProcess to executeSystemWithdrawals (erigontech#7197) See gnosischain/specs#3 log BLS verification failure (erigontech#7196) e3: one more reconst deadlock fix add netgo tag hopefully fix erigontech#7130 (erigontech#7193) i think issue is golang/go#30310 (comment) i was able to reproduce on my machine using go releaser dry run, and adding this build flag fixed it. apparently its some glibc issue, what a surprise! :) Co-authored-by: a <a@tuxpa.in> Remove propagation of lightclient updates (erigontech#7192) eip-4844: small additions and modified gaspool (erigontech#7190) This PR contains very small EIP-4844 additions. GasPool is modified and now it is a struct with 2 fields "gas" and "dataGas" (blobs are priced in dataGas). ExcessDataGas block header field added. ExcessDataGas needed to compute the data gas price. EIP-4844 helper functions are added as well. removed --el.enabled (erigontech#7187) e3: recon deadlock fix (erigontech#7186) Added reverse beacon changeset for beacon state rewind (erigontech#7185) Added changesets for beacon chain to implement memory efficient fork choice kv_temporal lost commit clean prevent infinity unwind when no --unwind.every flag set integration temporal db add constant kv.Unlim=-1 (erigontech#7183) enable more linters erigontech#954 (erigontech#7179) DomainRange DomainRange e3: DomainGetAsOf, DomainRange (erigontech#7177) fix(release): http-https redirect working (erigontech#7176) Fixed storage for download (erigontech#7175) e3: remove some iterator (erigontech#7174) readme dates add eth_getFilterLogs to docs (erigontech#7171) add eth_getFilterLogs to docs panic in trace_adhoc clean save attempt to finx bindtest (erigontech#7167) manual integration tests run allow CI v4 to fix txpool startup race (erigontech#7165) ``` ================== WARNING: DATA RACE Write at 0x00c0005dc570 by goroutine 273633: github.com/ledgerwatch/erigon/turbo/shards.(*Accumulator).Reset() github.com/ledgerwatch/erigon/turbo/shards/state_change_accumulator.go:32 +0x5ec github.com/ledgerwatch/erigon/turbo/stages.StageLoopStep() github.com/ledgerwatch/erigon/turbo/stages/stageloop.go:163 +0x3bd github.com/ledgerwatch/erigon/turbo/stages.StageLoop() github.com/ledgerwatch/erigon/turbo/stages/stageloop.go:94 +0x197 github.com/ledgerwatch/erigon/eth.(*Ethereum).Start.func1() github.com/ledgerwatch/erigon/eth/backend.go:1058 +0x10b Previous write at 0x00c0005dc570 by goroutine 273468: github.com/ledgerwatch/erigon/turbo/shards.(*Accumulator).Reset() github.com/ledgerwatch/erigon/turbo/shards/state_change_accumulator.go:32 +0x2ef github.com/ledgerwatch/erigon/turbo/shards.(*Accumulator).SendAndReset() github.com/ledgerwatch/erigon/turbo/shards/state_change_accumulator.go:41 +0x228 github.com/ledgerwatch/erigon/eth.New.func8() github.com/ledgerwatch/erigon/eth/backend.go:620 +0x80e ``` fix e3 test go 1.19 atomics (erigontech#7164) linter up test simplify Add eth_getProof support for historical blocks (erigontech#7115) This PR starts with a few small commits of code cleanup. Reviewed separately they should hopefully obviously be functionally no-ops. I'm happy to strip these out and submit them separately if desired. The final commit is to add support for older blocks as a parameter to eth_getProof. In order to compute proofs, the function leverages the staged sync unwinding code to bring the hashed state table back to its historic state, as well as to build a list of trie nodes which need to be invalidated/re-computed in the trie computation. Because these operations could be expensive for very old blocks, it limits the distance proofs are allowed from the head. It also adds some additional checks for correctness, as well as tests which verify the implementation. This was discussed a bit on Discord in the db-format topic. --------- Co-authored-by: Jason Yellick <jason@enya.ai> Go 1.18 drop (erigontech#7159) --txpool.commit.every panic handling (erigontech#7163) [Gnosis] Don't call ExecuteSystemWithdrawals before Shanghai (erigontech#7160) This is a patch to PR erigontech#6940. Withdrawal contract should not be called for pre-Shanghai block. The issue was found on gnosis_withdrawals_devnet_2 (PR erigontech#7150), causing ``` [WARN] [03-22|10:44:51.574] [7/15 Execution] Execution failed block=51035 hash=0xa8fb9e58eb734b7ce4e2e6260ad20e07a16039325f9924cc18ea61fa2eb5ee90 err="execution reverted" ``` e3: close context e3: domain range api simplify (erigontech#7158) e3: unionKV limit (erigontech#7157) e3: union limit (erigontech#7156) e3: remove settings table (erigontech#7155) e4: reset state domain: to use history api e4: to use serializev3 (erigontech#7154) add flag txpool.commit_every (erigontech#7062) Adds flag _--txpool.commit_every_ to control how often transactions should be committed to the storage [Related Issue](erigontech#7002) e3: exec to use half of CPU's by default Total difficulty can be huge on Gnosis (erigontech#7149) This fixes the following issue observed on gnosis_withdrawals_devnet_2: ``` [DBUG] [03-21|09:24:15.060] Handling incoming message stream=RecvMessage err="newBlock66: too large block TD: bitlen 144" ``` Update ethereum/tests to v12 (erigontech#7148) [Tests update 12: Shanghai tests](https://github.com/ethereum/tests/releases/tag/v12) Drop Default from GenesisBlock* functions (erigontech#7147) Small cosmetic changes and clean-ups fixed fork consensus spec test (erigontech#7143) removed stream handler (erigontech#7142) added tests for ssz_static in consensus tests and fixed beacon blocks encoding (erigontech#7141) Update go deps (erigontech#7138) Updated CI to 1.19.0 (erigontech#7139) removed database functionality from lightclient (erigontech#7135) Banning peers sending bad gossip (erigontech#7134) feat(docker): add debug.Dockerfile with delve (erigontech#7125) - removes dbtools from debug container - adds delve to debug container refactored sentinel gossip and only connect to nimbus now (erigontech#7127) Added tool for processing for all mainnet beacon blocks (erigontech#7095) --rpc.returndata.limit torrent: suppress some warning (erigontech#7124) grpc_middleware_up (erigontech#7123) fix(release): download page https redirect (erigontech#7120) Updated initial traefik config so the http -> https redirect on download.erigon.sh will work. E4 metrics upd (erigontech#7122) get localnode address up front on creation to save potential data race (erigontech#7111) the log line here was the culprit for the race. made sense to just capture this on localnode creation instead and hold onto it for when the server is started. ran test with `-race` and `-count=5000` to double check and all looks good fix(release): run download page update separately (erigontech#7112) Checksum not working as the file isn't available in the assets when run in the same job. txpool: senders batch commit optimize (erigontech#7118) e3: don't loose nil-value in reconstitution (erigontech#7117) e3: history no auto-increment (erigontech#7097) e3: less merge logs don't show "block number" in txpool logs it confusing users check for nil stream when running the null check in handler (erigontech#7105) up goprotobuf check for nil before returning invalid json in rpc streaming calls (erigontech#7104) should handle nil having already been written in any rpc call before writing it again causing invalid json to be returned. torrent: don't cancel storage, because lib can't handle such error and can graceful-shutdown anyway (erigontech#7102) fix for reading yaml/toml config in cmd/integration (erigontech#7101) torrent lib version up to remove some warning log (erigontech#7100) move more parts to lru2 (erigontech#7098) Optimize memory buffer, simplify set32, use sha256-simd (erigontech#7060) Hi, I'm syncing Gnosis on a Celeron N5100 to get familiar with the codebase. In the process I managed to optimize some things from profiling. Since I'm not yet on the project Discord, I decided to open this PR as a suggestion. This pass all tests here and gave me a nice boost for that platform, although I didn't have time to benchmark it yet. * reuse VM Memory objects with sync.Pool. It starts with 4k as `evmone` [code suggested](https://github.com/ethereum/evmone/blob/0897edb00179ac3d27f27c82749119003df851e7/lib/evmone/execution_state.hpp#L49) as a good value. * set32 simplification: mostly cosmetic * sha256-simd: Celeron has SHA instructions. We should probably do the same for torrent later, but this already helped as it is very CPU bound on such a low end processor. Maybe that helps ARM as well. read metrics config from yaml file (erigontech#7089) Backfill eth getproof tests (erigontech#7092) This PR adds missing tests for eth_getProof and does some mild refactoring with switching from strings to more strict types. It's likely best/most easily reviewed commit by commit. Note, the tests include quite a number of helper types and functions for doing the proof validation. This is largely because unlike Geth, Erigon's approach to trie computations only requires serializing the trie nodes, not deserializing them. Consequently, it wasn't obvious how to leverage the existing trie code for doing deserialization and proof checks. I checked on Discord, but, there were no suggestions. Of course, any feedback is welcome and I'd be happy to remove this code if it can be avoided. Additionally, I've opted to change the interface type for `GetProof` to use a `common.Hash` for the storage keys instead of a `string`. I _think_ this should be fairly safe, as until very recently it was unimplemented. That being said, since it's an interface, it has the potential to break other consumers, anyone who was generating mocks against it etc. There's additionally a `GetStorageAt` that follows the same parameter. I'd be happy to submit a PR modifying this one as well if desired. Also, as a small note, there is test code for checking storage proofs, but, storage proofs aren't currently supported by the implementation. My hope is to add storage proofs and historic proofs in a followup PR. --------- Co-authored-by: Jason Yellick <jason@enya.ai> Fix broken link in doc (erigontech#7094) The doc in [dupsort.md](https://github.com/ledgerwatch/erigon/blob/devel/docs/programmers_guide/dupsort.md#erigon) points to a non-existent `AbstractKV.md` As best as I can tell, the `AbstractKV.md` was reworked and renamed in commit 0bc61c0 then, this was later moved into erigon-lib. This commit simply repoints the doc at this new location. Co-authored-by: Jason Yellick <jason@enya.ai> Delete retain_list_builder.go (erigontech#7096) I was walking the code to try to understand in a bit more detail how the state root hash is constructed and stumbled across `retain_list_builder.go` as a consumer of the retain list APIs. But, as far as I can tell, this file doesn't seem to actually be used anywhere (including tests), and, it's seen no development (other than import fixes) since 2020 or so. All linting and tests still pass for me locally without it, so, I believe it's safe to simply remove. Co-authored-by: Jason Yellick <jason@enya.ai> Added hard fork transition support to Erigon-CL. (erigontech#7088) e3: split "changed keys" iterator to simplify (erigontech#7086) check for free messages when calling trace_transaction (erigontech#7073) Gateway reported an issue with a trace returning an odd result, similar to the recent problem we'd seen with gnosis. I found that debug_traceTransaction worked fine so found where the differences were. trace_transaction wasn't checking for service transactions so the trace failed around fees when it shouldn't. A number of code paths use callManyTransactions so they should all now check for service messages where needed. erigon-lib up e3: reconst: run workers in errgroup (erigontech#7071) e3: simplify wal (erigontech#7085) go mod tidy Downloader main loop wait on close (erigontech#7082) fix typo in eth_call error (erigontech#7084) Spotted this typo when poking at Cloudflare's public gateway and mapped it back here 🙈 Added partial SSZ library (erigontech#7083) Implements SSZ encode/decoding and hashtreeroot for simple list-lacking data structures. does not account offset case. Added phase0 support to Erigon-CL. (erigontech#7066) Added phase 0 support. save nil ptr in delete ancient (erigontech#7081) e4: added some metrics to code (erigontech#7078) Remove ETC-specific DAOForkSupport=false functionality (erigontech#7075) Remove archaic eip150Hash functionality (erigontech#7074) save sentry: handle "retry later" grpc stream (erigontech#6852) params: remove EF azure bootnodes (erigontech#7061) Same has been done in geth: ethereum/go-ethereum#26828 cleanup fix build e3: aggressive pruning (erigontech#7070) up moq version, up some lru version (erigontech#7069) run E4 via integration binary (erigontech#7063) added subcommand `state_domains` of `bin/integration` to start processing of existing chaindata with use of state domains and commitment. It creates two directory in `datadir`: `state` and `statedb` for files and mdbx respectively. This version runs blocks one after another and produces merged files. Want to add some metrics to export later. Added SSZ support for Phase0 state. (erigontech#7065) Partial EIP1186 eth_getProof implementation (erigontech#6560) This is a partial implementation of eth_getProof (see issue erigontech#1349), supporting only a request for the latest block and an empty list of storage keys (i.e. Account proof only). I don't know if there's a better way of implementing this, but this was what I could come up with. Posting it here in case it's useful. Example output: ``` > eth.getProof("0x67b1d87101671b127f5f8714789C7192f7ad340e", [], 'latest') { accountProof: ["0xf90131a0252c9d4ed347b4cf3fdccaea3ccef0a507e6bd4dbe4dcd98609b7195347c4062a0ab8cdb808c8303bb61fb48e276217be9770fa83ecf3f90f2234d558885f5abf18080a01a697e814758281972fcd13bc9707dbcd2f195986b05463d7b78426508445a04a0b5d7a91be5ee273cce27e2ad9a160d2faadd5a6ba518d384019b68728a4f62f4a0c2c799b60a0cd6acd42c1015512872e86c186bcf196e85061e76842f3b7cf86080a0e73919d9f472eec11f6da95518503f5527a98b9428f7a02c4f55bf51854214e480a06301b39b2ea8a44df8b0356120db64b788e71f52e1d7a6309d0d2e5b86fee7cb8080a01b7779e149cadf24d4ffb77ca7e11314b8db7097e4d70b2a173493153ca2e5a0a066a7662811491b3d352e969506b420d269e8b51a224f574b3b38b3463f43f0098080", "0xf8518080808080a0a00135c9ec2655cb6a47ab7ad27d6fc150d9cba8b3d4a702e879179116a68a60808080808080a02fb46956347985b9870156b5747712899d213b1636ad4fe553c63e33521d567a80808080", "0xf873a02056274a27dd7524955417c11ecd917251cc7c4c8310f4c7e4bd3c304d3d9a79b850f84e808a021e19e0c9bab2400000a056e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421a0c5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470"], address: "0x67b1d87101671b127f5f8714789c7192f7ad340e", balance: "0x21e19e0c9bab2400000", codeHash: "0xc5d2460186f7233c927e7db2dcc703c0e500b653ca82273b7bfad8045d85a470", nonce: "0x0", storageHash: "0x56e81f171bcc55a6ff8345e692c0f86e5b48e01b996cadc001622fb5e363b421", storageProof: [] } > eth.getBlock('latest').stateRoot "0x6a0673c691edfa4c4528323986bb43c579316f436ff6f8b4ac70854bbd95340b" ``` e3: use iterators composition in invIndex.Range (erigontech#7056) graphql: add Tx Receipts Logs to output (erigontech#7059) Added pending attestation object. (erigontech#7058) mdbx: RangeDupSort iterator (erigontech#7054) --torrent.staticpeers (erigontech#7052) Added Capella specs support to Erigon-CL (erigontech#7051) Passing consensus-specs tests for Capella. Processing of withdrawals and ExecutionChanges. efficient non-validation implemented. Refactored: ExecutionPayload/ExecutionPayloadHeader. save up grpc version (erigontech#7043) save attempt to fix TestGolangBindings (erigontech#7041) etl: distinct empty values from nil (erigontech#7039) Reverts erigontech#7038 Revert "etl: distinct empty values from nil" (erigontech#7038) Reverts erigontech#6934 e3: native map instead of btree where can (because e2 experience shows - it's faster) (erigontech#7010) etl: distinct empty values from nil (erigontech#6934)
Hi,
I'm syncing Gnosis on a Celeron N5100 to get familiar with the codebase. In the process I managed to optimize some things from profiling.
Since I'm not yet on the project Discord, I decided to open this PR as a suggestion. This pass all tests here and gave me a nice boost for that platform, although I didn't have time to benchmark it yet.
evmone
code suggested as a good value.