Skip to content
This repository has been archived by the owner on Nov 6, 2020. It is now read-only.

Multithreaded snapshot creation #9239

Merged
merged 27 commits into from
Sep 13, 2018
Merged

Conversation

ngotchac
Copy link
Contributor

@ngotchac ngotchac commented Jul 27, 2018

This PR introduces multithreaded snapshot creation. The idea is that a snapshot will be divided in N parts, and the user can choose to multithread the process into T threads, each thread processing N / T parts of the snapshot.

For now, it would be divided into 16 different parts. The more parts, the bigger the snapshot will be, since there are no deduplication between parts. On Kovan, the snapshot's size increased from 1GB to 1.1GB, so a 10% increase. I still need to run this on Foundation.

Regarding performance gains, here's a little graph:
snapshot cpu

The time spent creating the snapshot is not linear with the number of threads, but with sqrt(num_threads).

This PR adds thus a CLI argument --snapshot-threads which specifies the number of threads.
If this is accepted, the snapshot manifest could be edited in a future PR to specify the differents sub-parts of the state snapshot, so that the snapshot recovery would also work in parrallel.

It should be discussed whether this performance increase (in creating, and in a near future in restoring) outweighs the size increase.

Update : on Mainnet, splitting the snapshot into 16 subparts increases the number of chunks by only 2%, from 1818 to 1859 chunks. I was able to produce a full snapshot on 8 threads in 1h10min instead of the previously 5h.

@ngotchac ngotchac added A0-pleasereview 🤓 Pull request needs code review. M4-core ⛓ Core client code / Rust. labels Jul 27, 2018
@ngotchac ngotchac requested a review from rphmeier July 27, 2018 12:23
@rphmeier
Copy link
Contributor

@ngotchac related to #8565 ?

@ngotchac
Copy link
Contributor Author

ngotchac commented Jul 27, 2018

@rphmeier Kind of, but it's really a first step that shows what performance improvements you can get, vs. the size increase drawback. (It doesn't introduce any new snapshot format, yet)

@rphmeier
Copy link
Contributor

Does it produce bit-for-bit the same snapshot regardless of the number of threads? That's a property of snapshots which we want to keep

@ngotchac
Copy link
Contributor Author

Yes it does ! It won't if you change the SNAPSHOT_SUBPART variable (which for now is 16), but if you run 1,4 or 16 threads it will produce the same chunks, so the same snapshot.

@5chdn 5chdn added this to the 2.1 milestone Jul 27, 2018
@ngotchac ngotchac requested a review from ordian July 31, 2018 10:49
@ordian
Copy link
Collaborator

ordian commented Jul 31, 2018

@ngotchac I'm going to take a look after paritytech/parity-common#13 (looking at it now) is merged, since there are too many distracting changes associated with the branch = "ng-fix-triedb-seek" switch.

Copy link
Collaborator

@ordian ordian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I've left some code style comments

block_guard.join().map(|block_hashes| (state_hashes, block_hashes))
})
// The number of threads must be between 1 and SNAPSHOT_SUBPARTS
let num_threads: usize = cmp::max(1, cmp::min(processing_threads, SNAPSHOT_SUBPARTS));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe use assert!(processing_threads >= 1, "...") instead of this cmp::max, since it would be a logical mistake to pass 0 as processing_threads?


for subpart_chunk in subparts_c.chunks(num_threads) {
if subpart_chunk.len() > thread_idx {
let part = subpart_chunk[thread_idx];
Copy link
Collaborator

@ordian ordian Aug 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe something like this would be easier to read (and we don't need to allocate subparts):

let state_guard = scope.spawn(move || -> Result<Vec<H256>, Error> {
  let mut chunk_hashes = Vec::new();
  for part in (thread_idx..SNAPSHOT_SUBPARTS).step_by(num_threads) {
    ...
  }
}

Note, that step_by was only stabilized in rust 1.28.

for guard in state_guards {
let mut part_state_hashes = guard.join()?.clone();
state_hashes.append(&mut part_state_hashes);
}
Copy link
Collaborator

@ordian ordian Aug 3, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can avoid cloning:

		let mut state_hashes = Vec::new();

		for guard in state_guards {
			let part_state_hashes = guard.join()?;
			state_hashes.extend(part_state_hashes);
		}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

extend expects a mutable reference :/

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oups, you're right, sorry.

let mut seek_to = None;

if let Some(part) = part {
let part_offset = 256 / SNAPSHOT_SUBPARTS;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe name 256? I don't like magic constants.

ethcore/src/snapshot/mod.rs Show resolved Hide resolved
@@ -263,10 +318,12 @@ impl<'a> StateChunker<'a> {

/// Walk the given state database starting from the given root,
/// creating chunks and writing them out.
/// `part` is a number between 0 and 15, which describe which part of

This comment was marked as resolved.

seek_from[0] = (part * part_offset) as u8;
account_iter.seek(&seek_from)?;

// Set the upper-bond, except for the last part
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: upper bound

account_iter.seek(&seek_from)?;

// Set the upper-bond, except for the last part
if part < SNAPSHOT_SUBPARTS - 1 {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we could get rid of this if, if we make seek_to inclusive upper bound, e.g. the check would be

			if account_key[0] > seek_to {
				break;
			}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But seek_to wouldn't make sense if part == None, so it might as well be an Option

@ordian ordian added A5-grumble 🔥 Pull request has minor issues that must be addressed before merging. and removed A0-pleasereview 🤓 Pull request needs code review. labels Aug 3, 2018
@5chdn 5chdn added A6-mustntgrumble 💦 Pull request has areas for improvement. The author need not address them before merging. and removed A0-pleasereview 🤓 Pull request needs code review. labels Sep 6, 2018
@ngotchac
Copy link
Contributor Author

ngotchac commented Sep 7, 2018

@ordian I updated it so that the default number of snapshot threads is half the number of cpus

Copy link
Collaborator

@ordian ordian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the case of num_cpus::get() / 2 == 0 will be handled properly in snapshot_config fn, so it's fine.
I'm also ok with Cargo.lock updates as long as tests pass, since we need to update dependencies from time to time anyway.

@ordian ordian added A8-looksgood 🦄 Pull request is reviewed well. and removed A6-mustntgrumble 💦 Pull request has areas for improvement. The author need not address them before merging. labels Sep 7, 2018
@5chdn 5chdn modified the milestones: 2.1, 2.2 Sep 11, 2018
@5chdn 5chdn added B0-patchthis 🕷 B9-blocker 🚧 This pull request blocks the next release from happening. Use only in extreme cases. labels Sep 12, 2018
@ordian ordian added A7-looksgoodcantmerge 🙄 Pull request is reviewed well, but cannot be merged due to conflicts. and removed A8-looksgood 🦄 Pull request is reviewed well. labels Sep 12, 2018
@5chdn 5chdn mentioned this pull request Sep 12, 2018
21 tasks
Copy link
Contributor

@cheme cheme left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

It could be good to have a comment somewhere indicating that it is no use to put more than 16 threads (maybe in the help or next to the two global variables).

@5chdn 5chdn added the B7-releasenotes 📜 Changes should be mentioned in the release notes of the next minor version release. label Sep 13, 2018
@ordian ordian added A8-looksgood 🦄 Pull request is reviewed well. and removed A7-looksgoodcantmerge 🙄 Pull request is reviewed well, but cannot be merged due to conflicts. labels Sep 13, 2018
@5chdn 5chdn merged commit 4ddd69c into master Sep 13, 2018
@5chdn 5chdn deleted the ng-multithread-snapshot-creation branch September 13, 2018 10:58
5chdn pushed a commit that referenced this pull request Sep 13, 2018
* Add Progress to Snapshot Secondary chunks creation

* Use half of CPUs to multithread snapshot creation

* Use env var to define number of threads

* info to debug logs

* Add Snapshot threads as CLI option

* Randomize chunks per thread

* Remove randomness, add debugging

* Add warning

* Add tracing

* Use parity-common fix seek branch

* Fix log

* Fix tests

* Fix tests

* PR Grumbles

* PR Grumble II

* Update Cargo.lock

* PR Grumbles

* Default snapshot threads to half number of CPUs

* Fix default snapshot threads // min 1
5chdn added a commit that referenced this pull request Sep 17, 2018
* parity-version: mark 2.1.0 track beta

* ci: update branch version references

* docker: release master to latest

* Fix checkpointing when creating contract failed (#9514)

* ci: fix json docs generation (#9515)

* fix typo in version string (#9516)

* Update patricia trie to 0.2.2 crates. Default dependencies on minor
version only.

* Putting back ethereum tests to the right commit

* Enable all Constantinople hard fork changes in constantinople_test.json (#9505)

* Enable all Constantinople hard fork changes in constantinople_test.json

* Address grumbles

* Remove EIP-210 activation

* 8m -> 5m

* Temporarily add back eip210 transition so we can get test passed

* Add eip210_test and remove eip210 transition from const_test

* In create memory calculation is the same for create2 because the additional parameter was popped before. (#9522)

* deps: bump fs-swap and kvdb-rocksdb

* Multithreaded snapshot creation (#9239)

* Add Progress to Snapshot Secondary chunks creation

* Use half of CPUs to multithread snapshot creation

* Use env var to define number of threads

* info to debug logs

* Add Snapshot threads as CLI option

* Randomize chunks per thread

* Remove randomness, add debugging

* Add warning

* Add tracing

* Use parity-common fix seek branch

* Fix log

* Fix tests

* Fix tests

* PR Grumbles

* PR Grumble II

* Update Cargo.lock

* PR Grumbles

* Default snapshot threads to half number of CPUs

* Fix default snapshot threads // min 1

* correct before_script for nightly build versions (#9543)

- fix gitlab array of strings syntax error
- get proper commit id
- avoid colon in stings

* Remove initial token for WS. (#9545)

* version: mark release critical

* ci: fix rpc docs generation 2 (#9550)

* Improve P2P discovery (#9526)

* Add `target` to Rust traces

* network-devp2p: Don't remove discovery peer in main sync

* network-p2p: Refresh discovery more often

* Update Peer discovery protocol

* Run discovery more often when not enough nodes connected

* Start the first discovery early

* Update fast discovery rate

* Fix tests

* Fix `ping` tests

* Fixing remote Node address ; adding PingPong round

* Fix tests: update new +1 PingPong round

* Increase slow Discovery rate
Check in flight FindNode before pings

* Add `deprecated` to deprecated_echo_hash

* Refactor `discovery_round` branching

* net_version caches network_id to avoid redundant aquire of sync read lock (#9544)

* net_version caches network_id to avoid redundant aquire of sync read lock, #8746

* use lower_hex display formatting for net_peerCount rpc method

* Increase Gas-floor-target and Gas Cap (#9564)

+ Gas-floor-target increased to 8M by default

+ Gas-cap increased to 10M by default

* Revert to old parity-tokio-ipc.

* Downgrade named pipes.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A8-looksgood 🦄 Pull request is reviewed well. B7-releasenotes 📜 Changes should be mentioned in the release notes of the next minor version release. B9-blocker 🚧 This pull request blocks the next release from happening. Use only in extreme cases. M4-core ⛓ Core client code / Rust.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants