Use jobserver in wasm-builder to limit concurrency of spawned cargo processes #4946

Merged on Jul 24, 2024 (3 commits)

tmpolaczyk
Contributor

When building multiple runtimes in parallel, each build tries to use the full concurrency set by the parent cargo process. For example, on a system with 8 CPU cores, building 3 runtimes in parallel creates up to 8 * 3 = 24 tasks. The system can then hang because of the high CPU and memory usage.

This PR lets substrate_wasm_builder use the same jobserver as the parent cargo process, so that all cargo invocations share one concurrency pool. On a system with 8 cores, there will never be more than 8 tasks running at the same time.
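To illustrate the shared-pool idea, here is a minimal simulation using only the standard library. It is not the code from this PR and does not use the jobserver crate: `TokenPool` and `peak_concurrency` are hypothetical names, threads stand in for compiler tasks, and the real protocol's implicit per-process token is ignored for simplicity.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Toy stand-in for a jobserver: a fixed number of tokens shared by everyone.
struct TokenPool {
    available: Mutex<usize>,
    released: Condvar,
}

impl TokenPool {
    fn new(tokens: usize) -> Self {
        TokenPool { available: Mutex::new(tokens), released: Condvar::new() }
    }

    /// Block until a token is free, then take it.
    fn acquire(&self) {
        let mut available = self.available.lock().unwrap();
        while *available == 0 {
            available = self.released.wait(available).unwrap();
        }
        *available -= 1;
    }

    /// Return a token to the pool and wake one waiter.
    fn release(&self) {
        *self.available.lock().unwrap() += 1;
        self.released.notify_one();
    }
}

/// Simulate `builders` cargo processes that each want `tasks_per_builder`
/// tasks at once, all drawing from one pool of `tokens`. Returns the peak
/// number of tasks that were running at the same time.
fn peak_concurrency(tokens: usize, builders: usize, tasks_per_builder: usize) -> usize {
    let pool = Arc::new(TokenPool::new(tokens));
    let running = Arc::new(AtomicUsize::new(0));
    let peak = Arc::new(AtomicUsize::new(0));

    let handles: Vec<_> = (0..builders * tasks_per_builder)
        .map(|_| {
            let (pool, running, peak) = (pool.clone(), running.clone(), peak.clone());
            thread::spawn(move || {
                pool.acquire();
                let now = running.fetch_add(1, Ordering::SeqCst) + 1;
                peak.fetch_max(now, Ordering::SeqCst);
                thread::sleep(Duration::from_millis(5)); // pretend to compile
                running.fetch_sub(1, Ordering::SeqCst);
                pool.release();
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    peak.load(Ordering::SeqCst)
}
```

Calling `peak_concurrency(8, 3, 8)` models 3 runtimes that each want 8 tasks: 24 tasks are requested in total, but because a task only counts as running while it holds a token, the returned peak cannot exceed 8.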

The implementation is roughly based on cargo's, but with less unsafe code.

This can be tested by telling cargo to use half of the CPU cores, for example cargo build -j4 on an 8-core machine. Before this PR, building 2 runtimes in parallel uses 100% CPU; after this PR, it always uses 50%.
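The numbers in this description can be checked with a small back-of-the-envelope model. The function names below are hypothetical, and the model assumes that every running task keeps exactly one core fully busy, which real builds only approximate.

```rust
/// Number of tasks that may run at once. Without a shared jobserver every
/// parallel build applies the `-j` limit on its own; with one, the limit is
/// shared by all of them.
fn max_tasks(jobs: usize, parallel_builds: usize, shared_jobserver: bool) -> usize {
    if shared_jobserver {
        jobs
    } else {
        jobs * parallel_builds
    }
}

/// Expected CPU utilisation in percent, assuming one fully busy core per task.
fn expected_cpu_percent(
    cores: usize,
    jobs: usize,
    parallel_builds: usize,
    shared_jobserver: bool,
) -> usize {
    let busy_cores = max_tasks(jobs, parallel_builds, shared_jobserver).min(cores);
    busy_cores * 100 / cores
}
```

Under these assumptions, `max_tasks(8, 3, false)` gives the 24 tasks from the first example, and `expected_cpu_percent(8, 4, 2, false)` and `expected_cpu_percent(8, 4, 2, true)` give the 100% and 50% figures for the `-j4` test.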

@tmpolaczyk tmpolaczyk requested a review from koute as a code owner July 4, 2024 15:38
Member

@bkchr bkchr left a comment

Really good idea! Never knew that it was that easy to integrate it :D


JOBSERVER.get_or_init(|| {
    // Unsafe because it deals with raw fds
    unsafe { jobserver::Client::from_env() }
Member

According to the documentation, this should be called as early as possible in the process to get the correct fd. Given that, we should move this call to a different location. I think the top of build_project should be fine.
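A rough sketch of the suggested shape, using only the standard library: the jobserver information is looked up once, at the very top of `build_project`, and cached for later use. This is not the code from this PR. `JobserverAuth`, `parse_makeflags`, and `init_jobserver` are hypothetical stand-ins for what `jobserver::Client::from_env()` does internally, and the parser only covers the common `--jobserver-auth=` and `--jobserver-fds=` forms passed via `CARGO_MAKEFLAGS`.

```rust
use std::sync::OnceLock;

/// Hypothetical stand-in for what a jobserver client learns from the environment.
#[derive(Debug, Clone, PartialEq)]
enum JobserverAuth {
    /// A pair of inherited pipe file descriptors.
    Pipe { read_fd: i32, write_fd: i32 },
    /// A named pipe identified by its path.
    Fifo(String),
}

/// Simplified parser for the jobserver argument inside a MAKEFLAGS-style string.
fn parse_makeflags(flags: &str) -> Option<JobserverAuth> {
    flags.split_whitespace().find_map(|arg| {
        let value = arg
            .strip_prefix("--jobserver-auth=")
            .or_else(|| arg.strip_prefix("--jobserver-fds="))?;
        if let Some(path) = value.strip_prefix("fifo:") {
            return Some(JobserverAuth::Fifo(path.to_string()));
        }
        let (read, write) = value.split_once(',')?;
        Some(JobserverAuth::Pipe {
            read_fd: read.parse().ok()?,
            write_fd: write.parse().ok()?,
        })
    })
}

static JOBSERVER: OnceLock<Option<JobserverAuth>> = OnceLock::new();

/// Look up the jobserver once and cache the result for the whole process.
fn init_jobserver() -> Option<&'static JobserverAuth> {
    JOBSERVER
        .get_or_init(|| {
            std::env::var("CARGO_MAKEFLAGS")
                .ok()
                .and_then(|flags| parse_makeflags(&flags))
        })
        .as_ref()
}

fn build_project() {
    // First thing in the function, before any other work opens files
    // or spawns processes.
    let _jobserver = init_jobserver();
    // ... the rest of the build would follow here ...
}
```

Caching the result in a `OnceLock` means later callers get the value discovered at startup instead of re-reading the environment at an arbitrary point in the build.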

@bkchr bkchr added the T0-node This PR/Issue is related to the topic “node”. label Jul 20, 2024
@bkchr bkchr added R0-silent Changes should not be mentioned in any release notes and removed T0-node This PR/Issue is related to the topic “node”. labels Jul 24, 2024
@bkchr bkchr enabled auto-merge July 24, 2024 08:14
@bkchr bkchr added this pull request to the merge queue Jul 24, 2024
Merged via the queue into paritytech:master with commit fee481f Jul 24, 2024
155 of 160 checks passed
TarekkMA pushed a commit to moonbeam-foundation/polkadot-sdk that referenced this pull request Aug 2, 2024
…rocesses (paritytech#4946)

When building multiple runtimes in parallel, each of them will try to
use the concurrency set by the parent cargo process. For example, in a
system with 8 cpu cores, building 3 runtimes in parallel creates 8 * 3
tasks. This results in the system hanging because of the high cpu and
memory usage.

This PR allows the substrate_wasm_builder to use the same [jobserver][1]
as the parent cargo process, making all invocations of cargo share the
same concurrency pool. So in a system with 8 cores, there will never be
more than 8 tasks running at the same time.

Implementation roughly based on [cargo][2] but with less unsafe.

This can be tested by telling cargo to use half the cpu cores, like
`cargo build -j4` in an 8 core machine. Before this PR it will use 100%
cpu when building 2 runtimes in parallel, after this PR it will always
use 50%.

[1]: https://doc.rust-lang.org/cargo/reference/build-scripts.html#jobserver
[2]: https://github.com/rust-lang/cargo/blob/d1b5f0759eedf5f1126c781c64232856956069ad/src/cargo/util/context/mod.rs#L271

---------

Co-authored-by: Bastian Köcher <git@kchr.de>
ordian added a commit that referenced this pull request Aug 6, 2024
* master: (27 commits)
  Bridges improved tests and nits (#5128)
  Fix misleading comment about RewardHandler in epm config (#3095)
  Introduce a workflow updating the wishlist leaderboards (#5085)
  membership: Restructure pallet into separate files (#4536)
  Fix after ring-proof api change (#5126)
  Bump paritytech/review-bot from 2.4.0 to 2.5.0 (#5057)
  Bump docker/login-action from 3.0.0 to 3.3.0 (#5109)
  Bump docker/build-push-action from 5.1.0 to 6.5.0 (#5108)
  Bump peter-evans/create-pull-request from 5.0.0 to 6.1.0 (#5093)
  Tx Payment: drop ED requirements for tx payments with exchangeable asset  (#4488)
  Remove `pallet-getter` usage from pallet-transaction-payment (#4970)
  pallet macro: do not generate try-runtime related code when frame-support doesn't have try-runtime. (#5099)
  fix(chain-spec): ChainSpecBuilder with object as default genesis (#4345)
  Migrate BEEFY BLS crypto to  bls12-381 curve (#4931)
  Bump clap from 4.5.9 to 4.5.10 in the known_good_semver group (#5120)
  Use jobserver in wasm-builder to limit concurrency of spawned cargo processes (#4946)
  include events for voting (#4613)
  [subsystem-bench] Add mocks for own assignments triggering (#5042)
  Remove not-audited warning (#5114)
  hotfix: blockchain/backend: Skip genesis leaf to unblock syncing (#5103)
  ...
Labels
R0-silent Changes should not be mentioned in any release notes