Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stable rust toolchain #2876

Merged
merged 23 commits into from
May 24, 2023
Merged

stable rust toolchain #2876

merged 23 commits into from
May 24, 2023

Conversation

aatifsyed
Copy link
Contributor

@aatifsyed aatifsyed commented May 15, 2023

Summary of changes

Changes introduced in this pull request:

  • Switch to stable rust toolchain
  • Shim nightly for rustdoc (to generate an index page for self-hosted rustdoc)
  • Drop rustfmt config (we included nightly-only features)
  • Minor code changes to fixup formatting and unknown lints
  • Drop most custom compile flags from /cargo/config.toml
  • Surgical update of dependencies that we need to update to run on stable

Reference issue to close (if applicable)

Closes

Other information and links

Change checklist

  • I have performed a self-review of my own code,
  • I have made corresponding changes to the documentation,
  • I have added tests that prove my fix is effective or that my feature works (if possible),
    • N/A
  • I have made sure the CHANGELOG is up-to-date. All user-facing changes should be reflected in this document.

@aatifsyed aatifsyed self-assigned this May 15, 2023
@aatifsyed aatifsyed changed the title mark: aatifsyed/stable-rust-toolchain stable rust toolchain May 15, 2023
@aatifsyed aatifsyed force-pushed the aatifsyed/stable-rust-toolchain branch from 687d538 to 5ba0421 Compare May 15, 2023 15:16
.cargo/config.toml Show resolved Hide resolved
.cargo/config.toml Show resolved Hide resolved
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These features have been unstable for a while, don't look likely to get stabilized soon (unless we e.g drive it...)

rust-lang/rustfmt#4991
rust-lang/rustfmt#3347
rust-lang/rustfmt#5081
rust-lang/rustfmt#5083

@aatifsyed aatifsyed force-pushed the aatifsyed/stable-rust-toolchain branch 5 times, most recently from fc0c396 to 3d1bd09 Compare May 16, 2023 11:56
.cargo/config.toml Show resolved Hide resolved
# For details checkout https://jondot.medium.com/8-steps-for-troubleshooting-your-rust-build-times-2ffc965fd13e
[target.x86_64-unknown-linux-gnu]
# clang has been listed as prerequisite in the doc
linker = "clang"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the implication of removing this? I believe it was added to reduce linking time. If we still want to remove it, the dependencies in both README and Dockerfiles (and Makefile) should be updated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we own linking time? Or is it owned by the developer?

I'd argue that we don't own any cargo config:

  • Devs will have their own linkers set up in in ~/.cargo/config.toml. E.g if they've install mold
  • Fetching with CLI, cargo target directory etc, that's very developer-owned

The gray area here is --cfg=tokio_unstable, but I think that's a minor feature, and if users want to include it, we should provide instructions.

Note if we supported a devcontainer, we would then own more of the developer experience - linking building etc, but at the moment we're a mess of both (I guess this is why I'm spending so much time here).

We don't really want to be messing with build config too much, because it ends up being pain - we're not build experts, we're rust experts.
One sample is the vectorised instruction support which we've mentioned earlier. This is another excellent example

BTW this arg does nothing on my machine, you actually need a linker if you want a speedup:

  './do-link.bash -C link-arg=-fuse-ld=mold' ran
    1.02 ± 0.11 times faster than './do-link.bash -C linker=clang -C link-arg=-fuse-ld=mold'
    1.21 ± 0.08 times faster than './do-link.bash -C linker=clang -C link-arg=-fuse-ld=lld'
    1.22 ± 0.08 times faster than './do-link.bash -C link-arg=-fuse-ld=lld'
    4.81 ± 0.32 times faster than './do-link.bash -C linker=clang'
    4.95 ± 0.42 times faster than './do-link.bash'

See benchmarking at 54fb5e8

I'm generally all for build-time tuning, and developer experience, but I think the way we handle it at the moment is a bit haphazard, and we end up dropping balls outside of the actual source code

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the developer is free to tinker with his/her own setup, we do own the CI. The smaller the build times are, the more we can offload to GH runners and not paid buildjet machines which brings real benefits of reduced costs (and more subjective ones as well - we like to have green checkmarks sooner than later).

Thanks for doing the benchmark. I'd love to see mold or lld used in the CI.

All in all, do you agree that we should remove clang from dependencies if it is no longer used anywhere by default?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I second this motion. I'd have more confidence in the correctness of Forest if we used the default configuration.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean, I overall agree that using defaults is the way to go. That being said if a couple of lines of configuration that are rarely changed bring 10-20% faster builds for thousands of workflow runs in the future - IMO it's worth having it as long as it is not really exotic piece of technology that changes a lot.

Copy link
Contributor Author

@aatifsyed aatifsyed May 17, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Impact of avx2

Methodology

I built the code with and without avx here: https://github.com/ChainSafe/forest/blob/c5b75126b84b1c5af98057187fe43a3f9a96cdff/build.bash
I ran the benchmark script here https://github.com/ChainSafe/forest/blob/c5b75126b84b1c5af98057187fe43a3f9a96cdff/benchmark.bash

  • a warmup run, followed by 20 runs, cleaning the db between each run
  • 100 tipsets for validation

I didn't have enough hard drive space to run against mainnet snapshots, so I used calibnet

Importing

Command Mean [s] Min [s] Max [s] Relative
avx2 36.508 ± 10.011 21.245 66.273 1.11 ± 0.36
no_avx2 32.762 ± 5.676 22.025 45.221 1.00

Validating

Command Mean [s] Min [s] Max [s] Relative
avx2 9.461 ± 0.126 9.149 9.739 1.00
no_avx2 9.860 ± 0.915 9.044 12.578 1.04 ± 0.10

Conclusion

There is no statistically significant difference between avx and non_avx builds.
This makes sense because at no point did I see we were CPU bound - I reckon we're io bound. Maybe if we were memory bound we'd get a speedup, but no user is going to map a ~100GB snapshot into memory, and our code is certainly not there yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have on my side:

Network: mainnet
Snapshot used: 2866560_2023_05_17T06_00_00Z.car
OS: Fedora 37
CPU: Intel Core i3-6300
Command Import Snapshot Validation of 30 heights
avx2 14m22s 2m16s
no_avx2 14m13s 2m12s

Copy link
Contributor

@elmattic elmattic May 19, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you can confirm this on a mainnet snapshot as well, my CPU is pretty old (Skylake), probably worth testing on a recent intel gen.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say we can leave those instructions away, and re-assess performance later when needed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for assessing this, LGTM.

Copy link
Member

@LesnyRumcajs LesnyRumcajs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dismissing review so it's not blocked

@aatifsyed aatifsyed force-pushed the aatifsyed/stable-rust-toolchain branch from d1f2f08 to 0e662ab Compare May 17, 2023 21:48
@aatifsyed aatifsyed marked this pull request as ready for review May 17, 2023 21:52
@aatifsyed
Copy link
Contributor Author

We've pinned to fvm 2.3.x ChainSafe/fil-actor-states#102
We need fvm 2.4.x to take this cross-compilation fix: filecoin-project/rust-fil-proofs#1679
Otherwise cross compilation fails: https://github.com/ChainSafe/forest/actions/runs/5008225935/jobs/8975856604?pr=2876#step:9:2115

We should do whatever work we need to to take the latest fvm 2.x.x release.

@aatifsyed aatifsyed force-pushed the aatifsyed/stable-rust-toolchain branch 2 times, most recently from db5a002 to 351a167 Compare May 22, 2023 15:14
@@ -1,5 +0,0 @@
# Doc: https://rust-lang.github.io/rustfmt/
imports_granularity = "Crate"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find these grouping rules pretty helpful. Any ideas on keeping the functionalities after the change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be nice, but not possible on stable - would be pretty easy to write a custom linter, but probably not a priority right now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aatifsyed aatifsyed force-pushed the aatifsyed/stable-rust-toolchain branch from 2ed15f8 to ff05aa4 Compare May 23, 2023 09:46
@aatifsyed aatifsyed enabled auto-merge (squash) May 23, 2023 09:47
@aatifsyed aatifsyed dismissed LesnyRumcajs’s stale review May 23, 2023 09:48

Hubert has said "LGTM"

Copy link
Contributor

@lemmih lemmih left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@aatifsyed aatifsyed force-pushed the aatifsyed/stable-rust-toolchain branch from 58c4274 to ec3f8c1 Compare May 24, 2023 12:46
@aatifsyed aatifsyed enabled auto-merge (squash) May 24, 2023 12:46
@aatifsyed aatifsyed disabled auto-merge May 24, 2023 12:46
@aatifsyed aatifsyed enabled auto-merge (squash) May 24, 2023 12:47
@aatifsyed aatifsyed merged commit 60366fb into main May 24, 2023
@aatifsyed aatifsyed deleted the aatifsyed/stable-rust-toolchain branch May 24, 2023 13:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Status: No status
Development

Successfully merging this pull request may close these issues.

Switch toolchain from nightly to stable (or document why nightly is required)
6 participants