Greatly speed up doctests by compiling compatible doctests in one file #126245

GuillaumeGomez · 2024-06-10T21:55:26Z

Take 2 at #123974. It should now be much simpler to review since it's based on #125798.

As discussed, this feature will only be enabled starting with the 2024 edition.

I split the changes as much as possible to make it as easy as possible to review (even though it's a huge lot of code changes...).

The following tests are not included into the combined doctests:

compile_fail
If there are crate attributes (deny/allow/warn are ok)
have invalid AST
test_harness
no capture
--show-output (because the output is nicer without the extra code surrounding it)

Everything else is merged into one file. If this one file fails to compile, we go back to the current strategy: compile each doctest separately.

Because of the edition codeblock attribute, I couldn't make them all into one file directly so I grouped them by edition, but it should be pretty anecdotic.

In case the users want a specific doctest to be opted-out from this doctest merge, I added the standalone codeblock attribute:

/// ```rust,standalone
/// // This doctest will not be part of merged doctests!
/// ```

Now the interesting part, I ran it on a few crates and here are the results (with cargo test --doc to only include doctests):

crate	nb doctests	before this PR	with this PR
sysinfo	227	4.6s	1.11s
geos	157	3.95s	0.45s
core	4604	54.08s	13.5s (merged: 0.9s, standalone: 12.6s)
std	1147	12s	3.56s (merged: 2.1s, standalone: 1.46s)

r? @camelid

try-job: x86_64-msvc
try-job: aarch64-apple

src/doc/rustdoc/src/write-documentation/documentation-tests.md

camelid

First round of review! Wondering if it'd make sense to run this on crater (with the edition-handling temporarily removed so it'd run unconditionally) to check breakage?

src/librustdoc/doctest.rs

camelid · 2024-06-11T19:04:40Z

src/librustdoc/doctest.rs

+
+    run_tests(
+        test_args,
+        nocapture,


nit: Isn't nocapture now passed as part of opts?

It's actually part of RustdocOptions, good catch!

Same goes for test_args, I love when code gets simpler like that. :3

src/librustdoc/doctest/make.rs

src/librustdoc/doctest.rs

tests/rustdoc-ui/doctest/wrong-ast-2024.rs

tests/rustdoc-ui/doctest/wrong-ast.rs

Kobzol · 2024-06-11T20:09:45Z

The comment in documentation claims that the slowest part that limits doctests is the compilation time (not the runtime to actually run all the tests). Instead of merging all the doctests into a single program, so that it runs in the same process (which can cause issues mentioned in the previous PR), could we instead perform the compilation in a smarter way, to generate N binaries, but still invoking the compiler just once, to amortize the compilation costs (something like my rustc daemon experiments)? Do you have any expectation of how much time would be saved if compilation was approx. as fast as compiling a single program, but we still invoked N binaries to run the tests?

Or, as a different alternative, compile everything into a single program, but then execute it in N processes (each process would run just a single test), to avoid the issues?

camelid · 2024-06-11T20:16:42Z

Interesting. I was actually wondering if there was some way to daemonize rustc. However IIRC one of the slowest parts is linking, which would still be an issue with the approach you suggest :(

Kobzol · 2024-06-11T20:19:12Z

It would also be very non-trivial, implementation wise, I imagine. But what about the second approach? We could compile a single binary that looks something like this:

fn test_0() {}
fn test_1() {}

fn main() {
   let arg = args()[0];
   if arg == "test_0" { test_0(); }
   if arg == "test_1" { test_1(); }
}

And the rustdoc driver would then execute this binary N times (in parallel for all threads on the given machine), and run each test in a separate process.

camelid

This doesn't seem to handle the fact that different doctests may use different #![feature)]s.

camelid · 2024-06-11T20:26:45Z

This doesn't seem to handle the fact that different doctests may use different #![feature)]s.

I think this would unfortunately break the second approach too @Kobzol

Kobzol · 2024-06-11T20:29:22Z

I think this would unfortunately break the second approach too @Kobzol

Sure, there are possibly additional issues with the "add everything to a single binary" approach, but as far as we can get that working somehow, it seems to me that moving from "run all tests within a single process" to "run each test in a separate process" should be relatively straightforward, and should resolve the issues with shared state between tests.

camelid · 2024-06-11T21:10:05Z

Agreed. So sounds like the idea would be:

merge doc tests into groups based on each one's combination of edition, no_std/no_core, enabled feature gates, etc.
create unified test for each merged group where only one test is run per execution, dependent on the args provided by rustdoc to the binary
run each instance (i.e. provided args) of each merged test in parallel

This should obviate the need for doing this over an edition boundary since it shouldn't break any code. It would also automatically work with code that configures global state, without needing to know a standalone attribute. We'd also skip introducing this attribute to rustdoc since it wouldn't be necessary.

src/librustdoc/doctest.rs

Scripter17 · 2024-06-11T23:51:40Z

However IIRC one of the slowest parts is linking

Would nightly's new linker (LLD, I think) be a relevant thing to look through for insight?

camelid · 2024-06-12T00:49:37Z

I'm not sure. It'd be worth testing. We should also look into turning off debuginfo for doctests since it's unused.

Kobzol · 2024-06-12T06:10:15Z

Would nightly's new linker (LLD, I think) be a relevant thing to look through for insight?

LLD should help slightly, but it's not a panacea.

I tried LLD for doctesting core, and BFD (the default Linux linker) ran for ~42s, LLD ran for 40s. It seemed that most (maybe 90%) of the time was spent running the tests though, not compiling them (assuming that compilation and running isn't overlapped).

That would also sadly suggest that the overhead from spawning the test processes (even on Linux) is the dominant factor here, which means that my proposed approach wouldn't help that much.

GuillaumeGomez · 2024-06-12T12:46:23Z

Agreed. So sounds like the idea would be:
* merge doc tests into groups based on each one's combination of edition, no_std/no_core, enabled feature gates, etc.

* create unified test for each merged group where only one test is run per execution, dependent on the args provided by rustdoc to the binary

* run each instance (i.e. provided args) of each merged test in parallel
This should obviate the need for doing this over an edition boundary since it shouldn't break any code. It would also automatically work with code that configures global state, without needing to know a standalone attribute. We'd also skip introducing this attribute to rustdoc since it wouldn't be necessary.

I agree but with some small changes: no_std tests should be on their own, however doctests with features should likely not be part of the merged doctests (we can always change that later on) and therefore no_core as well since it requires a feature to be enabled.

However I"m curious about:

* create unified test for each merged group where only one test is run per execution, dependent on the args provided by rustdoc to the binary

* run each instance (i.e. provided args) of each merged test in parallel

I generate merged doctests like rustc does with #[test], so I'm not sure it'd be worth to spawn a process for each test.

Kobzol · 2024-06-12T12:48:53Z

I generate merged doctests like rustc does with #[test], so I'm not sure it'd be worth to spawn a process for each test.

This approach would resolve the problems with sharing state between tests. But it's quite possible that it would also make the perf. gains much smaller.

GuillaumeGomez · 2024-06-12T12:55:38Z

Forking on linux has a huge overhead, so the test code itself would likely take less time than spawning it. I'm really not convinced by this approach. Let me check locally the diff.

Kobzol · 2024-06-12T13:01:25Z

If you have a recent-ish glibc and kernel, forking is relatively fast (you could spawn thousands of doctests per second, if needed). Actually I'm pretty sure that forking on Linux would be much faster than on macOS and Windows, so I would worry about these more 😆

But based on some experiments that I have posted above, it seems to me that for crates that have a lot of doc-tests (e.g. core), the primary bottleneck is not compilation, but rather execution of the doctests. If that is indeed the case, then the process spawning approach will still probably be much slower than what you had here (>2x speedup on core) :(

But maybe it's only because the rustc test machinery has a lot of costs when starting. I wonder what would happen if we just literally did this, or, as an alternative, set up everything with libtest, and then fork the process just before executing each test.

GuillaumeGomez · 2024-06-12T13:07:20Z

True, forking on linux should be pretty fast nowadays. ^^'

But maybe it's only because the rustc test machinery has a lot of costs when starting. I wonder what would happen if we just literally did this, or, as an alternative, set up everything with libtest, and then fork the process just before executing each test.

Hum... I'm definitely not convinced. Especially since you mentioned Windows and MacOS being slow to fork. I can provide a big source code for what rustdoc generates for core later once I'm done with what I'm doing and you can experiment with forking?

bors · 2024-08-14T00:50:44Z

☀️ Test successful - checks-actions
Approved by: t-rustdoc
Pushing e5b3e68 to master...

rust-timer · 2024-08-14T02:22:11Z

Finished benchmarking commit (e5b3e68): comparison URL.

Overall result: no relevant changes - no action needed

@rustbot label: -perf-regression

Instruction count

This benchmark run did not return any relevant results for this metric.

Max RSS (memory usage)

Results (primary -1.1%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	-	-	0
Improvements ✅ (primary)	-1.1%	[-1.1%, -1.1%]	1
Improvements ✅ (secondary)	-	-	0
All ❌✅ (primary)	-1.1%	[-1.1%, -1.1%]	1

Cycles

Results (secondary 0.2%)

This is a less reliable metric that may be of interest but was not used to determine the overall result at the top of this comment.

	mean	range	count
Regressions ❌ (primary)	-	-	0
Regressions ❌ (secondary)	2.8%	[2.8%, 2.8%]	1
Improvements ✅ (primary)	-	-	0
Improvements ✅ (secondary)	-2.5%	[-2.5%, -2.5%]	1
All ❌✅ (primary)	-	-	0

Binary size

This benchmark run did not return any relevant results for this metric.

Bootstrap: 752.976s -> 752.981s (0.00%)
Artifact size: 341.40 MiB -> 341.50 MiB (0.03%)

jyn514 · 2024-08-14T03:33:41Z

this is really cool. nice work all.

Kobzol · 2024-08-14T08:10:19Z

Awesome work! 🚀 This should probably receive relnotes or maybe relnotes-perf (not sure if that one is used a lot).

BurntSushi · 2024-08-15T14:47:17Z

I got an absolutely wild speed-up in Jiff. Before:

$ time cargo t --doc
[.. snip ..]
test result: ok. 903 passed; 0 failed; 2 ignored; 0 measured; 0 filtered out; finished in 12.56s


real    12.783
user    3:14.02
sys     1:25.99
maxmem  162 MB
faults  1425

After:

$ time cargo t --doc
[.. snip ..]
test result: ok. 903 passed; 0 failed; 2 ignored; 0 measured; 0 filtered out; finished in 0.25s


real    2.524
user    4.943
sys     2.355
maxmem  458 MB
faults  23

GuillaumeGomez · 2024-08-15T14:50:17Z

Going down from 4min39 to 7 secs is quite wild indeed. We did a good job here. :)

kpreid · 2024-08-15T17:19:35Z

Note that this feature only takes effect if you set package.edition = "2024" (documented here). I mention this for the sake of the next person who, like me, wants to try it out and see the improvement, but didn’t recall that part of the design discussion. Just running cargo +nightly test --doc doesn’t suffice.

ehuss · 2024-08-20T20:28:30Z

Awesome work! 🚀 This should probably receive relnotes or maybe relnotes-perf (not sure if that one is used a lot).

Can you say more about why this should be relnotes? It's currently unstable (under the edition).

Kobzol · 2024-08-20T20:29:53Z

...because I forgot about that :) Sorry. It should definitely be mentioned in the 2024 Edition relnotes, as I imagine it will be :)

Crater for rustdoc test merging Crater run for rust-lang#126245

rustbot assigned camelid Jun 10, 2024

rustbot added A-run-make Area: port run-make Makefiles to rmake.rs S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. labels Jun 10, 2024

This comment was marked as resolved.

Sign in to view

GuillaumeGomez mentioned this pull request Jun 10, 2024

Greatly speed up doctests by compiling compatible doctests in one file #123974

Closed

the8472 reviewed Jun 10, 2024

View reviewed changes

src/doc/rustdoc/src/write-documentation/documentation-tests.md Outdated Show resolved Hide resolved

camelid added A-doctests Area: Documentation tests, run by rustdoc A-edition-2024 Area: The 2024 edition needs-fcp This change is insta-stable, so needs a completed FCP to proceed. labels Jun 11, 2024

camelid requested changes Jun 11, 2024

View reviewed changes

rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 11, 2024

camelid requested changes Jun 11, 2024

View reviewed changes

traviscross mentioned this pull request Jun 11, 2024

Tracking Issue for Rust 2024: Rustdoc tests in one binary #124853

Closed

3 tasks

notriddle reviewed Jun 11, 2024

View reviewed changes

src/librustdoc/doctest.rs Outdated Show resolved Hide resolved

bors added the merged-by-bors This PR was explicitly merged by bors. label Aug 14, 2024

bors merged commit e5b3e68 into rust-lang:master Aug 14, 2024
7 checks passed

rustbot added this to the 1.82.0 milestone Aug 14, 2024

GuillaumeGomez deleted the doctest-improvement-v2 branch August 14, 2024 08:21

jieyouxu added relnotes Marks issues that should be documented in the release notes of the next release. relnotes-perf Performance improvements that should be mentioned in the release notes. labels Aug 14, 2024

SkiFire13 mentioned this pull request Aug 14, 2024

bevy_gizmos doc tests hang compute when run locally bevyengine/bevy#12207

Open

epage mentioned this pull request Aug 14, 2024

rustdoc standalone doctest attribute is confusing, taking up a potentially useful name #129098

Closed

arifd mentioned this pull request Aug 14, 2024

Help wanted: ideas for improving the test run time jmcnamara/rust_xlsxwriter#37

Closed

ilyagr mentioned this pull request Aug 15, 2024

cargo insta test cli: add --disable-nextest-doctest option mitsuhiko/insta#438

Open

Voultapher mentioned this pull request Aug 15, 2024

Run merged doc-tests by default in the same process and add opt-out attribute #129119

Open

apiraino removed the to-announce Announce this issue on triage meeting label Aug 15, 2024

GuillaumeGomez mentioned this pull request Aug 19, 2024

Doctests don't work in bin targets, non-public items #50784

Open

ehuss added relnotes Marks issues that should be documented in the release notes of the next release. and removed relnotes Marks issues that should be documented in the release notes of the next release. labels Aug 20, 2024

ehuss removed relnotes Marks issues that should be documented in the release notes of the next release. relnotes-perf Performance improvements that should be mentioned in the release notes. labels Aug 20, 2024

Folyd mentioned this pull request Aug 21, 2024

Separate core search logic with search ui #126183

Merged

ehuss mentioned this pull request Sep 12, 2024

Crater for rustdoc test merging #130285

Closed

bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 12, 2024

Auto merge of rust-lang#130285 - ehuss:crater-rustdoc-process, r=<try>

024495a

Crater for rustdoc test merging Crater run for rust-lang#126245

GuillaumeGomez mentioned this pull request Oct 1, 2024

merged doctests: regressions due to examining process arguments #130796

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Greatly speed up doctests by compiling compatible doctests in one file #126245

Greatly speed up doctests by compiling compatible doctests in one file #126245

GuillaumeGomez commented Jun 10, 2024 •

edited

Loading

This comment was marked as resolved.

camelid left a comment

camelid Jun 11, 2024

GuillaumeGomez Jun 20, 2024

GuillaumeGomez Jun 20, 2024

Kobzol commented Jun 11, 2024 •

edited

Loading

camelid commented Jun 11, 2024

Kobzol commented Jun 11, 2024

camelid left a comment

camelid commented Jun 11, 2024

Kobzol commented Jun 11, 2024

camelid commented Jun 11, 2024

Scripter17 commented Jun 11, 2024 •

edited

Loading

camelid commented Jun 12, 2024

Kobzol commented Jun 12, 2024 •

edited

Loading

GuillaumeGomez commented Jun 12, 2024

Kobzol commented Jun 12, 2024

GuillaumeGomez commented Jun 12, 2024

Kobzol commented Jun 12, 2024

GuillaumeGomez commented Jun 12, 2024

bors commented Aug 14, 2024

rust-timer commented Aug 14, 2024

jyn514 commented Aug 14, 2024

Kobzol commented Aug 14, 2024

BurntSushi commented Aug 15, 2024

GuillaumeGomez commented Aug 15, 2024 •

edited

Loading

kpreid commented Aug 15, 2024 •

edited

Loading

ehuss commented Aug 20, 2024

Kobzol commented Aug 20, 2024

Greatly speed up doctests by compiling compatible doctests in one file #126245

Greatly speed up doctests by compiling compatible doctests in one file #126245

Conversation

GuillaumeGomez commented Jun 10, 2024 • edited Loading

This comment was marked as resolved.

camelid left a comment

Choose a reason for hiding this comment

camelid Jun 11, 2024

Choose a reason for hiding this comment

GuillaumeGomez Jun 20, 2024

Choose a reason for hiding this comment

GuillaumeGomez Jun 20, 2024

Choose a reason for hiding this comment

Kobzol commented Jun 11, 2024 • edited Loading

camelid commented Jun 11, 2024

Kobzol commented Jun 11, 2024

camelid left a comment

Choose a reason for hiding this comment

camelid commented Jun 11, 2024

Kobzol commented Jun 11, 2024

camelid commented Jun 11, 2024

Scripter17 commented Jun 11, 2024 • edited Loading

camelid commented Jun 12, 2024

Kobzol commented Jun 12, 2024 • edited Loading

GuillaumeGomez commented Jun 12, 2024

Kobzol commented Jun 12, 2024

GuillaumeGomez commented Jun 12, 2024

Kobzol commented Jun 12, 2024

GuillaumeGomez commented Jun 12, 2024

bors commented Aug 14, 2024

rust-timer commented Aug 14, 2024

Overall result: no relevant changes - no action needed

Instruction count

Max RSS (memory usage)

Cycles

Binary size

jyn514 commented Aug 14, 2024

Kobzol commented Aug 14, 2024

BurntSushi commented Aug 15, 2024

GuillaumeGomez commented Aug 15, 2024 • edited Loading

kpreid commented Aug 15, 2024 • edited Loading

ehuss commented Aug 20, 2024

Kobzol commented Aug 20, 2024

GuillaumeGomez commented Jun 10, 2024 •

edited

Loading

Kobzol commented Jun 11, 2024 •

edited

Loading

Scripter17 commented Jun 11, 2024 •

edited

Loading

Kobzol commented Jun 12, 2024 •

edited

Loading

GuillaumeGomez commented Aug 15, 2024 •

edited

Loading

kpreid commented Aug 15, 2024 •

edited

Loading