Possible nightly rustdoc doc generation time / size regression #134435

Closed
piaoyh opened this issue Dec 17, 2024 · 27 comments
Labels
C-bug Category: This is a bug. E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue.

Comments

@piaoyh

piaoyh commented Dec 17, 2024

I published my crate to crates.io, but docs.rs does not generate its documentation properly. My crate is cryptocol, and its documentation is at https://docs.rs/cryptocol/0.8.5/cryptocol/. However, the page https://docs.rs/cryptocol/0.8.5/cryptocol/number/big_uint/struct.BigUInt.html#struct.BigUInt fails to be generated. In rust-lang/docs.rs#2676 I was advised to report the issue here, because the cause is thought to be that cargo +nightly docs does not work properly.

I tried this code:

cargo +nightly docs

I expected to see this happen: All the webpages are generated properly. In particular, /home/youngho/shared/rust-target/doc/cryptocol/number/big_uint/struct.BigUInt.html should be generated. So, in an offline test, running cargo +nightly docs should generate all the webpages just as cargo doc does.

Instead, this happened: In an offline test, cargo +nightly docs keeps running and never finishes. It looks like cargo falls into an infinite loop. However, cargo doc generates all the webpages fine.

Meta

rustc --version --verbose:

cargo 1.83.0 (5ffbef321 2024-10-29)
rustc 1.83.0 (90b35a623 2024-11-26)
rustdoc 1.83.0 (90b35a623 2024-11-26)

$ rustc --version --verbose
rustc 1.83.0 (90b35a623 2024-11-26)
binary: rustc
commit-hash: 90b35a6239c3d8bdabc530a6a0816f7ff89a0aaf
commit-date: 2024-11-26
host: x86_64-unknown-linux-gnu
release: 1.83.0
LLVM version: 19.1.1

I find it strange that it takes so unreasonably long and generates such an unreasonably large file. (Somebody said that if you wait for an hour, it generates a 60 MB struct.BigUInt.html, but I haven't waited that long.) BigUInt.rs in version 0.8.4 is 2.9 MB, while BigUInt.rs in version 0.8.5 is 4.0 MB. That 1.1 MB change in size makes a huge difference. There must be a size threshold beyond which cargo cannot handle a file.


@piaoyh piaoyh added the C-bug Category: This is a bug. label Dec 17, 2024
@rustbot rustbot added the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Dec 17, 2024
@jieyouxu jieyouxu added the T-rustdoc Relevant to the rustdoc team, which will review and decide on the PR/issue. label Dec 17, 2024
@jieyouxu jieyouxu changed the title docs.rs cannot generate documentation properly. Possible nightly rustdoc doc generation time / size regression Dec 17, 2024
@jieyouxu jieyouxu added E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example and removed needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Dec 17, 2024
@lqd lqd added the E-needs-bisection Call for participation: This issue needs bisection: https://github.com/rust-lang/cargo-bisect-rustc label Dec 17, 2024
@workingjubilee
Member

@piaoyh does this pathological behavior occur if you remove everything from the crate except BigUInt?

@piaoyh
Author

piaoyh commented Dec 18, 2024

@workingjubilee Sorry for the late reply; there is a time difference, and I have to work during the day.

If I remove everything from my crate except BigUInt and run cargo +nightly docs, it reports a lot of errors because BigUInt depends on other structs in the number module. So I couldn't remove everything from the crate except BigUInt.

Instead, I removed all modules from my crate except the number module and ran cargo +nightly docs. The result was the same: it seemed to fall into an infinite loop. However, cargo doc works fine.

@piaoyh
Author

piaoyh commented Dec 22, 2024

I don't know whether the following helps or not.
I found three things:
When I run cargo doc, it works fine.
However, when I run cargo docs, it seems to fall into an infinite loop.
And when I run cargo +nightly docs, it also seems to fall into an infinite loop.

So it seems that cargo works fine with the argument doc, but does not work with the argument docs, with or without +nightly.

When I test documentation, I have always used only cargo doc, because I didn't know that the argument docs existed. In fact, when I run cargo, its help does not mention docs.

When I was asked to run cargo +nightly docs, I thought the argument docs existed only for +nightly, the nightly version. Then I suspected that docs might exist for the stable version as well, with cargo hiding it from the help output for some unknown reason. I still don't know what the argument docs means or is for. Maybe it is only for cargo developers to test with, and that is why it is hidden from users in cargo help.

$ cargo
Rust's package manager

Usage: cargo [+toolchain] [OPTIONS] [COMMAND]
       cargo [+toolchain] [OPTIONS] -Zscript <MANIFEST_RS> [ARGS]...

Options:
  -V, --version                  Print version info and exit
      --list                     List installed commands
      --explain <CODE>           Provide a detailed explanation of a rustc error message
  -v, --verbose...               Use verbose output (-vv very verbose/build.rs output)
  -q, --quiet                    Do not print cargo log messages
      --color <WHEN>             Coloring: auto, always, never
  -C <DIRECTORY>                 Change to DIRECTORY before doing anything (nightly-only)
      --locked                   Assert that Cargo.lock will remain unchanged
      --offline                  Run without accessing the network
      --frozen                   Equivalent to specifying both --locked and --offline
      --config <KEY=VALUE|PATH>  Override a configuration value
  -Z <FLAG>                      Unstable (nightly-only) flags to Cargo, see 'cargo -Z help' for
                                 details
  -h, --help                     Print help

Commands:
    build, b    Compile the current package
    check, c    Analyze the current package and report errors, but don't build object files
    clean       Remove the target directory
    doc, d      Build this package's and its dependencies' documentation
    new         Create a new cargo package
    init        Create a new cargo package in an existing directory
    add         Add dependencies to a manifest file
    remove      Remove dependencies from a manifest file
    run, r      Run a binary or example of the local package
    test, t     Run the tests
    bench       Run the benchmarks
    update      Update dependencies listed in Cargo.lock
    search      Search registry for crates
    publish     Package and upload this package to the registry
    install     Install a Rust binary
    uninstall   Uninstall a Rust binary
    ...         See all commands with --list

See 'cargo help <command>' for more information on a specific command.

@syphar
Member

syphar commented Dec 22, 2024

To clarify the comment above for the people debugging this: I assume @piaoyh is not talking about cargo docs as a command, but about cargo docs-rs, which comes from this package that emulates how docs.rs builds the docs.

@piaoyh
Author

piaoyh commented Dec 24, 2024

No, I was talking about cargo docs, NOT cargo docs-rs.
I tested with the command cargo docs.
Actually, I don't know exactly what the difference between cargo docs and cargo docs-rs is. I just reported what I got from running cargo docs, thinking it would help you.

I tested more.

  • cargo docs-rs gives the following error message:
    error: the -Z flag is only accepted on the nightly channel of Cargo, but this is the stable channel
    See https://doc.rust-lang.org/book/appendix-07-nightly-rust.html for more information about Rust release channels.
  • cargo +nightly docs-rs seems to work fine. It did not fall into the infinite loop. It generates struct.BigUInt.html successfully. Its size is 4.6 MB.

Merry Christmas to all.

@piaoyh
Author

piaoyh commented Jan 17, 2025

I dramatically reduced the code of my crate for testing, but cargo +nightly docs still ran endlessly.

What am I supposed to test to find out why docs.rs does not show the documentation of my crate cryptocol correctly? cargo +nightly docs-rs or cargo +nightly docs?

cargo docs-rs gives me an error.
cargo +nightly docs-rs has worked fine, even before I reduced my code for testing.
cargo +nightly docs ran endlessly.

@piaoyh
Author

piaoyh commented Jan 20, 2025

I thought that cargo +nightly docs was related to publishing crates on crates.io and docs.rs. But while carefully rereading the comments above, I found syphar's comment. As syphar said, cargo docs-rs comes from this package that emulates how docs.rs builds the docs.

Then does that mean cargo +nightly docs-rs, rather than cargo +nightly docs, is related to how docs.rs builds the docs?

If so, my question is why the docs.rs server fails to build struct.BigUInt.html even though cargo +nightly docs-rs works fine and generates struct.BigUInt.html on my local computer. The generated struct.BigUInt.html is 8.5 MB on my local computer. The source file big_uint.rs has 58046 lines and is 2.7 MB.

What is the size limit for each source file when publishing a crate? Is my source code too big, or does the cargo +nightly docs-rs that runs on the docs.rs server have a bug?

Can anyone answer me?

@piaoyh
Author

piaoyh commented Jan 21, 2025

I think I found the reason!

There must be a limit on the number of methods that can be implemented for a struct. SmallUInt is more than 50,000 lines, but its documentation was generated successfully on docs.rs, while BigUInt has fewer than 30,000 lines (I moved a lot of the documentation to other files), but its documentation was not generated on docs.rs. I found that BigUInt has 815 public methods while SmallUInt has 117 methods.

I think there must be a limit on the number of functions.
Can it be solved?

@syphar
Member

syphar commented Jan 21, 2025

cc @GuillaumeGomez what could we do to debug this?

@piaoyh
Author

piaoyh commented Jan 21, 2025

Is it related to cargo +nightly docs-rs or cargo +nightly docs?

What exactly does the docs.rs server use to generate documentation?

Can you let me know?

Most people here are probably too busy to dig into my case.
I think a regression test would be helpful.

If the necessary information is provided to me, I'll try to debug, though I am also busy with my job.

@syphar
Member

syphar commented Jan 21, 2025

Just some remarks from my side:

> If so, my question is why the docs.rs server fails to build struct.BigUInt.html even though cargo +nightly docs-rs works fine and generates struct.BigUInt.html on my local computer.

We can build it; it's just too big to serve.

> The generated struct.BigUInt.html is 8.5 MB on my local computer. The source file big_uint.rs has 58046 lines and is 2.7 MB.

I think a single multi-megabyte HTML page is not a good idea for usability, especially on slower network connections.

Unrelated to this bug, which seemed to be an endless loop, I would recommend changing your design so you have smaller pages.

@GuillaumeGomez
Member

> cc @GuillaumeGomez what could we do to debug this?

Running with tracing enabled I suppose?

@piaoyh
Author

piaoyh commented Jan 21, 2025

> I think I found the reason!
>
> There must be a limit on the number of methods that can be implemented for a struct. SmallUInt is more than 50,000 lines, but its documentation was generated successfully on docs.rs, while BigUInt has fewer than 30,000 lines (I moved a lot of the documentation to other files), but its documentation was not generated on docs.rs. I found that BigUInt has 815 public methods while SmallUInt has 117 methods.
>
> I think there must be a limit on the number of functions. Can it be solved?

I miscounted the number of methods.
BigUInt has 408 methods, not 815. The editor also counted the pub fn strings in comments, so the count was roughly doubled.
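
The miscount described here comes from a plain text search that also matches `pub fn` inside doc comments. As a rough sketch only (the function name `count_pub_fns` is hypothetical, and it assumes one declaration per line and that only `//`-style comment lines need to be skipped), a counter that ignores comment lines could look like this:

```rust
/// Counts lines that declare a public function, skipping `//`, `///` and `//!` comment lines.
/// Simplifications (assumptions of this sketch): block comments, string literals,
/// macros and `pub(crate)` items are not handled.
fn count_pub_fns(source: &str) -> usize {
    source
        .lines()
        .map(str::trim_start)
        // Drop comment lines so that `pub fn` inside doc examples is not counted.
        .filter(|line| !line.starts_with("//"))
        // Accept `pub fn`, `pub const fn` and `pub unsafe fn` declarations.
        .filter(|line| match line.strip_prefix("pub ") {
            Some(rest) => {
                rest.starts_with("fn ")
                    || rest.starts_with("const fn ")
                    || rest.starts_with("unsafe fn ")
            }
            None => false,
        })
        .count()
}
```

Calling it on the text of a file read with `std::fs::read_to_string` gives a count that a naive editor search for "pub fn" would overstate whenever the doc comments contain example code.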

@piaoyh
Author

piaoyh commented Jan 21, 2025

cc @syphar Thanks for Denis's advice.

cc @GuillaumeGomez My guess is as follows.
If rustdoc keeps the number of functions of a struct in a variable of type u8, or something equivalent, while generating the html files, it could show unexpected behavior when there are more than 255 functions.

I guess this because:

  • SmallUInt has 117 functions, which is less than 256, and its html file was generated fine on the docs.rs server, even though small_uint.rs has 51892 lines (2.1 MB) and is bigger than big_uint.rs in both line count and size.
  • BigUInt has 408 functions, which is greater than 256, and its html file failed to be generated on the docs.rs server, even though big_uint.rs has 27961 lines (1.3 MB) and is smaller than small_uint.rs in both line count and size.

So it seems that the bug is related more to the number of functions in the source file than to its size or line count.

In the post above, I asked the following, but nobody answered:

  1. Is it related to cargo +nightly docs-rs or cargo +nightly docs?
  2. What exactly does the docs.rs server use to generate documentation?

Could anyone who knows the answer please reply?
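
To make the u8 guess concrete, the sketch below shows what would happen to the two method counts if they were stored in a `u8`. It only illustrates the hypothesis; it is not taken from rustdoc, and `fits_in_u8` and `wrapped_count` are hypothetical names.

```rust
/// Returns the count as a `u8` if it fits (0..=255), or `None` if it would overflow.
fn fits_in_u8(method_count: usize) -> Option<u8> {
    if method_count <= u8::MAX as usize {
        Some(method_count as u8)
    } else {
        None
    }
}

/// What a silently wrapping `u8` counter would end up holding for the given count.
fn wrapped_count(method_count: usize) -> u8 {
    (method_count % 256) as u8
}
```

Under this assumption, SmallUInt's 117 methods fit, while BigUInt's 408 methods do not and would wrap around to 152.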

@piaoyh
Author

piaoyh commented Jan 23, 2025

cc @GuillaumeGomez I think I found the threshold that separates the bug zone from the normal zone. Here, the bug zone means that the docs.rs server fails to generate the struct.BigUInt.html page, and the normal zone means that the docs.rs server generates it successfully.

It is related to the number of methods. If BigUInt has 407 or more methods, the docs.rs server fails to generate struct.BigUInt.html, but if BigUInt has 406 or fewer methods, the docs.rs server generates struct.BigUInt.html successfully.

You can easily check it.
On crates.io, search for cryptocol and choose version 0.9.0-server-test.14; you will see that the docs.rs server generated struct.BigUInt.html successfully. This version has 406 methods in BigUInt.
However, if you choose version 0.9.0-server-test.15, you will see that the docs.rs server failed to generate struct.BigUInt.html. This version has 407 methods in BigUInt.

Many blessings to you!
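
A threshold like this can be located with a binary search rather than by stepping through counts one at a time. The following is a minimal sketch, assuming the failure depends only on the method count and is monotonic (every count at or below the threshold builds, every count above it fails). The name `largest_passing` and the `builds_ok` predicate are hypothetical; in practice each probe would be one published test version.

```rust
/// Returns the largest method count for which `builds_ok` holds.
/// Assumes `builds_ok(lo)` is true, `builds_ok(hi)` is false, and the
/// predicate never flips back to true for larger counts.
fn largest_passing<F: Fn(usize) -> bool>(mut lo: usize, mut hi: usize, builds_ok: F) -> usize {
    assert!(lo < hi, "lo must be a passing count below the failing count hi");
    // Invariant: `lo` passes and `hi` fails; stop when they are adjacent.
    while hi - lo > 1 {
        let mid = lo + (hi - lo) / 2;
        if builds_ok(mid) {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    lo
}
```

For example, starting from a known good count of 117 and a known bad count of 408, a predicate equivalent to `|n| n <= 406` yields 406. If the real cause is one specific method or the total page size rather than the count, the monotonic assumption does not hold and the result would be misleading.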

@syphar
Member

syphar commented Jan 24, 2025

Thanks for this further research!

While there is definitely a bug somewhere in rustdoc, I still recommend that you find a way to reduce the size of the page. Even your server-test.14 BigUInt page with 406 methods is nearly 10 MB in size, which is very impractical for anyone with a slower or mobile network connection.

That being said, while the method count could be the issue, it could also be the 407th method specifically that breaks doc generation here. Or the size of the docs, or the number of examples, or many other things. Also, this is using the relatively new rustdoc-scrape-examples feature; I'm not sure if that's the reason for the breakage.

This needs someone with more rustdoc knowledge than me to dig into.

Until then, I recommend also trying to find ways to reduce the size of the page locally.

@piaoyh
Author

piaoyh commented Jan 24, 2025

Yes, I reduced it a bit so that document generation does not fail.

Maybe I need to reduce it more, but that is a bit tricky. BigUInt is a struct for calculating with (addition, subtraction, multiplication, division, power, finding prime numbers, etc.) big unsigned integers such as 1024-bit, 2048-bit, or longer integers. So using it should be as easy and simple as calculating with normal integers such as u64, u128, i32, etc. But if I implement that functionality through traits and their implementations, scattering the necessary methods outside the struct BigUInt to reduce the number of functions, using it will become more complicated for users. That is my dilemma.

By the way, how can I cc somebody in this kind of markdown post? Can you let me know? One day GuillaumeGomez asked me to cc him, but I couldn't because I didn't know how.

Thanks

@GuillaumeGomez
Member

Can you provide the link to the repository so I can check what's wrong?

@syphar
Member

syphar commented Jan 29, 2025

> Can you provide the link to the repository so I can check what's wrong?

In Cargo.toml.orig I found: https://github.com/piaoyh/cryptocol (commented out)

@GuillaumeGomez
Member

I'll wait for them to confirm:

  1. It's the actual repository
  2. They're ok with us building it (otherwise why would they comment it out?)

@piaoyh
Author

piaoyh commented Jan 29, 2025

Thank you for digging into this.

> 1. It's the actual repository

Yes, it is the actual repository.

> 2. They're ok with us building it (otherwise why would they comment it out?)

I commented out some functions and published the crate to find out how many functions in BigUInt cause the server to fail to generate struct.BigUInt.html. I did this fifteen times, changing the number of functions in BigUInt each time. I found that with 407 methods in BigUInt the server fails to generate struct.BigUInt.html, while with 406 methods it succeeds. You can see this at https://crates.io/crates/cryptocol/versions. So I reduced the number of methods in BigUInt so that the server can generate struct.BigUInt.html successfully.

Thanks

@GuillaumeGomez
Member

Do you have a branch with the bug I can use?

@piaoyh
Author

piaoyh commented Feb 2, 2025

Sorry, I was out of town. That's why I missed your question.
I didn't make any branch. Sorry for that.
But I have temporarily made this project public on GitHub for you.
You can go back to older versions.
https://github.com/piaoyh/cryptocol

Also, I found that the rustdoc used on docs.rs has changed again. It failed again to generate struct.BigUInt.html.

@GuillaumeGomez
Member

Tested locally, it builds fine. However, the rendered page is 8.1 MB. I ran an analyzer on the HTML code, which gave me this:

[Image]

So basically, too many code examples. There isn't much that can be done to improve the output on our end here. Closing then.

@piaoyh
Author

piaoyh commented Feb 8, 2025

Thank you for digging into it.
On my local computer, too, cargo +nightly docs-rs works and has never failed to generate struct.BigUInt.html. I tested it with cargo 1.84.1 (66221abde 2024-11-19) and rustdoc 1.84.1 (e71f9a9a9 2025-01-27).
So I wonder why docs.rs fails to generate struct.BigUInt.html even though cargo docs-rs works fine on both my and your local computers.
What is the difference between docs.rs and your local machine?

@syphar
Member

syphar commented Feb 8, 2025

Something I already said, but here for clarification:

  • cargo docs-rs tries to emulate the build on docs.rs, but doesn't do it fully. I did a short check and discovered a missing parameter, which I'll create a PR for.
  • Just running the current cargo docs-rs ends up with an 8.4 MB struct.BigUInt.html, which is far too big for good usability in the browser in any case. So this should be optimized so that normal users on the internet can use it. You probably have to reduce the docs or methods on this struct.

You can manually add the missing argument to your Cargo.toml with these lines:

[package.metadata.docs.rs]
cargo-args = ["-Zunstable-options", "-Zrustdoc-scrape-examples"]

Running cargo docs-rs after this will give you something similar to (but not the same as) the docs.rs build output.

Running your build with the fixed cargo docs-rs or the updated Cargo.toml ends up with a 142 MB struct.BigUInt.html, which is 100% not manageable by any browser. So the recommendation stays the same: reduce the number of struct methods, the length of the docs, or, following from this, the number of examples.
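
To keep an eye on page size while trimming docs locally, the size of a generated file can be checked after each build. This is a small sketch using only the standard library; the helper names and the budget value are hypothetical, and the path to struct.BigUInt.html depends on the local target directory.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Returns the size of the file at `path` in megabytes (1 MB = 1,000,000 bytes).
fn file_size_mb(path: &Path) -> io::Result<f64> {
    Ok(fs::metadata(path)?.len() as f64 / 1_000_000.0)
}

/// Returns true if the file is larger than `budget_mb` megabytes.
/// The budget is a value chosen by the caller, not a documented docs.rs limit.
fn exceeds_budget(path: &Path, budget_mb: f64) -> io::Result<bool> {
    Ok(file_size_mb(path)? > budget_mb)
}
```

Running `exceeds_budget` on the generated struct.BigUInt.html with a budget of a few megabytes would flag both the 8.4 MB and the 142 MB outputs mentioned in this thread.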

@dtolnay
Member

dtolnay commented Feb 8, 2025

Adding this to the bottom of your Cargo.toml will make docs.rs generate the 8MB output instead of the 142MB output, which eliminates the failure without needing to shrink the number of methods on BigUInt.

[[example]]
name = "biguint_examples"
doc-scrape-examples = false

[[example]]
name = "des_examples"
doc-scrape-examples = false

[[example]]
name = "hash_app"
doc-scrape-examples = false

[[example]]
name = "hash_examples"
doc-scrape-examples = false

[[example]]
name = "md4_app"
doc-scrape-examples = false

[[example]]
name = "md5_app"
doc-scrape-examples = false

[[example]]
name = "performance_test_biguint"
doc-scrape-examples = false

[[example]]
name = "random_examples"
doc-scrape-examples = false

[[example]]
name = "sha1_app"
doc-scrape-examples = false

[[example]]
name = "sha2_256_app"
doc-scrape-examples = false

[[example]]
name = "sha2_512_224_app"
doc-scrape-examples = false

[[example]]
name = "sha2_512_app"
doc-scrape-examples = false

[[example]]
name = "small_uint_examples"
doc-scrape-examples = false

[[example]]
name = "unions_examples"
doc-scrape-examples = false

It is very verbose. Alternatively you can add this to the top of Cargo.toml (above [package]) if you prefer.

example = [
    { name = "biguint_examples", doc-scrape-examples = false },
    { name = "des_examples", doc-scrape-examples = false },
    { name = "hash_app", doc-scrape-examples = false },
    { name = "hash_examples", doc-scrape-examples = false },
    { name = "md4_app", doc-scrape-examples = false },
    { name = "md5_app", doc-scrape-examples = false },
    { name = "performance_test_biguint", doc-scrape-examples = false },
    { name = "random_examples", doc-scrape-examples = false },
    { name = "sha1_app", doc-scrape-examples = false },
    { name = "sha2_256_app", doc-scrape-examples = false },
    { name = "sha2_512_224_app", doc-scrape-examples = false },
    { name = "sha2_512_app", doc-scrape-examples = false },
    { name = "small_uint_examples", doc-scrape-examples = false },
    { name = "unions_examples", doc-scrape-examples = false },
]
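
Because the verbose form repeats the same three lines for every example, it can also be generated rather than typed by hand. The following is a sketch only: the function name `scrape_opt_out_stanzas` is hypothetical, and the example names have to be supplied by the caller (for instance from the file names under examples/).

```rust
/// Builds one `[[example]]` stanza per name, each opting out of example scraping.
/// Stanzas are separated by a blank line, matching the verbose Cargo.toml form.
fn scrape_opt_out_stanzas(example_names: &[&str]) -> String {
    example_names
        .iter()
        .map(|name| {
            format!(
                "[[example]]\nname = \"{}\"\ndoc-scrape-examples = false\n",
                name
            )
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Printing the result for the crate's example names produces text that can be pasted at the bottom of Cargo.toml.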


8 participants