Bundle local images in rustdoc output #3397
base: master
Changes from 2 commits
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,118 @@ | ||||||
Rustdoc: Bundle local images | ||||||
|
||||||
- Feature Name: NONE | ||||||
- Start Date: 2023-02-06 | ||||||
- RFC PR: [rust-lang/rfcs#0000](https://github.com/rust-lang/rfcs/pull/0000) | ||||||
- Rust Issue: [rust-lang/rust#32104](https://github.com/rust-lang/rust/issues/32104) | ||||||
|
||||||
# Summary | ||||||
[summary]: #summary | ||||||
|
||||||
This RFC proposes to allow the bundling of local images in rustdoc HTML output. A draft implementation is available as [#107640](https://github.com/rust-lang/rust/pull/107640). | ||||||
|
||||||
# Motivation | ||||||
[motivation]: #motivation | ||||||
|
||||||
Doc authors want to produce docs that are consistent across local `cargo doc` output, `docs.rs`, and self-hosted docs. They would also like to include images (like logos and diagrams), and scripts (like KaTeX for rendering math symbols). Both doc authors and doc readers would like for those resources to not be subject to link-rot, which means it should be possible to build docs for an old version of a crate and have the images and scripts reliably available. Doc readers would like for `cargo doc` output to be rendered correctly by their browsers even when they are offline. | ||||||
|
||||||
Right now, there are attributes that can set a logo and a favicon for documentation, but they must point to an absolute URL, which prevents bundling the logo and favicon in the source repository. Also, while `<script>` tags are allowed in rustdoc, they have a similar problem: if they load a script from some URL, that URL needs to be absolute or it won't work consistently across `cargo doc` and `docs.rs`. | ||||||
|
||||||
# Guide-level explanation | ||||||
[guide-level-explanation]: #guide-level-explanation | ||||||
|
||||||
This RFC proposes to allow rustdoc to include local images in the generated documentation by copying them into the output directory. | ||||||
|
||||||
This would be done by allowing users to specify the path of a local resource file in doc comments. The resource file would be stored in the `doc.files` folder, which will sit at the top level of the rustdoc output (at the same level as the `static.files` and `src` folders). | ||||||
|
||||||
The only local resources considered will be those using the markdown image syntax: `![resource title](path)`, where `path` is the path of the resource file relative to the source file. | ||||||
|
||||||
The path could be any relative or absolute file path. For example, to include an image generated by [`build.rs`](https://doc.rust-lang.org/cargo/reference/build-scripts.html), concatenate a path with the `OUT_DIR` environment variable: | ||||||
|
||||||
```rust | ||||||
/// Using a local image | ||||||
#[doc=concat!("![with absolute path](", env!("OUT_DIR"), "/local/image.png)")] | ||||||
/// | ||||||
/// Using a local image ![with relative path](../local/image.png) | ||||||
``` | ||||||
|
||||||
Since the local images are all put in the same folder, if the same image is imported from different crates, its content won't be duplicated, because the copies have the same name and the same hash. | ||||||
|
||||||
If the path doesn't refer to a file, a warning will be emitted and rustdoc will leave the path unchanged in the generated documentation. | ||||||
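As a rough illustration of the rule above, here is a minimal sketch of how a destination could be resolved against the source file and checked. The function name `resolve_local_image` and the `://` test for remote URLs are assumptions made for this example; they are not taken from the RFC or from the draft implementation.

```rust
use std::path::{Path, PathBuf};

/// Returns the on-disk path of a local image, or `None` when the
/// destination should be left unchanged in the generated docs.
/// Hypothetical helper, not rustdoc's actual API.
fn resolve_local_image(source_file: &Path, destination: &str) -> Option<PathBuf> {
    // Assumed heuristic: anything with a URL scheme is remote and left alone.
    if destination.contains("://") {
        return None;
    }
    // Relative paths are resolved against the directory of the source file.
    // `Path::join` returns `destination` itself when it is already absolute.
    let base = source_file.parent().unwrap_or(Path::new("."));
    let candidate = base.join(destination);
    if candidate.is_file() {
        Some(candidate)
    } else {
        // Stand-in for the rustdoc warning described above.
        eprintln!("warning: `{}` is not a file; leaving the path unchanged", destination);
        None
    }
}
```

Calling it with a source file of `src/lib.rs` and a destination of `../local/image.png` would look for `src/../local/image.png` on disk.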
Review comment: Will this result in relative links to rustdoc pages that aren't using intra-doc links warning? Warning for these is fine, but should ideally be able to say "this looks like an intra-doc link, use one instead."

Reply: Sorry I don't understand what you mean. Intra-doc links is a feature used on links whereas this feature is only applied on image links. For example:

/// ![image](../relative-link-to-image)
///
/// [intra_doc_link_type]
|
||||||
|
||||||
For published crates, `docs.rs` builds the contents of the `.crate` package in a sandbox with no internet access. Make sure any resources your docs need are [included](https://doc.rust-lang.org/cargo/reference/manifest.html#the-exclude-and-include-fields) in the package. | ||||||
|
||||||
Local resource files are not affected by the `--resource-suffix` option. | ||||||
|
||||||
The impact on `docs.rs` would also be minimal, as the size of a published crate's resources is limited to a few megabytes. The only thing needed would be to handle the new `doc.files` folder. | ||||||
Review comment: for images this might be true, but I'm a bit worried about JS files. while docs.rs currently allows running arbitrary JS from 3rd-party servers, the situation might become even worse if docs.rs itself would serve those potentially malicious files. or am I missing something?

Reply: If you put a |
||||||
|
||||||
There are two options that will be impacted by this RFC: the favicon and logo that you can respectively set through: | ||||||
|
||||||
```rust | ||||||
#![doc(html_favicon_url = "some_path", html_logo_url = "some_other_path")] | ||||||
``` | ||||||
|
||||||
They will follow the same rule as other images: if the value is a local path, the local file will be copied and the paths to it will be rewritten. | ||||||
|
||||||
To support `#[doc(inline)]` for foreign items using local resources, rustdoc will rely on the `-Zrustdoc-map` option. | ||||||
|
||||||
# Reference-level explanation | ||||||
[reference-level-explanation]: #reference-level-explanation | ||||||
|
||||||
A new rustdoc pass will be added that goes through all documentation and gathers local resources into a map. | ||||||
|
||||||
Then, during HTML documentation generation, the local resource paths will be replaced by their equivalents pointing into the output directory. | ||||||
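The replacement step can be sketched as a pass over the markdown that swaps each image destination found in the gathered map. This is only an illustration: `rewrite_image_paths` is a hypothetical name, and the hand-rolled scan ignores escapes, nested brackets, reference-style images, and link titles, which a real markdown parser would have to handle.

```rust
use std::collections::HashMap;

/// Rewrites the destination of every `![title](path)` image whose path is a
/// key in `bundled`; other images are copied through untouched.
/// Hypothetical sketch, not the pass from the draft implementation.
fn rewrite_image_paths(markdown: &str, bundled: &HashMap<String, String>) -> String {
    let mut out = String::with_capacity(markdown.len());
    let mut rest = markdown;
    while let Some(start) = rest.find("![") {
        let after_bang = &rest[start + 2..];
        // `](` ends the title, the next `)` ends the path.
        let mid = match after_bang.find("](") {
            Some(i) => i,
            None => break,
        };
        let after_mid = &after_bang[mid + 2..];
        let end = match after_mid.find(')') {
            Some(i) => i,
            None => break,
        };
        let title = &after_bang[..mid];
        let path = &after_mid[..end];
        // Fall back to the original path when it was not gathered as a local resource.
        let new_path = bundled.get(path).map(String::as_str).unwrap_or(path);
        out.push_str(&rest[..start]);
        out.push_str("![");
        out.push_str(title);
        out.push_str("](");
        out.push_str(new_path);
        out.push(')');
        rest = &after_mid[end + 1..];
    }
    // Copy whatever follows the last image (or the whole input if there was none).
    out.push_str(rest);
    out
}
```

With a map from `../local/image.png` to a made-up output name such as `doc.files/image-0123.png`, only the matching image is rewritten; images with absolute URLs are left as they are.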
|
||||||
|
||||||
Local resource files will be renamed as follows: `{original filename}-{hash}{extension}`. The `{hash}` will be computed from the content of the local resource file. | ||||||
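A minimal sketch of this naming scheme follows. The RFC does not say which hash function is used, so `DefaultHasher` from the standard library is only a stand-in, and the name `bundled_file_name` is hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
use std::path::Path;

/// Builds `{original filename}-{hash}{extension}` from the original path
/// and the bytes of the file. Hypothetical helper for illustration only.
fn bundled_file_name(original: &Path, content: &[u8]) -> String {
    let stem = original
        .file_stem()
        .and_then(|s| s.to_str())
        .unwrap_or("resource");
    // Keep the leading dot so the result matches `{extension}` in the scheme.
    let extension = original
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| format!(".{}", e))
        .unwrap_or_default();
    // Assumption: any content hash would do here. `DefaultHasher` is not
    // guaranteed to be stable across Rust releases, so a real implementation
    // would need a fixed algorithm.
    let mut hasher = DefaultHasher::new();
    hasher.write(content);
    format!("{}-{:016x}{}", stem, hasher.finish(), extension)
}
```

Because the name depends only on the file stem, the extension, and the content, the same image referenced from two different directories maps to the same output file, which is what allows the deduplication described in the guide-level explanation.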
|
||||||
You can look at what the implementation could look like in [#107640](https://github.com/rust-lang/rust/pull/107640). | ||||||
|
||||||
When an image is included in an item that gets inlined across a crate, rustdoc will treat it like a cross-crate intra-doc link, using `--extern-html-root-url` so that `docs.rs` can hotlink the image from the crate that holds a copy of the image. This has a few upsides and downsides compared to the approach where the image itself is copied into the crate with the inlined docs. | ||||||
|
||||||
* reduces the number of duplicated images that `docs.rs` has to store | ||||||
* doesn't require the source code for the source crate when inlining | ||||||
* only requires storing the hash of the file in the `.rmeta`, not the whole image | ||||||
* requires rustc to look at the doc comments and hash the image(s) | ||||||
Review comment: While doing this, can it also produce a lint warning if the image is bigger than 50KiB or so? Of course, any value chosen here will be arbitrary and contentious, but so does adding operational complexity to crates.io, docs.rs, and third parties that want to run rustdoc builds in sandboxes (how would hosting images separately fit into Bazel?)

Reply: Perhaps; or perhaps the lint should be something like "the total of all your non-code resources is >10% of the crate size." It's still not really satisfying though, since it doesn't fully solve the problem. If you want to include lots of resources, or large resources, the only solution is going to be to refer to them using absolute URLs. Out of curiosity I spot-checked a couple of crates, regex and rayon. They're 248 kB and 169 kB respectively, so 10% would be 24.8 kB and 16.9 kB. Also it's worth mentioning one counterpoint to my concern: source files that are bundled into .crate files are not minified and comments aren't removed. That's important specifically so docs can be generated from them (including the source file view within rustdoc). So in some sense published crates already do contain bytes that are only useful for documentation.

Reply: Examples and tests are also arguably superfluous files: they benefit crater, and those who read the source code of the downloaded crate, but not the typical downloader.

Reply: Doing it with percentages seems unnecessarily complicated for something that’s just an arbitrary metric pulled out of a hat anyway.
In other words, it’s not just about cargo build. Coping with large images is just a chore. What I’m trying to do is gesture in the direction of “try to use small images.” Not just solve the problem for plain cargo builds.

Reply: Random musings on cargo's behavior. To protect against zip bombs, cargo had a 512 MB limit when extracting crates and people ran into this (rust-lang/cargo#11151) and we updated it to take compression size into account (rust-lang/cargo#11337) so we are less likely to hit the 512 MB limit unintentionally. We also now report crate size on package/publish (rust-lang/cargo#11270). One future possibility mentioned in that PR is to warn when binary files are included (rust-lang/cargo#9058). In general, cargo has been hesitant about adding warnings that because we haven't had a way for people to disable them but with

Reply: We should be cognizant of users. By including files that aren't as necessary
Some things that would help (and are currently being discussed)
Dependent CI jobs will never use this content. I wonder how many users are like me that almost exclusively use docs.rs rather than

Reply: Just answering:
It'd be needed for docs.rs too unfortunately.

Reply: How so? I'm not seeing anything listed in the Motivation that benefits us docs.rs-exclusive users (speaking as someone who also has crates that include images).

Reply: Unless you host the images on a website, you can't actually use them in the documentation generated by docs.rs. There are some ways to go around this limitation, for example: generating the base64 equivalent of the image and including it directly into your crate doc comment. But it's not great. Hence this RFC. But the problem is as you mentioned that it would impact the community by downloading more content that wouldn't actually be used. A potential solution you mentioned ( |
||||||
* produces broken images when `cargo doc --no-deps` is run locally | ||||||
* requires the URLs generated for these images to be stable across rustdoc versions | ||||||
|
||||||
The alternatives each have problems: embedding images in rmeta files is probably a bad idea; requiring access to the source code for a dependent crate when building the dependency would work poorly with some third-party build systems; and not supporting images in cross-crate inlined docs would be inconsistent and weird. | ||||||
|
||||||
# Drawbacks | ||||||
[drawbacks]: #drawbacks | ||||||
|
||||||
Allowing local resources in rustdoc output could lead to big output files if users include big resource files. This could lead to slower build times and increase the size of generated documentation (in particular in case of very big local resources!). | ||||||
|
||||||
Another problem is that people will add images to their published crates, increasing the package size even though the images are only used for documentation. | ||||||
|
||||||
# Prior art | ||||||
[prior-art]: #prior-art | ||||||
|
||||||
- [sphinx](https://www.sphinx-doc.org/en/master/usage/configuration.html#confval-latex_additional_files) | ||||||
- [haddock](https://haskell-haddock.readthedocs.io/en/latest/invoking.html?highlight=image#cmdoption-theme): the documentation for this command mentions that local files in the given directory will be copied into the generated output directory. | ||||||
- [doxygen](https://doxygen.nl/manual/commands.html#cmdimage): supported through `\image`. | ||||||
- [embed-doc-image](https://docs.rs/embed-doc-image/latest/embed_doc_image/): a proc-macro based version which directly embeds the content into the generated documentation as a base64 string. | ||||||
|
||||||
Another approach to this feature: | ||||||
|
||||||
[ePUB packages](https://www.w3.org/publishing/epub3/epub-packages.html#sec-pkg-manifest) use explicitly-declared manifests. This allows a fallback chain mechanism (going through the resources for an entry until an available resource is found). | ||||||
|
||||||
# Rationale and alternatives | ||||||
[rationale-and-alternatives]: #rationale-and-alternatives | ||||||
|
||||||
Currently, to provide resources, users need to specify external URLs for them or inline them directly into the documentation (where possible, as with the `svg` image format). This has the advantage of avoiding the problem of large output files, but it also requires users to upload their resources to a web server to make them available everywhere. | ||||||
|
||||||
# Unresolved Questions | ||||||
[unresolved-questions]: #unresolved-questions | ||||||
|
||||||
- Should we put a size limit on the local resources? | ||||||
- Should we somehow keep the original local resource filename instead of just using a number? | ||||||
- Should we use this feature for the logo if it's a local file? | ||||||
|
||||||
# Possible extensions | ||||||
[possible-extensions]: #possible-extensions | ||||||
|
||||||
This feature could be extended to DOM content using local resources. It would require adding parsing for HTML tag attributes. For example: | ||||||
|
||||||
```html | ||||||
/// <video src="../some-video.mp4"> | ||||||
``` |
Review comment: how does this solve the `<script>` usecase?

Reply: It doesn't cover this case at all.

Reply: hmm... then I guess it might be better not to mention it in the "Motivation" section, or explicitly mention that this RFC does not cover this case :)

Reply: On this line it mentions `The only local resources considered will be the ones in the markdown image syntax`. But it's true that a bit above we have `They would also like to include images (like logos and diagrams), and scripts`, which is confusing. I'll remove this part. But otherwise, where is it not clear enough so I can make it more obvious?

Reply: maybe something like: "While using custom `<script>` tags is a valid usecase, this RFC only focusses on the usecase of dealing with images" at the end of the motivation section

Reply: 👍