Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Scrape code examples from examples/ directory for Rustdoc #3123

Merged
merged 3 commits into from
Sep 9, 2021

Conversation

willcrichton
Copy link
Contributor

@willcrichton willcrichton commented May 11, 2021

Commit

Update 0000-rustdoc-scrape-examples.md

Update 0000-rustdoc-scrape-examples.md

Update 0000-rustdoc-scrape-examples.md

Create 0000-rustdoc-scrape-examples.md

Update 0000-rustdoc-scrape-examples.md

Update 0000-rustdoc-scrape-examples.md

Update 0000-rustdoc-scrape-examples.md

Update and rename 0000-rustdoc-scrape-examples.md to 3123-rustdoc-scrape-examples.md
@jyn514
Copy link
Member

jyn514 commented May 11, 2021

@shepmaster I don't think those issues are closely related. Those are about making it easier to manually write examples and guides, this is about automatically generating docs for existing examples. Manual docs will always be used less than automatic docs just because they take more effort to write.

@shepmaster
Copy link
Member

What prevents MyType::new from linking to every example? How do you choose which examples are in line and which are in the see more link?

@willcrichton
Copy link
Contributor Author

willcrichton commented May 11, 2021

What prevents MyType::new from linking to every example?

Nothing -- under this proposal, MyType::new would be linked to every call to it. If every example calls MyType::new, then every example will get linked.

How do you choose which examples are in line and which are in the see more link?

Currently, it's arbitrary, whichever example happens to get found first. I'm open to suggestions for more systematic ways to decide how to sort the examples. We could try sorting the examples by the size of the source file?

@camelid camelid added the T-rustdoc Relevant to rustdoc team, which will review and decide on the RFC. label May 11, 2021
@GuillaumeGomez
Copy link
Member

In some crates, it could lead to a lot of links. I like idea though, not just a big fan of the current approach. If we wanted to use it on gtk-rs for example, some items would have a lot of examples, very likely too many.

Also, in your current implementation, you seem to link to the file using a github URL. I don't think this is the right approach considering that it would force us to support other repositories host websites (exposing us to URL scheme change if any), and that would simply not work if there is no website. I guess we could generate the examples in HTML like we do for source code?

@willcrichton
Copy link
Contributor Author

In some crates, it could lead to a lot of links. I like idea though, not just a big fan of the current approach. If we wanted to use it on gtk-rs for example, some items would have a lot of examples, very likely too many.

How about a configurable limit on the number of examples for any given function? Like --max-scraped-examples N as a flag to Rustdoc.

Also, in your current implementation, you seem to link to the file using a github URL. I don't think this is the right approach considering that it would force us to support other repositories host websites (exposing us to URL scheme change if any), and that would simply not work if there is no website. I guess we could generate the examples in HTML like we do for source code?

Yes, I discussed this in the RFC. Currently I match on the domain of the package.repository field of the manifest and special-case generate a link to the source viewer for that repo. However, if Rustdoc were changed to include examples in HTML, then I could just link to those.

@Nemo157
Copy link
Member

Nemo157 commented May 12, 2021

Other related issues:

For docs.rs specifically we already host all the source files from the package independently of rustdoc's source html output. So we could use something like --example-src-base-path=/crate/foobar/0.1.0/source/ to tell rustdoc how to generate links to the source files.

@GuillaumeGomez
Copy link
Member

How about a configurable limit on the number of examples for any given function? Like --max-scraped-examples N as a flag to Rustdoc.

That's an option on top of another option, definitely not a fan. I think the solution here would be in the UI.

Yes, I discussed this in the RFC. Currently I match on the domain of the package.repository field of the manifest and special-case generate a link to the source viewer for that repo. However, if Rustdoc were changed to include examples in HTML, then I could just link to those.

Well, nothing prevents us to generate it. Currently we don't because we have no use for them.

@shepmaster
Copy link
Member

I hope to leave a more useful comment at some point, but I wanted to at least jot these thoughts down before I lose them:

  • It would be nice to extend this to all code blocks in the docs.
  • It would be nice to click on the types / methods inside the code block and link back to the function.
  • I worry about the medium- and long-tail of customization. Some possible things people are going to want:
    • #[cfg] directives
    • the ability to hide / show pieces of code
    • the ability to add prose into the example (rustdoc /// comments inside the example; recursively. shudder)

@GuillaumeGomez
Copy link
Member

@shepmaster Seeing how difficult it is to get rust-lang/rust#84176 merged because some rustdoc team members aren't very favorable to the feature, I think it's going to be complicated (even though I agree).

@jyn514
Copy link
Member

jyn514 commented May 14, 2021

I hope to leave a more useful comment at some point, but I wanted to at least jot these thoughts down before I lose them:

* It would be nice to extend this to _all_ code blocks in the docs.

I'm not sure what you mean by this - do you want to extend this to doctests, so that rustdoc links to other items in the same crate? That seems pretty hard ... Rustdoc doesn't analyze doctests, just pass them straight through to rustc, so it would need fundamental changes.

* It would be nice to click on the types / methods _inside_ the code block and link back to the function.

This seems like it could be part of rust-lang/rust#84176 (or added later during implementation). I don't think it needs to be part of the RFC directly, this is still useful without.

* I worry about the medium- and long-tail of customization. Some possible things people are going to want:
  
  * `#[cfg]` directives

I don't know what you mean by this.

  * the ability to hide / show pieces of code

👍 Maybe this can be a per-item attribute? It's kind of tricky with re-exports though, see rust-lang/rust#84597 (comment).

  * the ability to add prose into the example (rustdoc `///` comments inside the example; recursively. _shudder_)

This seems like a misfeature, I don't know why we would support it.

@willcrichton
Copy link
Contributor Author

I've filed draft PRs if it's helpful for continuing discussion on this RFC.

Rustdoc: rust-lang/rust#85833
Cargo: rust-lang/cargo#9525

@willcrichton
Copy link
Contributor Author

@GuillaumeGomez I managed to compile the GTK docs with this extension. The only change I had to make is changing every instance of [[bin]] to [[example]] in examples/Cargo.toml so the examples got picked up by the --examples flag.

https://willcrichton.net/example-analyzer/gtk/

I think it's really nice! Since dropping into any arbitrary widget, you can quickly find use cases in context. For example:

@Kimundi
Copy link
Member

Kimundi commented Jun 2, 2021

This sounds like a good idea! In general I've found the ambiguity between providing doc-code examples and standalone examples always a bit confusing and7or redundant.

In that sense, it would be nice if the standalone examples would get their own navigable pages in rustdoc (maybe phrase them as "crate example"s?)

In general I agree with @shepmaster that there seem to be two orthogonal aspects here:

  • Scrape different sources of Rust code for occurrences of an API, and link them to the corresponding API entries
  • Include other Rust sources for documentation and/or scraping than the currently compiled crate

In the maximum-flexible case, you could include the current crates source, all its binaries, all its examples, all its integration tests, and all its code doc-comments into the list of code to scrape, expose them all with their own doc pages, and have some sensible defaults for which API links to display per default.

Also, rustdoc already has the feature of exposing source code pages for the crates source itself - maybe this could get unified by just giving additional source of Rust code - like examples - their own parallel source code pages.

Update to latest architecture
@willcrichton
Copy link
Contributor Author

I have updated the PRs, RFC, and demo links to reflect a new architecture for this extension based on the discussion above. Specifically, rather than linking to the crate's external repository, this extension instead generates source files for scraped examples and links to them locally. For example, try clicking on examples/list_box/main.rs on this page:

https://willcrichton.net/example-analyzer/gtk/struct.Entry.html

@sassman
Copy link

sassman commented Jul 8, 2021

I'd like to comment on this:

  • Include other Rust sources for documentation and/or scraping than the currently compiled crate

In the maximum-flexible case, you could include the current crates source, all its binaries, all its examples, all its integration tests, and all its code doc-comments into the list of code to scrape, expose them all with their own doc pages, and have some sensible defaults for which API links to display per default.

Also, rustdoc already has the feature of exposing source code pages for the crates source itself - maybe this could get unified by just giving additional source of Rust code - like examples - their own parallel source code pages.

I think a more powerful example / code inclusion feature would (at least for me) be already enough. The proposal in this RFC is still good an valid, but IMO a simpler version might does already a huge improvement.

Something like this:

/// my super cool crate 
/// ## usage example
/// ```
#[doc = include_code!("../examples/hello_world.rs")]
/// ```

So the in essence a simple possibility to include external code, from the examples folder here in this case, that will be rendered fully and also tested as part of the doctests.

This would create a link between example code and rust documentation and reduce code duplication when it comes to show usage of something.

@willcrichton
Copy link
Contributor Author

@sassman that's a good idea as well! I like the ability to manually specify relevant examples.

I still however would strongly support automatic linking. In my experience with large Rust codebases, very few developers take the time to find examples and link to them, nevertheless update these links when the examples change. Doing this automatically would significantly improve the discoverability of examples.

@sassman
Copy link

sassman commented Jul 8, 2021

@willcrichton I'm absolutely with you. An automatic linking / scraping of used examples, specifically on a functions level is very handy.

So what I'm proposing is more of a complementary side of things. It might be easier to start with, since it gives flexibility and freedom of usage to developers, hence it can improve the actual lack of links between example code and documentation.

And as way more sophisticated but super useful approach automatic scraping on a very low level of "code usage" like you proposing.

@jyn514
Copy link
Member

jyn514 commented Aug 9, 2021

I think a more powerful example / code inclusion feature would (at least for me) be already enough.

This has since been stabilized as #[doc = include_str!(...)].

I think this looks really good as is :)

@rfcbot fcp merge

@rfcbot
Copy link
Collaborator

rfcbot commented Aug 9, 2021

Team member @jyn514 has proposed to merge this. The next step is review by the rest of the tagged team members:

Concerns:

Once a majority of reviewers approve (and at most 2 approvals are outstanding), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for info about what commands tagged team members can give me.

@rfcbot rfcbot added proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. labels Aug 9, 2021
@GuillaumeGomez
Copy link
Member

As long as it's not the default behaviour, I'm completely fine with it and really love the idea!

@rfcbot
Copy link
Collaborator

rfcbot commented Aug 30, 2021

🔔 This is now entering its final comment period, as per the review above. 🔔

@rfcbot rfcbot added final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. and removed proposed-final-comment-period Currently awaiting signoff of all team members in order to enter the final comment period. labels Aug 30, 2021
<kbd>
<img width="600" alt="Screen Shot 2021-05-09 at 7 49 35 PM" src="https://user-images.githubusercontent.com/663326/117592915-fe07b880-b0ff-11eb-97e9-43197fbcb2a7.png">
</kbd>
<br /><br />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eek, the markup! My eyes!

  • Please put real descriptive alt text, even if brief. (I will admit that this is one of the rare occasions that “Screen Shot ‹time›” actually isn’t completely bad alt text, but it’s still nasty.)

  • Please don’t abuse <kbd> like this to get a border, it makes me sad. Plus it doesn’t work when published in https://rust-lang.github.io/rfcs/ (but please don’t patch that to similarly style it, or I might cry!)

  • The images are part of the RFC, so I’d say they should be included in the repository, not externally-hosted; but apparently there’s no solid way of doing that at present, so I’ve just filed Images in RFCs: the one in-repository image is broken and needs fixing, and external images should be moved into the repository #3172.

  • Also I don’t like the <br /><br />; the manual line breaks don’t feel called for. (Incidentally also also, the trailing slash is an XHTML anachronism, misleading in HTML because it doesn’t close elements, but is just completely ignored by the parser. You’re better to write <br>.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've simplified the image markup and added alt text. I left the image links as-is until your issue is resolved.

@davidbarsky
Copy link

I recognize that this is going to be merged, and it's entirely possible that it's too late for me to comment on this, but: is it possible to provide an argument to rustdoc as to which example directory should be scraped? For instance, on tracing, we have a examples crate within the larger Cargo workspace. Would it be possible, for instance, for the core tracing crate to pull in examples from a peer crate in a workspace, or is the example scraping scoped to a single crate?

@willcrichton
Copy link
Contributor Author

@davidbarsky Yes, this is possible.

Examples can be scraped from any crate in the current workspace. Essentially, when you pass --scrape-examples, this recursively calls cargo check {some filter} where {some filter} could be --examples, --lib, and so on. I could allow the filter to be an argument, so you run e.g. cargo doc --scrape-examples "--crate my_examples".

One question is where this option would be stored so docs.rs knows about it. Would it make sense to add a rustdoc-scrape-examples key or something?

@Nemo157
Copy link
Member

Nemo157 commented Sep 5, 2021

That will not work for docs.rs, since only the root crate and its dependencies will be available.

@davidbarsky
Copy link

Examples can be scraped from any crate in the current workspace. Essentially, when you pass --scrape-examples, this recursively calls cargo check {some filter} where {some filter} could be --examples, --lib, and so on. I could allow the filter to be an argument, so you run e.g. cargo doc --scrape-examples "--crate my_examples".

I think the syntax can be bikeshed, but a command syntax along those lines seem reasonable. I'll note that I think I'd prefer a glob-like syntax, but again—that can be sorted out during implementation.

One question is where this option would be stored so docs.rs knows about it. Would it make sense to add a rustdoc-scrape-examples key or something?

I would not mind a key like this, but as Nemo said, docs.rs only get the root crate and its dependencies. I don't know if its possible or in-scope to allow docs.rs to get the entire Cargo workspace for documentation generation, but if it were, a key (or dictionary) of example scraping would be pretty helpful.

@willcrichton
Copy link
Contributor Author

That will not work for docs.rs, since only the root crate and its dependencies will be available.

Oh, I didn't know that! Hmm, that actually seems like a big issue. If this feature isn't available on docs.rs, then the vast majority of Rustaceans will never get to use it.

@Nemo157 can you elaborate a little bit on how docs.rs works? I'm surprised the workspace isn't available, since I'm assuming the docs.rs infra would just git clone the repo for a given crate. Really, I'm trying to determine if there's a way for my RFC to work with docs.rs today, and if not, how much work it would take to integrate them.

@Nemo157
Copy link
Member

Nemo157 commented Sep 6, 2021

docs.rs builds from the crate packages from crates.io, not git repos. There are 3 major relevant steps:

  1. The crate to be documented is downloaded from crates.io and extracted to a temporary directory
  2. The dependencies are fetched via cargo fetch
  3. The build runs inside a sandbox with no network access and limited disk access

There's currently a side-effect that [dev-dependencies] are also fetched, so it would be possible to scan the whole $CARGO_HOME/registry/src/* for examples that are packaged in (published) dev-dependencies, but that seems like a bug from rustwide being originally written for Crater which needs them.

Right now I can't think of any way to support out-of-package examples on docs.rs that doesn't have other major problems.

@willcrichton
Copy link
Contributor Author

Thanks for the clarifications. I suppose for now we'll just support in-package examples and figure out this bit later. I don't know what fraction of crates use examples/ vs. a separate examples crate. But I've seen enough crates use examples/ that a docs.rs deployment should still have some effect.

@rfcbot rfcbot added finished-final-comment-period The final comment period is finished for this RFC. and removed final-comment-period Will be merged/postponed/closed in ~10 calendar days unless new substational objections are raised. labels Sep 9, 2021
@rfcbot
Copy link
Collaborator

rfcbot commented Sep 9, 2021

The final comment period, with a disposition to merge, as per the review above, is now complete.

As the automated representative of the governance process, I would like to thank the author for their work and everyone else who contributed.

The RFC will be merged soon.

camelid added a commit that referenced this pull request Sep 9, 2021
Scrape code examples from `examples/` directory for Rustdoc
@camelid camelid merged commit a1eeaa6 into rust-lang:master Sep 9, 2021
@camelid
Copy link
Member

camelid commented Sep 9, 2021

Huzzah! The @rust-lang/rustdoc team has decided to accept this RFC.

To track further discussion, subscribe to the tracking issue here:
rust-lang/rust#88791

matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Oct 23, 2021
…n514

Scrape code examples from examples/ directory for Rustdoc

Adds support for the functionality described in rust-lang/rfcs#3123

Matching changes to Cargo are here: rust-lang/cargo#9525

Live demo here: https://willcrichton.net/example-analyzer/warp/trait.Filter.html#method.and
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Oct 23, 2021
…n514

Scrape code examples from examples/ directory for Rustdoc

Adds support for the functionality described in rust-lang/rfcs#3123

Matching changes to Cargo are here: rust-lang/cargo#9525

Live demo here: https://willcrichton.net/example-analyzer/warp/trait.Filter.html#method.and
bors added a commit to rust-lang/cargo that referenced this pull request Oct 28, 2021
Scrape code examples from examples/ directory for Rustdoc

Adds support for the functionality described in rust-lang/rfcs#3123

Matching changes to rustdoc are here: rust-lang/rust#85833
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
disposition-merge This RFC is in PFCP or FCP with a disposition to merge it. finished-final-comment-period The final comment period is finished for this RFC. T-rustdoc Relevant to rustdoc team, which will review and decide on the RFC. to-announce
Projects
None yet
Development

Successfully merging this pull request may close these issues.