Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC for rustdoc jump to definition #1

Open
wants to merge 14 commits into
base: master
Choose a base branch
from

Conversation

GuillaumeGomez
Copy link
Owner

@GuillaumeGomez GuillaumeGomez commented Dec 9, 2021

text/000-rustdoc-jump-to-definition.md Outdated Show resolved Hide resolved
text/000-rustdoc-jump-to-definition.md Outdated Show resolved Hide resolved
text/000-rustdoc-jump-to-definition.md Show resolved Hide resolved
text/000-rustdoc-jump-to-definition.md Show resolved Hide resolved
@oli-obk
Copy link

oli-obk commented May 12, 2022

Funny thing I noticed: I don't actually need jump to definition. I need go back to docs from the source view xD

@GuillaumeGomez
Copy link
Owner Author

GuillaumeGomez commented May 12, 2022

There is a PR for that as well. And normally it's included in the RFC. When you click on an item definition name (so the Foo in the struct Foo {... }), it'll bring you to the item docs (if it has docs).

EDIT: the PR is here: rust-lang/rust#89157

@GuillaumeGomez GuillaumeGomez force-pushed the rustdoc-jump-to-definition branch 5 times, most recently from 249ade9 to fc325a1 Compare May 16, 2022 13:24
@Kobzol
Copy link

Kobzol commented May 18, 2022

I have to say that this feature is pretty awesome! Thank you for creating it.

After trying it a bit, I have two comments:

  1. I wasn't able to predict well when would the link take me to source code and when it would lead back to documentation. Struct definitions led to docs, struct usages led to source code, but then u8 usage led to docs again (I know, u8 doesn't really have source code, but it still surprised me that jump from "usage" led to docs). Maybe the underline/differently colored background/whatever is used to highlight the link could be different between links to docs and to source code? Like "red link = goes to docs, blue link = goes to source" (obviously not these colors :D ). The users wouldn't know at first what type of highlight is meant for what, but they would probably quickly learn it, if it's consistent.

  2. I was surprised that it didn't work for methods (although it's stated in the RFC). Of course this could be added later, but from my POV (while being used to definition jumps for basically everything from the IntelliJ Rust plugin), it was weird that I can click on structs, but not on methods. Resolving actual method calls on e.g. trait objects might be complicated, but the backlink from method definition to docs should be straightforward (I hope).

@GuillaumeGomez
Copy link
Owner Author

It wasn't implemented yet because we didn't agree. As simple as that. :)

I think it'd be nice to be able to jump to the docs of associated items (such as methods) too though.

As for having a different visual representation between link to doc and link to source code, I think it's a really good idea! I'll update the RFC.

@GuillaumeGomez
Copy link
Owner Author

@Kobzol I integrated your suggestion about different visual "markers" for both kinds of links. Thanks! :)

@BurntSushi
Copy link

I personally think this is an amazing feature and it would greatly enhance the source code view of rustdoc. This would be especially nice for std, which tends to have a lot of indirection.

At this point, you encounter an item used in this page but not defined in this page. Do you want to clone the crate locally to check it out or go to github/gitlab to check it there or do you prefer having the links directly available?

This is indeed what I do today. If I want to see the source of a public item, I'll click the src link. If there's indirection, then I'll do a quick CTRL-F to see if I can easily find it. If that fails, I clone the repo, open it in my text editor and use Rust Analyzer's "jump to definition" to navigate instead. So yeah, if I can skip all those steps and just click on links in the source code view, that's a really huge win IMO.

I do this quite a bit so I'm pretty excited about this feature! I expect many others will be too.

@BurntSushi
Copy link

Oh, here's a question: Does this have any perf impact on generating docs?

@GuillaumeGomez
Copy link
Owner Author

Oh, here's a question: Does this have any perf impact on generating docs?

Apart from the generated content size growing a bit, it didn't have much impact. I had put the numbers in rust-lang/rust#84176 if you're interested.

Implementation details: basically, I added a new visitor (like how clippy handle lints for example) which gather items that need to be turned into link into a hashmap and then when we render the source code, we check when we render an "item" if it is in the hashmap or not.

@ojeda
Copy link

ojeda commented Jun 8, 2022

You may want to mention in the table the C/C++ cross referencers like LXR, MXR, DXR and Elixir that originated from the need to browse the Linux kernel, e.g. live example at https://elixir.bootlin.com.

So thanks a lot for your work on this! This gets the source view closer to something like Elixir (I will add a comment in the PR about a potential [refs] link or similar for references).

@GuillaumeGomez
Copy link
Owner Author

Wow. I didn't know such a tool existed. It's quite convenient! Thanks for the information! I'll update the RFC as soon as possible.

@GuillaumeGomez
Copy link
Owner Author

I added Elixir in the tool list. Thanks again!

@ojeda
Copy link

ojeda commented Jun 8, 2022

My pleasure!

To give some context about Elixir and similar cross referencers: they parse all the code (i.e. conditional compilation does not hide anything, which is very useful for something like the kernel which has thousands of configuration options), they also have some docs.rs features in that it allows to go through versions of the kernel, may handle several projects, etc.

Copy link

@jsha jsha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great feature, and I'm excited to see it progress. Thanks for working on it. As you mention, the main concerns are about maintainability, given the small rustdoc team and the large existing complexity and feature base of rustdoc.

One of the tricky things about rustdoc is its tight coupling to the compiler. This feature seems like it doesn't need to be so tightly coupled. I think we could benefit greatly from doing this outside of rustdoc because it would allow us to iterate faster and work with a more self-contained piece of code.

GitHub's code navigation says:

GitHub has developed two code navigation approaches based on the open source tree-sitter and stack-graphs library

The tree-sitter library has a parser for Rust already: https://github.com/tree-sitter/tree-sitter-rust. Can we make use of that? It would save us on maintenance burden, and any improvements we make to it would presumably make GitHub's code navigation for Rust better (once they enabled it).

Also I think the UI section could use screenshots and descriptions of how the UI for other web sites implements this - in particular GitHub and Google Code Search. Notably, both of those sites offer "Find References", which makes me believe we will almost certainly want to support that ourselves eventually. I think this RFC should have a roadmap for implementing "Find References" even if it's not in the initial implementation.

# Reference-level explanation
[reference-level-explanation]: #reference-level-explanation

It doesn't require javascript since it's only generating links, making it work on both desktop and mobile, with or without javascript enabled.
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
It doesn't require javascript since it's only generating links, making it work on both desktop and mobile, with or without javascript enabled.
It doesn't require javascript since it's only generating links, making it work with or without javascript enabled.

The use or non-use of JavaScript doesn't affect the desktop/mobile distinction, since both desktop and mobile platforms support JavaScript.


Once at their definition, if you click on the item name, you'll arrive on its documentation page (if it has one, otherwise nothing will happen).

An interesting thing to note: primitive types always link to documentation. Also, to make it more obvious if a link points to source code or to documentation, they should have different visual markers (different background/underline colors for example).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The discussion of different visual markers is a bit out of place. In general, considerations of UI are scattered throughout the doc. I think it would be better for most of the doc to focus on the actions people should be able to perform (go to definition; go to documentation), and have a separate section discussion UI possibilities to achieve that.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should I create a new section or should I simply remove it then?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should create a new section to discuss UI.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if this has been mentioned already. I would advice against differentiating link types by color since that would harm accessibility for colorblind people (in case of red & blue: monochromacy / achromatopsia, blue cone monochromacy). Just something to keep in mind.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We decided to not add new color for links (or background). Instead they will have an underline when hovered.

For the UI, we have the following constraints:

* Must work on mobile and desktop: we cannot rely on mouse hover events.
* It's better if it works without javascript (it's important for accessibility, but also for maintenance reasons too: less JS code, less maintenance).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

less JS code, less maintenance

I disagree with this statement. It's entirely possible that it turns out generating links on the server side is more maintenance because it is tightly coupled to compiler internals, which frequently change.

The flip side is that the Rust community has more people who know Rust than those who know JavaScript, so putting more of the implementation in Rust means there are more people available to do maintenance.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree with your disagreement. 😄

Maybe I have a wrong idea but let me share what I had in mind while I wrote this statement: to have this information in the front-end, the back-end has to generate it in any case (either as a link or any other format). So deciding to use JS to handle this would mean that we still have code in the back-end but we would also have code in the front-end to treat it.

If that makes sense?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, that makes sense - can you put it in the doc? :-D

# Drawbacks
[drawbacks]: #drawbacks

**The biggest concern is actually about "scope expansion"**. If we add this feature, why not adding others like a "commit source code viewer", or a "git blame source code viewer", or "go to this line in github/gitlab"?
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should also mention the most likely second feature: "find references." That is something I'd really like, and I think many other people would like too. We even got a request related to that in Zulip yesterday.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't consider it as a drawback but more like a potential extension.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree - I don't consider it a drawback. But given that (a) some of the team are worried about scope expansion, and (b) it is a scope expansion, you should address it somewhere in the doc, for instance to say "in the future, we are likely to implement 'find references' as well. Here's why that will fit well with the jump to definition feature we are implementing now"

text/000-rustdoc-jump-to-definition.md Outdated Show resolved Hide resolved
text/000-rustdoc-jump-to-definition.md Outdated Show resolved Hide resolved
text/000-rustdoc-jump-to-definition.md Show resolved Hide resolved
text/000-rustdoc-jump-to-definition.md Outdated Show resolved Hide resolved
@jsha
Copy link

jsha commented Jun 12, 2022

I'd like to see in this RFC a discussion of storage size and client-side performance. For std, the nightly compiler docs, and some popular crates:

  • What is the total increase in storage size of generated docs (absolute bytes and percentages)?
  • What is the increase in storage size for the largest page?
  • What is the increase in DOM size (number of DOM elements) for the largest page?
  • What is the increase in page load time (measured by WebPageTest.org) for the largest page, on a slow connection?

Some crates have very large source files, and I worry that this feature may make their source pages so big and complex that browsers can't render them at all.

@GuillaumeGomez
Copy link
Owner Author

This is a great feature, and I'm excited to see it progress. Thanks for working on it. As you mention, the main concerns are about maintainability, given the small rustdoc team and the large existing complexity and feature base of rustdoc.

One of the tricky things about rustdoc is its tight coupling to the compiler. This feature seems like it doesn't need to be so tightly coupled. I think we could benefit greatly from doing this outside of rustdoc because it would allow us to iterate faster and work with a more self-contained piece of code.

Almost all of the required code is already there (minus the open PRs) and is basically a visitor and a hashmap. So even though the complexity isn't 0, it's still quite low.

The tree-sitter library has a parser for Rust already: https://github.com/tree-sitter/tree-sitter-rust. Can we make use of that? It would save us on maintenance burden, and any improvements we make to it would presumably make GitHub's code navigation for Rust better (once they enabled it).

tree-sitter is in C, meaning we would need to put back a C toolchain inside the Rust compiler for rustdoc. Something we took a good amount of time to remove. Also, we currently use a Visitor, meaning that if the Rust syntax is updated, we basically have nothing to do. We would exchange our current code (which is already self-contained since it's all in its own file span_map.rs and is 155 lines long, comments and imports included) to instead have an external crate doing the job, meaning we would need to have code to use its API. And I definitely don't think this code would be shorter.

I tried multiple approaches and they all had much bigger issues than the current approach which I still find as the simplest one.

Also I think the UI section could use screenshots and descriptions of how the UI for other web sites implements this - in particular GitHub and Google Code Search. Notably, both of those sites offer "Find References", which makes me believe we will almost certainly want to support that ourselves eventually. I think this RFC should have a roadmap for implementing "Find References" even if it's not in the initial implementation.

I'm a bit afraid that putting screenshots would increased too much the size of the RFC (don't know if it's a problem though). There are links in the comparison tables so I don't know. If anyone has an opinion either or not I should put screenshots?

Since we collect items, we already have the information for "Find References". However, I think it's out of scope. But like I said, we already have all the information we would need to implement this feature. At this point it would mostly be a UI issue (as usual).

I'd like to see in this RFC a discussion of storage size and client-side performance. For std, the nightly compiler docs, and some popular crates: ...

This is a good idea. I'll add some metrics.

@GuillaumeGomez GuillaumeGomez force-pushed the rustdoc-jump-to-definition branch 2 times, most recently from dcd7ddb to 16e1262 Compare June 14, 2022 11:49
@ojeda
Copy link

ojeda commented Jun 14, 2022

Since we collect items, we already have the information for "Find References". However, I think it's out of scope. But like I said, we already have all the information we would need to implement this feature. At this point it would mostly be a UI issue (as usual).

I appreciate that a project needs to have a defined scope. On the other hand, the docs generated by rustdoc are very useful and well done, thus as a user I wouldn't mind seeing rustdoc evolve into a more general tool (cross-referencer) if there is enough people to work on it.

In particular, "Find References" may be quite useful for complex projects, so if there are no major changes needed to support it, I think it would be worth it.

@GuillaumeGomez
Copy link
Owner Author

I already added it into "future extensions". But like I said, I don't want to include it in this RFC.

@GuillaumeGomez
Copy link
Owner Author

Just pushed it. This is really awesome, thanks a lot! I kept the last part saying that we need to add information about the linking too so it's not something we overlook.

@Manishearth
Copy link

This seems pretty solid, and it's exactly what I was looking for, thanks @jsha! In particular, this clearly motivates the feature as one for doc readers (rather than crate maintainers), and hopefully this can be used to inform the design.

@GuillaumeGomez
Copy link
Owner Author

I included the changes for paths suggested on this zulip thread and fixed an outdated link.

* Mention SourceGraph in the "Rationale and alternatives" section
@GuillaumeGomez
Copy link
Owner Author

I updated the UI part description and added SourceGraph in the "Rationale and alternatives" section. The discussion about SourceGraph can be found here.

GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Sep 16, 2022
…mp-to-definition, r=GuillaumeGomez

rustdoc: use more precise URLs for jump-to-definition links

As an example, this cuts down <https://doc.rust-lang.org/nightly/nightly-rustc/src/rustc_middle/ty/mod.rs.html> by about 11%.

    $ du -h new_mod.rs.html old_mod.rs.html
    296K	new_mod.rs.html
    332K	old_mod.rs.html

Like rust-lang#83237, but separate code since source links have a different URL structure.

Related to [Zulip discussion](https://rust-lang.zulipchat.com/#narrow/stream/266220-rustdoc/topic/RFC.20for.20.22jump.20to.20definition.22.20feature/near/299029786) and [the jump-to-definition pre-RFC](GuillaumeGomez/rfcs#1).

#### Local variables

Local variables should get a link to their definition (not their type):

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we ever answered the question about how to support this when the function might not pass type checking.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The link you gave is for supporting links for other targets more globally. And for that, the answer I currently have is: it cannot be supported with the current approach. I think it's fine considering that you will go to the source corresponding to the current documentation and look this one in particular. I don't think (personal opinion here) we should try to go any further than that as it would then outreach its original goal (but maybe I'm wrong on that?).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we're talking past each other here. The problem isn't that links don't get generated in some corner cases; that's fine, as long as no incorrect links are generated. The problem is breaking changes.

Consider the below example:

#[cfg(windows)]
use std::os::windows::io::AsRawHandle;

#[cfg(any(doc, windows))]
pub fn my_fn() -> std::io::Result<std::fs::File> {
	let f = std::fs::File::create("foo.txt")?;
    let _handle = f.as_raw_handle();

    // use handle with native windows bindings

    Ok(f)
}

and the output of the following commands:

$ rustdoc +dev foo.rs 
$ rustdoc +dev -Z unstable-options --generate-link-to-definition foo.rs 
error[E0599]: no method named `as_raw_handle` found for struct `std::fs::File` in the current scope
 --> foo.rs:7:21
  |
7 |     let _handle = f.as_raw_handle();
  |                     ^^^^^^^^^^^^^ method not found in `std::fs::File`

error: aborting due to previous error

For more information about this error, try `rustc --explain E0599`.

Copy link
Owner Author

@GuillaumeGomez GuillaumeGomez Oct 20, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. I originally fixed this by hiding the compiler errors because they don't impact the final output although errors are emitted. It was reverted afterward so they're displayed again. I'm not too sure what to do about this.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should all be documented in the rfc, we should not be relying on the experimental implementation for "how is X case handled"

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Absolutely. I just wanted to know first if we wanted to propose a solution in the RFC or not. I plan to add this and the "up limit" to the RFC.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When you're not sure what you should do about a problem, you should write that in the RFC. In particular, you should write:

  • what the problem is
  • what its impacts are (for instance, does this cause all crates to fail to build with this flag? only some crates?)
  • potentially some ideas about how to fix it, with the tradeoffs for those ideas

But if you don't have ideas for how to fix it, it's still important to write down the problem so that other people can help you figure out solutions (and so they don't have to discover the problem themselves from scratch).

@jsha
Copy link

jsha commented Oct 21, 2022

One thing that is still unclear from this RFC: what is the planned trajectory for this feature? Right now it's an unstable flag. Is the plan to make it a stable flag? Will the new behavior be on by default or off by default? Can this feature be made stable enough that it doesn't need a flag, and is on by default with no way to turn it off?

If the plan is to keep it as a stable flag indefinitely: will that flag be turned on for std builds deployed to doc.rust-lang.org? Will it be turned on for docs.rs builds?

@jsha
Copy link

jsha commented Oct 21, 2022

Work that still needs to be done on this proto-RFC:

  • A discussion of the problems posed by functions not passing type checking.
  • "There are nevertheless some challenges to selecting the version appropriately within the crates.io ecosystem. See the Reference-level explanation for more detail (TKTK - add detail in Reference-level explanation)." - we need to fill this in.
  • Fix up verb tenses. Particularly for the Reference section, there is a lot "should" ("No primitive type should get a link") mixed with present tense ("On Foo definition (pub struct Foo), we generate a link to its documentation page"). It would be easier to read if only a single form is used. I think it is most typical to use the future tense here: "No primitive type will get a link." "On Foo definition we will generate a link to its documentation page."
  • Describe the planned trajectory for the feature.

As a reminder, the TKTK was about the part of the conversation I started here: #1 (comment). That is, for inter-crate links, should we link to the exact version of dependencies that happened to be selected at doc build time, or should we try to link to a semver version so the user is more likely to see what they would get if they compiled today? Like the other open question, you don't necessarily have to come to a conclusion yet, but you should describe the problem in enough detail for someone else to understand it and give feedback. And also present the possible solutions.

@GuillaumeGomez
Copy link
Owner Author

About the cross-crate links solution you suggested with ?source=true from:

Therefore, inter-crate links should link to the documentation for an item, but with an added URL parameter ?source=true. So for instance, the link on line 4 of https://rustdoc.crud.net/imperio/jump-to-def-askama/src/askama_shared/filters/json.rs.html#4 wouldn't point to https://docs.rs/serde_json/1.0.79/src/serde_json/ser.rs.html#2174-2177 as it does today. Instead it would point to https://docs.rs/serde_json/1.0.*/serde_json/fn.to_writer_pretty.html?source=true. We would also add JS on page load that checks for ?source=true. If the URL parameter is present, that JS would immediately click on the source link.

It wouldn't work for private items as they don't have a public documentation page. I don't think there is a solution to this problem that would cover all cases and maybe we should simply not try to fix it and simply handle it as we do currently: if the target has been removed, it'll lead to a 404 page.

…pass type checking?" and "Adding an heuristic to automatically disable the feature on a file".

 * Describe the planned trajectory for the feature.
 * Explain more in details how cross-crate linking works.
 * Fix up verb tenses in the Reference section.
@GuillaumeGomez
Copy link
Owner Author

I went through all the points that @jsha wrote up. Writing a small summary of my changes alongside the description in the commit:

  • I described the planned trajectory for this feature, ie removing the current --generate-link-to-definition option, enable this feature by default and adding a new option which would allow to disable this feature (named --disable-jump-to-definition).
  • After thinking for a while about the suggestion made by @jsha on how to handle invalid cross-crate links and their suggestions, I encountered some issues that would appear and described them in a new Cross-crate links section and how the cross-crate links would be handled by this feature (this is very much open to debate as I might have misunderstood what @jsha wrote!).
  • I added new unresolved questions "Adding an heuristic to automatically disable the feature on a file" and "How to handle functions that do not pass type checking?" with explanations and how to encounter them in both cases. Hopefully I added enough explanations of the problem.
  • I fixed up the verb tenses in the Reference section and fixed some typos in a few other places.

@@ -324,6 +334,8 @@ External library (like `cpan` or `tree-sitter`): They would work to generate lin

[SourceGraph](https://sourcegraph.com/github.com/rust-lang/rustc-demangle@2811a1ad6f7c8bead2ef3671e4fdc10de1553e96/-/blob/src/v0.rs?L65): SourceGraph uses [Language Server Index Format (LSIF)](https://github.com/Microsoft/language-server-protocol/blob/main/indexFormat/specification.md) to extract info from a language server and make it available in a static format. And rust-analyzer bindings for LSIF already exist. The problem is that it would require a live server to work, which is a blocker to be integrated into rustdoc as it must work serverless.

Another argument in favour of not relying on an external tool is to have support for cross-crate links as it is a very important part of any rust project. For example, if you're browsing a crate "A" and you encounter an item coming from crate "B" (because crates often delegate significant portions of their implementations to sub-crates, for compile performance and other reasons), it seems obvious that it should be able to link to it. This means any tool that does a good job of navigating Rust source code needs to be aware of that code's dependencies, and how to link directly to source code pages within those dependencies. For instance, `askama_shared/filters/json.rs.html` needs to link to `serde::Serialize`. Rustdoc can do that because it builds a copy of all dependencies when building documentation, and it can use reliable mapping from (crate, version) to a URL pointing to the source code. GitHub or Google Code Search can't really do this mppaing while staying within their own code navigation system, because there is not a reliable mapping from (crate, version) to (repository URL, tag).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is a particularly solid justification. It is important for rustdoc to know where dep docs are, and it already does! That does not explain why it needs to have it in its source view.

I might be misunderstanding this paragraph but it doesn't really seem to add new justification.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually took it from this comment. I thought @jsha wanted me to add it to underline even further why it made more sense to have it in rustdoc directly. But maybe I misunderstood what they meant? Or maybe my transcription of what they said is a complete failure.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point of this argument is: because rustdoc already knows where the dep docs are, and GitHub/Google Code Search probably never will (at least not with confidence), this is an advantage that code navigation within rustdoc's source view has over code navigation within GitHub/Google Code Search.

Here's a concrete example (unfortunately it will only work if you are in GitHub's Code Search preview):

Visit https://cs.github.com/servo/servo/blob/9901cb399320e67c3ad2d62bf3a51d6ca993667b/components/net/decoder.rs#L102, part of the servo source code. It defines a gzip decoder:


/// A gzip decoder that reads from a `libflate::gzip::Decoder` into a `BytesMut` and emits the results
/// as a `Bytes`.
struct Gzip {
    inner: Box<gzip::Decoder<Peeked<ReadableChunks<Body>>>>,
    buf: BytesMut,
    reader: Arc<Mutex<ReadableChunks<Body>>>,
}

How do I get to the definition of gzip::Decoder? If I click on it, GitHub Code Search tells me the definition is in the same file, line 69. Well, that's just plain wrong - that's another struct that happens to share the name Decoder.

image

But even setting aside that mistake - could GitHub Code Search find the source for libflate::non_blocking::gzip::Decoder and link me there? I think not; it's in a different repository, and I strongly suspect GitHub Code Search does not have any notion of cross-repository dependencies that follows Rust's package graph.

In practice, hard to test because even clicking the link to search "all repos" fails to find libflate::non_blocking::gzip::Decoder, presumably because it is not yet indexed.

@jsha
Copy link

jsha commented Oct 22, 2022

I described the planned trajectory for this feature, ie removing the current --generate-link-to-definition option, enable this feature by default and adding a new option which would allow to disable this feature (named --disable-jump-to-definition).

This is great! Thanks for clarifying that. Now that that is clear, it demonstrates another requirement: Because we plan to turn it on by default, we have a requirement that this feature cannot break doc building for any crate where doc building works today. This is why the question of functions that fail type checking is so important, because it seems to have an impact on whether docs will successfully build.

Cross-crate links

Some issues that can appear when generate cross-crate links is to link to non-existent locations.

This is one of the problems with cross-crate links, but not the only one. Here's how I would put it:

Most Rust libraries don't specify an exact version for their dependencies, but a semver version requirement in Cargo.toml (see Cargo FAQ about libraries and Cargo.lock).

That means the library could be built against any one of several versions of its dependencies, with the selection done when an a binary crate is built for the first time or cargo update is run for a binary crate.

If you're looking at the source for a library crate, and click through into the source code for one of its dependencies, which version of that dependency should you wind up at?

  1. The minimum version allowed by the semver version requirements.
  2. The current maximum version allowed by the semver version requirements.
  3. At the time the library's docs were built, the version that happened to be the maximum allowed by the semver version requirements.

Ideally we'd like (2). Currently, for pub use statements, we use (3) on docs.rs. Mostly this isn't a problem, because pub use statements link to documentation of elements of the public API, which is relatively stale. However, it may be more of a problem when jump to definition is enabled by default, because we will be generating links to implementation, which can and does change between versions.

In particular, one of the use cases for this feature is bug finding. Given some misbehavior in your program, you might go looking through the documentation of your dependencies, and then the source of those dependencies, to find the bug. Inevitably, the versions linked from the public documentation won't exactly match the versions selected by your binary. Given that, is it better to show the reader an older version of a dependency or a newer version? Reading the older version might cause the bug hunter to spot non-relevant bugs that have been fixed in newer versions, and report them. Reading the newer version might cause the bug hunter to incorrectly conclude there was no bug when in fact a bug exists in the version they are using, but was fixed in the latest version. On balance, I think we should still aim to choose the latest version.

There's also the problem of what to do for self-hosted docs, where there is no versioned URL scheme like docs.rs. If a set of self-hosted docs has packages foo and bar, and foo's source references bar's source by line number, then we have a hard dependency that foo and bar were built at the same time and foo selected the same version of bar that is being displayed. It would be better if self-hosted docs could be more loosely coupled with respect to version.

About the cross-crate links solution you suggested with ?source=true
It wouldn't work for private items as they don't have a public documentation page

Except source code from crate foo cannot refer to private items in crate bar, so this problem doesn't come up. There's a smaller version of this problem: if crate foo uses a #[doc(hidden)] item from crate bar, there would be no documentation page for it. But I think that is an okay tradeoff.


### Jump to documentation

All items with a documentation page should have a link to it (like methods, trait/impl associated items, etc). However, fields, variants and impl blocks should not (explications below).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
All items with a documentation page should have a link to it (like methods, trait/impl associated items, etc). However, fields, variants and impl blocks should not (explications below).
Each public [item](https://doc.rust-lang.org/reference/items.html) should have a link to its documentation (like methods, trait/impl associated items, etc). Fields, variants and impl blocks should not.

Fixing the plurality agreement: "all" vs "it". Also I reorganized to make the antecedent of "it" clearer. And used the defined term "item" for better clarity, since I think "item" is identical to what you are trying to say here. Am I correct in thinking that?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it's much better, thanks!

@@ -324,6 +334,8 @@ External library (like `cpan` or `tree-sitter`): They would work to generate lin

[SourceGraph](https://sourcegraph.com/github.com/rust-lang/rustc-demangle@2811a1ad6f7c8bead2ef3671e4fdc10de1553e96/-/blob/src/v0.rs?L65): SourceGraph uses [Language Server Index Format (LSIF)](https://github.com/Microsoft/language-server-protocol/blob/main/indexFormat/specification.md) to extract info from a language server and make it available in a static format. And rust-analyzer bindings for LSIF already exist. The problem is that it would require a live server to work, which is a blocker to be integrated into rustdoc as it must work serverless.

Another argument in favour of not relying on an external tool is to have support for cross-crate links as it is a very important part of any rust project. For example, if you're browsing a crate "A" and you encounter an item coming from crate "B" (because crates often delegate significant portions of their implementations to sub-crates, for compile performance and other reasons), it seems obvious that it should be able to link to it. This means any tool that does a good job of navigating Rust source code needs to be aware of that code's dependencies, and how to link directly to source code pages within those dependencies. For instance, `askama_shared/filters/json.rs.html` needs to link to `serde::Serialize`. Rustdoc can do that because it builds a copy of all dependencies when building documentation, and it can use reliable mapping from (crate, version) to a URL pointing to the source code. GitHub or Google Code Search can't really do this mppaing while staying within their own code navigation system, because there is not a reliable mapping from (crate, version) to (repository URL, tag).
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point of this argument is: because rustdoc already knows where the dep docs are, and GitHub/Google Code Search probably never will (at least not with confidence), this is an advantage that code navigation within rustdoc's source view has over code navigation within GitHub/Google Code Search.

Here's a concrete example (unfortunately it will only work if you are in GitHub's Code Search preview):

Visit https://cs.github.com/servo/servo/blob/9901cb399320e67c3ad2d62bf3a51d6ca993667b/components/net/decoder.rs#L102, part of the servo source code. It defines a gzip decoder:


/// A gzip decoder that reads from a `libflate::gzip::Decoder` into a `BytesMut` and emits the results
/// as a `Bytes`.
struct Gzip {
    inner: Box<gzip::Decoder<Peeked<ReadableChunks<Body>>>>,
    buf: BytesMut,
    reader: Arc<Mutex<ReadableChunks<Body>>>,
}

How do I get to the definition of gzip::Decoder? If I click on it, GitHub Code Search tells me the definition is in the same file, line 69. Well, that's just plain wrong - that's another struct that happens to share the name Decoder.

image

But even setting aside that mistake - could GitHub Code Search find the source for libflate::non_blocking::gzip::Decoder and link me there? I think not; it's in a different repository, and I strongly suspect GitHub Code Search does not have any notion of cross-repository dependencies that follows Rust's package graph.

In practice, hard to test because even clicking the link to search "all repos" fails to find libflate::non_blocking::gzip::Decoder, presumably because it is not yet indexed.

@GuillaumeGomez
Copy link
Owner Author

GuillaumeGomez commented Oct 22, 2022

This is great! Thanks for clarifying that. Now that that is clear, it demonstrates another requirement: Because we plan to turn it on by default, we have a requirement that this feature cannot break doc building for any crate where doc building works today. This is why the question of functions that fail type checking is so important, because it seems to have an impact on whether docs will successfully build.

No it doesn't impact it. It displays compiler errors but as I mentioned, it doesn't impact the documentation process. It's just that this ident won't have a link generated. Maybe my wording wasn't clear enough, I'll try to improve it.

Most Rust libraries don't specify an exact version for their dependencies, but a semver version requirement in Cargo.toml (see Cargo FAQ about libraries and Cargo.lock).

That means the library could be built against any one of several versions of its dependencies, with the selection done when an a binary crate is built for the first time or cargo update is run for a binary crate.

If you're looking at the source for a library crate, and click through into the source code for one of its dependencies, which version of that dependency should you wind up at?

1. The minimum version allowed by the semver version requirements.

2. The current maximum version allowed by the semver version requirements.

3. At the time the library's docs were built, the version that happened to be the maximum allowed by the semver version requirements.

Ideally we'd like (2). Currently, for pub use statements, we use (3) on docs.rs. Mostly this isn't a problem, because pub use statements link to documentation of elements of the public API, which is relatively stale. However, it may be more of a problem when jump to definition is enabled by default, because we will be generating links to implementation, which can and does change between versions.

This is where I'm a bit lost at understanding why it's an issue: the dependency's doc version on docs.rs will remain forever so why is it a problem? Also, like I mentioned, trying to keep track of a private item across versions would be a nightmare. I think only targeting one specific version solves all these problems and doesn't require adding new features.

In particular, one of the use cases for this feature is bug finding. Given some misbehavior in your program, you might go looking through the documentation of your dependencies, and then the source of those dependencies, to find the bug. Inevitably, the versions linked from the public documentation won't exactly match the versions selected by your binary. Given that, is it better to show the reader an older version of a dependency or a newer version? Reading the older version might cause the bug hunter to spot non-relevant bugs that have been fixed in newer versions, and report them. Reading the newer version might cause the bug hunter to incorrectly conclude there was no bug when in fact a bug exists in the version they are using, but was fixed in the latest version. On balance, I think we should still aim to choose the latest version.

But in this case, won't the user look directly at the dependency isn't of going through its own crate? On docs.rs, as you said, it'll target the last version currently published. What we can do though is to target the last version for doc links because they will always work contrary to source links.

There's also the problem of what to do for self-hosted docs, where there is no versioned URL scheme like docs.rs. If a set of self-hosted docs has packages foo and bar, and foo's source references bar's source by line number, then we have a hard dependency that foo and bar were built at the same time and foo selected the same version of bar that is being displayed. It would be better if self-hosted docs could be more loosely coupled with respect to version.

That seems to be a completely different issue that isn't only related to "jump to definition" but to rustdoc's URL scheme in general. "jump to definition" shares the same URL scheme (and the same code to generate URLs) with all other links so if we want to add such a feature in the future, it can totally be done.

Except source code from crate foo cannot refer to private items in crate bar, so this problem doesn't come up. There's a smaller version of this problem: if crate foo uses a #[doc(hidden)] item from crate bar, there would be no documentation page for it. But I think that is an okay tradeoff.

Oh you're right. Private items can't be used downstream. I'm still not convinced because I think we're trying to fix another problem but as I see it, it can be achieved the same way as changing the URL scheme.

Copy link

@fmease fmease left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added relatively minor suggestions. I haven't read all 139 GitHub comments but several of them, catching up a bit.

text/000-rustdoc-jump-to-definition.md Show resolved Hide resolved
text/000-rustdoc-jump-to-definition.md Outdated Show resolved Hide resolved
text/000-rustdoc-jump-to-definition.md Outdated Show resolved Hide resolved
text/000-rustdoc-jump-to-definition.md Outdated Show resolved Hide resolved

The idea here would be to follow the current URL strategy in use for cross-crate items: generate a URL relative to the root path. If `--extern-html-root-url` isn't used, then the root-path is the one provided to rustdoc where to build documentation.

This is somewhat the same answer for `cargo --no-deps`: unless rustdoc actually checks if the dependencies' locations are empty or not, there is no way for rustdoc to know if they're present. As such, it should simply assume that they are present and generate links for them as it does for the foreign items in the documentation pages.
Copy link

@fmease fmease Oct 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand the whole cross-crate links section correctly, you seem to be saying that it is not possible at all for rustdoc to detect if a to-be-generated link would point to a nonexistent location.

However, from the limited experience I've gathered from writing several A-cross-crate-reexport PRs, I've learned that rustdoc does know if a cross-crate link wouldn't resolve. Maybe that's not always the case but at least it is for code using html::format::href().

So from what I know, rustdoc should be able to “gray out” / not generate (most? all?) links that would otherwise be dead. Specifically, the following paragraph should be updated to mention this:

As such, it should simply assume that they are present and generate links for them as it does for the foreign items in the documentation pages.

Or am I missing something major?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to mean not the cross-crate item itself but the actual HTML file destination. As I mentioned, you can run cargo --no-deps and in this case, only your crate will be documented and not its dependencies. As such, any link to any dependency will result in a 404. Does it clear the confusion? How could I rewrite it to make it less confusing?

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried to improve the wording and explanations of this section.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants