Skip to content

Commit

Permalink
Unrolled build for rust-lang#120736
Browse files Browse the repository at this point in the history
Rollup merge of rust-lang#120736 - notriddle:notriddle/toc, r=t-rustdoc

rustdoc: add header map to the table of contents

## Summary

Add header sections to the sidebar TOC.

### Preview

![image](https://github.com/user-attachments/assets/eae4df02-86aa-4df4-8c61-a95685cd8829)

* http://notriddle.com/rustdoc-html-demo-9/toc/rust/std/index.html
* http://notriddle.com/rustdoc-html-demo-9/toc/rust-derive-builder/derive_builder/index.html

## Motivation

Some pages are very wordy, like these.

| crate | word count |
|--|--|
| [std::option](https://doc.rust-lang.org/stable/std/option/index.html) | 2,138
| [derive_builder](https://docs.rs/derive_builder/0.13.0/derive_builder/index.html) | 2,403
| [tracing](https://docs.rs/tracing/0.1.40/tracing/index.html) | 3,912
| [regex](https://docs.rs/regex/1.10.3/regex/index.html) | 8,412

This kind of very long document is more navigable with a table of contents, like Wikipedia's or the one [GitHub recently added](https://github.blog/changelog/2021-04-13-table-of-contents-support-in-markdown-files/) for READMEs.

In fact, the use case is so compelling, that it's been requested multiple times and implemented in an extension:

* rust-lang#80858
* rust-lang#28056
* rust-lang#14475
* https://rust.extension.sh/#show-table-of-content

(Some of these issues ask for more than this, so don’t close them.)

It's also been implemented by hand in some crates, because the author really thought it was needed. Protip: for a more exhaustive list, run [`site:docs.rs table of contents`](https://duckduckgo.com/?t=ffab&q=site%3Adocs.rs+table+of+contents&ia=web), though some of them are false positives.

* https://docs.rs/figment/0.10.14/figment/index.html#table-of-contents
* https://docs.rs/csv/1.3.0/csv/tutorial/index.html#table-of-contents
* https://docs.rs/axum/0.7.4/axum/response/index.html#table-of-contents
* https://docs.rs/regex-automata/0.4.5/regex_automata/index.html#table-of-contents

Unfortunately for these hand-built ToCs, because they're just part of the docs, there's no consistent way to turn them off if the reader doesn't want them. It's also more complicated to ensure they stay in sync with the docs they're supposed to describe, and they don't stay with you when you scroll like Wikipedia's [does now](https://uxdesign.cc/design-notes-on-the-2023-wikipedia-redesign-d6573b9af28d).

## Guide-level explanation

When writing docs for a top-level item, the first and second level of headers will be shown in an outline in the sidebar. In this context, "top level" means "not associated".

This means, if you're writing very long guides or explanations, and you want it to have a table of contents in the sidebar for its headings, the ideal place to attach it is usually the *module* or *crate*, because this page has fewer other things on it (and is the ideal place to describe "cross-cutting concerns" for its child items).

If you're reading documentation, and want to get rid of the table of contents, open the ![image](https://github.com/rust-lang/rust/assets/1593513/2ad82466-5fe3-4684-b1c2-6be4c99a8666) Settings panel and checkmark "Hide table of contents."

## Reference-level explanation

Top-level items have an outline generated. This works for potentially-malformed header trees by pairing a header with the nearest header with a higher level. For example:

```markdown
## A
# B
# C
## D
## E
```

A, B, and C are all siblings, and D and E are children of C.

Rustdoc only presents two layers of tree, but it tracks up to the full depth of 6 while preparing it.

That means that these two doc comment both generate the same outline:

```rust
/// # First
/// ## Second
struct One;
/// ## First
/// ### Second
struct Two;
```

## Drawbacks

The biggest drawback is adding more stuff to the sidebar.

My crawl through docs.rs shows this to, surprisingly, be less of a problem than I thought. The manually-built tables of contents, and the pages with dozens of headers, usually seem to be modules or crates, not types (where extreme scrolling would become a problem, since they already have methods to deal with).

The best example of a type with many headers is [vec::Vec](https://doc.rust-lang.org/1.75.0/std/vec/struct.Vec.html), which still only has five headers, not dozens like [axum::extract](https://docs.rs/axum/0.7.4/axum/extract/index.html).

## Rationale and alternatives

### Why in the existing sidebar?

The method links and the top-doc header links have more in common with each other than either of them do with the "In [parent module]" links, and should go together.

### Why limited to two levels?

The sidebar is pretty narrow, and I don't want too much space used by indentation. Making the sidebar wider, while it has some upsides, also takes up more space on middling-sized screens or tiled WMs.

### Why not line wrap?

That behaves strangely when resizing.

## Prior art

### Doc generators that have TOC for headers

https://hexdocs.pm/phoenix/Phoenix.Controller.html is very close, in the sense that it also has header sections directly alongside functions and types.

Another example, referenced as part of the [early sidebar discussion](rust-lang#37856) that added methods, Ruby will show a table of contents in the sidebar (for example, on the [ARGF](https://docs.ruby-lang.org/en/master/ARGF.html) class). According to their changelog, [they added it in 2013](https://github.com/ruby/rdoc/blob/06137bde8ccc48cd502bc28178bcd8f2dfe37624/History.rdoc#400--2013-02-24-).

Haskell seems to mix text and functions even more freely than Elixir. For example, this [Naming conventions](https://hackage.haskell.org/package/base-4.19.0.0/docs/Control-Monad.html#g:3) is plain text, and is immediately followed by functions. And the [Pandoc top level](https://hackage.haskell.org/package/pandoc-3.1.11.1/docs/Text-Pandoc.html) has items split up by function, rather than by kind. Their TOC matches exactly with the contents of the page.

### Doc generators that don't have header TOC, but still have headers

Elm, interestingly enough, seems to have the same setup that Rust used to have: sibling navigation between modules, and no index within a single page. [They keep Haskell's habit of named sections with machine-generated type signatures](https://package.elm-lang.org/packages/elm/browser/latest/Browser-Dom), though.

[PHP](https://www.php.net/manual/en/book.datetime.php), like elm, also has a right-hand sidebar with sibling navigation. However, PHP has a single page for a single method, unlike Rust's page for an entire "class." So even though these pages have headers, it's never more than ten at most. And when they have guides, those guides are also multi-page.

## Unresolved questions

* Writing recommendations for anyone who wants to take advantage of this.
* Right now, it does not line wrap. That might be a bad idea: a lot of these are getting truncated.
* Split sidebars, which I [tried implementing](https://rust-lang.zulipchat.com/#narrow/stream/266220-t-rustdoc/topic/Table.20of.20contents), are not required. The TOC can be turned off, if it's really a problem. Implemented in rust-lang#120818, but needs more, separate, discussion.

## Future possibilities

I would like to do a better job of distinguishing global navigation from local navigation. Rustdoc has a pretty reasonable information architecture, if only we did a better job of communicating it.

This PR aims, mostly, to help doc authors help their users by writing docs that can be more effectively skimmed. But it doesn't do anything to make it easier to tell the TOC and the Module Nav apart.
  • Loading branch information
rust-timer authored Sep 5, 2024
2 parents 009e738 + bead042 commit e0decce
Show file tree
Hide file tree
Showing 24 changed files with 539 additions and 153 deletions.
2 changes: 1 addition & 1 deletion src/doc/not_found.md
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@

<!-- Completely hide the TOC and the section numbers -->
<style type="text/css">
#TOC { display: none; }
#rustdoc-toc { display: none; }
.header-section-number { display: none; }
li {list-style-type: none; }
#search-input {
Expand Down
6 changes: 0 additions & 6 deletions src/librustdoc/clean/types.rs
Original file line number Diff line number Diff line change
Expand Up @@ -512,9 +512,6 @@ impl Item {
pub(crate) fn is_mod(&self) -> bool {
self.type_() == ItemType::Module
}
pub(crate) fn is_trait(&self) -> bool {
self.type_() == ItemType::Trait
}
pub(crate) fn is_struct(&self) -> bool {
self.type_() == ItemType::Struct
}
Expand Down Expand Up @@ -542,9 +539,6 @@ impl Item {
pub(crate) fn is_ty_method(&self) -> bool {
self.type_() == ItemType::TyMethod
}
pub(crate) fn is_type_alias(&self) -> bool {
self.type_() == ItemType::TypeAlias
}
pub(crate) fn is_primitive(&self) -> bool {
self.type_() == ItemType::Primitive
}
Expand Down
73 changes: 62 additions & 11 deletions src/librustdoc/html/markdown.rs
Original file line number Diff line number Diff line change
Expand Up @@ -51,12 +51,12 @@ use tracing::{debug, trace};
use crate::clean::RenderedLink;
use crate::doctest;
use crate::doctest::GlobalTestOptions;
use crate::html::escape::Escape;
use crate::html::escape::{Escape, EscapeBodyText};
use crate::html::format::Buffer;
use crate::html::highlight;
use crate::html::length_limit::HtmlWithLimit;
use crate::html::render::small_url_encode;
use crate::html::toc::TocBuilder;
use crate::html::toc::{Toc, TocBuilder};

#[cfg(test)]
mod tests;
Expand Down Expand Up @@ -102,6 +102,7 @@ pub struct Markdown<'a> {
/// A struct like `Markdown` that renders the markdown with a table of contents.
pub(crate) struct MarkdownWithToc<'a> {
pub(crate) content: &'a str,
pub(crate) links: &'a [RenderedLink],
pub(crate) ids: &'a mut IdMap,
pub(crate) error_codes: ErrorCodes,
pub(crate) edition: Edition,
Expand Down Expand Up @@ -533,9 +534,11 @@ impl<'a, 'b, 'ids, I: Iterator<Item = SpannedEvent<'a>>> Iterator
let id = self.id_map.derive(id);

if let Some(ref mut builder) = self.toc {
let mut text_header = String::new();
plain_text_from_events(self.buf.iter().map(|(ev, _)| ev.clone()), &mut text_header);
let mut html_header = String::new();
html::push_html(&mut html_header, self.buf.iter().map(|(ev, _)| ev.clone()));
let sec = builder.push(level as u32, html_header, id.clone());
html_text_from_events(self.buf.iter().map(|(ev, _)| ev.clone()), &mut html_header);
let sec = builder.push(level as u32, text_header, html_header, id.clone());
self.buf.push_front((Event::Html(format!("{sec} ").into()), 0..0));
}

Expand Down Expand Up @@ -1412,10 +1415,23 @@ impl Markdown<'_> {
}

impl MarkdownWithToc<'_> {
pub(crate) fn into_string(self) -> String {
let MarkdownWithToc { content: md, ids, error_codes: codes, edition, playground } = self;
pub(crate) fn into_parts(self) -> (Toc, String) {
let MarkdownWithToc { content: md, links, ids, error_codes: codes, edition, playground } =
self;

let p = Parser::new_ext(md, main_body_opts()).into_offset_iter();
// This is actually common enough to special-case
if md.is_empty() {
return (Toc { entries: Vec::new() }, String::new());
}
let mut replacer = |broken_link: BrokenLink<'_>| {
links
.iter()
.find(|link| &*link.original_text == &*broken_link.reference)
.map(|link| (link.href.as_str().into(), link.tooltip.as_str().into()))
};

let p = Parser::new_with_broken_link_callback(md, main_body_opts(), Some(&mut replacer));
let p = p.into_offset_iter();

let mut s = String::with_capacity(md.len() * 3 / 2);

Expand All @@ -1429,7 +1445,11 @@ impl MarkdownWithToc<'_> {
html::push_html(&mut s, p);
}

format!("<nav id=\"TOC\">{toc}</nav>{s}", toc = toc.into_toc().print())
(toc.into_toc(), s)
}
pub(crate) fn into_string(self) -> String {
let (toc, s) = self.into_parts();
format!("<nav id=\"rustdoc\">{toc}</nav>{s}", toc = toc.print())
}
}

Expand Down Expand Up @@ -1608,7 +1628,16 @@ pub(crate) fn plain_text_summary(md: &str, link_names: &[RenderedLink]) -> Strin

let p = Parser::new_with_broken_link_callback(md, summary_opts(), Some(&mut replacer));

for event in p {
plain_text_from_events(p, &mut s);

s
}

pub(crate) fn plain_text_from_events<'a>(
events: impl Iterator<Item = pulldown_cmark::Event<'a>>,
s: &mut String,
) {
for event in events {
match &event {
Event::Text(text) => s.push_str(text),
Event::Code(code) => {
Expand All @@ -1623,8 +1652,29 @@ pub(crate) fn plain_text_summary(md: &str, link_names: &[RenderedLink]) -> Strin
_ => (),
}
}
}

s
pub(crate) fn html_text_from_events<'a>(
events: impl Iterator<Item = pulldown_cmark::Event<'a>>,
s: &mut String,
) {
for event in events {
match &event {
Event::Text(text) => {
write!(s, "{}", EscapeBodyText(text)).expect("string alloc infallible")
}
Event::Code(code) => {
s.push_str("<code>");
write!(s, "{}", EscapeBodyText(code)).expect("string alloc infallible");
s.push_str("</code>");
}
Event::HardBreak | Event::SoftBreak => s.push(' '),
Event::Start(Tag::CodeBlock(..)) => break,
Event::End(TagEnd::Paragraph) => break,
Event::End(TagEnd::Heading(..)) => break,
_ => (),
}
}
}

#[derive(Debug)]
Expand Down Expand Up @@ -1975,7 +2025,8 @@ fn init_id_map() -> FxHashMap<Cow<'static, str>, usize> {
map.insert("default-settings".into(), 1);
map.insert("sidebar-vars".into(), 1);
map.insert("copy-path".into(), 1);
map.insert("TOC".into(), 1);
map.insert("rustdoc-toc".into(), 1);
map.insert("rustdoc-modnav".into(), 1);
// This is the list of IDs used by rustdoc sections (but still generated by
// rustdoc).
map.insert("fields".into(), 1);
Expand Down
6 changes: 4 additions & 2 deletions src/librustdoc/html/render/context.rs
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ use rustc_span::{sym, FileName, Symbol};
use tracing::info;

use super::print_item::{full_path, item_path, print_item};
use super::sidebar::{print_sidebar, sidebar_module_like, Sidebar};
use super::sidebar::{print_sidebar, sidebar_module_like, ModuleLike, Sidebar};
use super::write_shared::write_shared;
use super::{collect_spans_and_sources, scrape_examples_help, AllTypes, LinkFromSrc, StylePath};
use crate::clean::types::ExternalLocation;
Expand Down Expand Up @@ -617,12 +617,14 @@ impl<'tcx> FormatRenderer<'tcx> for Context<'tcx> {
let all = shared.all.replace(AllTypes::new());
let mut sidebar = Buffer::html();

let blocks = sidebar_module_like(all.item_sections());
// all.html is not customizable, so a blank id map is fine
let blocks = sidebar_module_like(all.item_sections(), &mut IdMap::new(), ModuleLike::Crate);
let bar = Sidebar {
title_prefix: "",
title: "",
is_crate: false,
is_mod: false,
parent_is_crate: false,
blocks: vec![blocks],
path: String::new(),
};
Expand Down
Loading

0 comments on commit e0decce

Please sign in to comment.