Skip to content

Commit

Permalink
Unrolled build for rust-lang#118194
Browse files Browse the repository at this point in the history
Rollup merge of rust-lang#118194 - notriddle:notriddle/tuple-unit, r=GuillaumeGomez

rustdoc: search for tuples and unit by type with `()`

This feature extends rustdoc to support the syntax that most users will naturally attempt to use to search for tuples. Part of rust-lang#60485

Function signature searches already support tuples and unit. The explicit name `primitive:tuple` and `primitive:unit` can be used to match a tuple or unit, while `()` will match either one. It also follows the direction set by the actual language for parens as a group, so `(u8,)` will only match a tuple, while `(u8)` will match a plain, unwrapped byte—thanks to loose search semantics, it will also match the tuple.

## Preview

* [`option<t>, option<u> -> (t, u)`](<https://notriddle.com/rustdoc-html-demo-5/tuple-unit/std/index.html?search=option%3Ct%3E%2C option%3Cu%3E -%3E (t%2C u)>)
* [`[t] -> (t,)`](<https://notriddle.com/rustdoc-html-demo-5/tuple-unit/std/index.html?search=[t] -%3E (t%2C)>)
* [`(ipaddr,) -> socketaddr`](<https://notriddle.com/rustdoc-html-demo-5/tuple-unit/std/index.html?search=(ipaddr%2C) -%3E socketaddr>)

## Motivation

When type-based search was first landed, it was directly [described as incomplete][a comment].

[a comment]: rust-lang#23289 (comment)

Filling out the missing functionality is going to mean adding support for more of Rust's [type expression] syntax, such as tuples (in this PR), references, raw pointers, function pointers, and closures.

[type expression]: https://doc.rust-lang.org/reference/types.html#type-expressions

There does seem to be demand for this sort of thing, such as [this Discord message](https://discord.com/channels/442252698964721669/443150878111694848/1042145740065099796) expressing regret at rustdoc not supporting tuples in search queries.

## Reference description (from the Rustdoc book)

<table>
<thead>
  <tr>
    <th>Shorthand</th>
    <th>Explicit names</th>
  </tr>
</thead>
<tbody>
  <tr><td colspan="2">Before this PR</td></tr>
  <tr>
    <td><code>[]</code></td>
    <td><code>primitive:slice</code> and/or <code>primitive:array</code></td>
  </tr>
  <tr>
    <td><code>[T]</code></td>
    <td><code>primitive:slice&lt;T&gt;</code> and/or <code>primitive:array&lt;T&gt;</code></td>
  </tr>
  <tr>
    <td><code>!</code></td>
    <td><code>primitive:never</code></td>
  </tr>
  <tr><td colspan="2">After this PR</td></tr>
  <tr>
    <td><code>()</code></td>
    <td><code>primitive:unit</code> and/or <code>primitive:tuple</code></td>
  </tr>
  <tr>
    <td><code>(T)</code></td>
    <td><code>T</code></td>
  </tr>
  <tr>
    <td><code>(T,)</code></td>
    <td><code>primitive:tuple&lt;T&gt;</code></td>
  </tr>
</tbody>
</table>

A single type expression wrapped in parens is the same as that type expression, since parens act as the grouping operator. If they're empty, though, they will match both `unit` and `tuple`, and if there's more than one type (or a trailing or leading comma) it is the same as `primitive:tuple<...>`.

However, since items can be left out of the query, `(T)` will still return results for types that match tuples, even though it also matches the type on its own. That is, `(u32)` matches `(u32,)` for the exact same reason that it also matches `Result<u32, Error>`.

## Future direction

The [type expression grammar](https://doc.rust-lang.org/reference/types.html#type-expressions) from the Reference is given below:

<pre><code>Syntax
    Type :
        TypeNoBounds
        | <a href="https://doc.rust-lang.org/reference/types/impl-trait.html">ImplTraitType</a>
        | <a href="https://doc.rust-lang.org/reference/types/trait-object.html">TraitObjectType</a>
<br>
    TypeNoBounds :
        <a href="https://doc.rust-lang.org/reference/types.html#parenthesized-types">ParenthesizedType</a>
        | <a href="https://doc.rust-lang.org/reference/types/impl-trait.html">ImplTraitTypeOneBound</a>
        | <a href="https://doc.rust-lang.org/reference/types/trait-object.html">TraitObjectTypeOneBound</a>
        | <a href="https://doc.rust-lang.org/reference/paths.html#paths-in-types">TypePath</a>
        | <a href="https://doc.rust-lang.org/reference/types/tuple.html#tuple-types">TupleType</a>
        | <a href="https://doc.rust-lang.org/reference/types/never.html">NeverType</a>
        | <a href="https://doc.rust-lang.org/reference/types/pointer.html#raw-pointers-const-and-mut">RawPointerType</a>
        | <a href="https://doc.rust-lang.org/reference/types/pointer.html#shared-references-">ReferenceType</a>
        | <a href="https://doc.rust-lang.org/reference/types/array.html">ArrayType</a>
        | <a href="https://doc.rust-lang.org/reference/types/slice.html">SliceType</a>
        | <a href="https://doc.rust-lang.org/reference/types/inferred.html">InferredType</a>
        | <a href="https://doc.rust-lang.org/reference/paths.html#qualified-paths">QualifiedPathInType</a>
        | <a href="https://doc.rust-lang.org/reference/types/function-pointer.html">BareFunctionType</a>
        | <a href="https://doc.rust-lang.org/reference/macros.html#macro-invocation">MacroInvocation</a>
</code></pre>

ImplTraitType and TraitObjectType (and ImplTraitTypeOneBound and TraitObjectTypeOneBound) are not yet implemented. They would mostly desugar to `trait:`, similarly to how `!` desugars to `primitive:never`.

ParenthesizedType and TuplePath are added in this PR.

TypePath is already implemented (except const generics, which is not planned, and function-like trait syntax, which is planned as part of closure support).

NeverType is already implemented.

RawPointerType and ReferenceType require parsing and fixes to the search index to store this information, but otherwise their behavior seems simple enough. Just like tuples and slices, `&T` would be equivalent to `primitive:reference<T>`, `&mut T` would be equivalent to `primitive:reference<keyword:mut, T>`, `*T` would be equivalent to `primitive:pointer<T>`, `*mut T` would be equivalent to `primitive:pointer<keyword:mut, T>`, and `*const T` would be equivalent to `primitive:pointer<keyword:const, T>`. Lifetime generics support is not planned, because lifetime subtyping seems too complicated.

ArrayType is subsumed by SliceType right now. Implementing const generics is not planned, because it seems like it would require a lot of implementation complexity for not much gain.

InferredType isn't really covered right now. Its semantics in a search context are not obvious.

QualifiedPathInType is not implemented, and it is not planned. I would need a use case to justify it, and act as a guide for what the exact semantics should be.

BareFunctionType is not implemented. Along with function-like trait syntax, which is formally considered a TypePath, it's the biggest missing feature to be able to do structured searches over generic APIs like `Option`.

MacroInvocation is not parsed (macro names are, but they don't mean the same thing here at all). Those are gone by the time Rustdoc sees the source code.
  • Loading branch information
rust-timer authored Jan 6, 2024
2 parents 9212108 + e74339b commit 516bbf2
Show file tree
Hide file tree
Showing 9 changed files with 596 additions and 54 deletions.
47 changes: 36 additions & 11 deletions src/doc/rustdoc/src/read-documentation/search.md
Original file line number Diff line number Diff line change
Expand Up @@ -147,15 +147,38 @@ will match these queries:
* `Read -> Result<Vec<u8>, Error>`
* `Read -> Result<Error, Vec>`
* `Read -> Result<Vec<u8>>`
* `Read -> u8`

But it *does not* match `Result<Vec, u8>` or `Result<u8<Vec>>`.

Function signature searches also support arrays and slices. The explicit name
`primitive:slice<u8>` and `primitive:array<u8>` can be used to match a slice
or array of bytes, while square brackets `[u8]` will match either one. Empty
square brackets, `[]`, will match any slice or array regardless of what
it contains, while a slice with a type parameter, like `[T]`, will only match
functions that actually operate on generic slices.
### Primitives with Special Syntax

| Shorthand | Explicit names |
| --------- | ------------------------------------------------ |
| `[]` | `primitive:slice` and/or `primitive:array` |
| `[T]` | `primitive:slice<T>` and/or `primitive:array<T>` |
| `()` | `primitive:unit` and/or `primitive:tuple` |
| `(T)` | `T` |
| `(T,)` | `primitive:tuple<T>` |
| `!` | `primitive:never` |

When searching for `[]`, Rustdoc will return search results with either slices
or arrays. If you know which one you want, you can force it to return results
for `primitive:slice` or `primitive:array` using the explicit name syntax.
Empty square brackets, `[]`, will match any slice or array regardless of what
it contains, or an item type can be provided, such as `[u8]` or `[T]`, to
explicitly find functions that operate on byte slices or generic slices,
respectively.

A single type expression wrapped in parens is the same as that type expression,
since parens act as the grouping operator. If they're empty, though, they will
match both `unit` and `tuple`, and if there's more than one type (or a trailing
or leading comma) it is the same as `primitive:tuple<...>`.

However, since items can be left out of the query, `(T)` will still return
results for types that match tuples, even though it also matches the type on
its own. That is, `(u32)` matches `(u32,)` for the exact same reason that it
also matches `Result<u32, Error>`.

### Limitations and quirks of type-based search

Expand Down Expand Up @@ -188,11 +211,10 @@ Most of these limitations should be addressed in future version of Rustdoc.
that you don't want a type parameter, you can force it to match
something else by giving it a different prefix like `struct:T`.

* It's impossible to search for references, pointers, or tuples. The
* It's impossible to search for references or pointers. The
wrapped types can be searched for, so a function that takes `&File` can
be found with `File`, but you'll get a parse error when typing an `&`
into the search field. Similarly, `Option<(T, U)>` can be matched with
`Option<T, U>`, but `(` will give a parse error.
into the search field.

* Searching for lifetimes is not supported.

Expand All @@ -216,8 +238,9 @@ Item filters can be used in both name-based and type signature-based searches.
```text
ident = *(ALPHA / DIGIT / "_")
path = ident *(DOUBLE-COLON ident) [!]
slice = OPEN-SQUARE-BRACKET [ nonempty-arg-list ] CLOSE-SQUARE-BRACKET
arg = [type-filter *WS COLON *WS] (path [generics] / slice / [!])
slice-like = OPEN-SQUARE-BRACKET [ nonempty-arg-list ] CLOSE-SQUARE-BRACKET
tuple-like = OPEN-PAREN [ nonempty-arg-list ] CLOSE-PAREN
arg = [type-filter *WS COLON *WS] (path [generics] / slice-like / tuple-like / [!])
type-sep = COMMA/WS *(COMMA/WS)
nonempty-arg-list = *(type-sep) arg *(type-sep arg) *(type-sep)
generic-arg-list = *(type-sep) arg [ EQUAL arg ] *(type-sep arg [ EQUAL arg ]) *(type-sep)
Expand Down Expand Up @@ -263,6 +286,8 @@ OPEN-ANGLE-BRACKET = "<"
CLOSE-ANGLE-BRACKET = ">"
OPEN-SQUARE-BRACKET = "["
CLOSE-SQUARE-BRACKET = "]"
OPEN-PAREN = "("
CLOSE-PAREN = ")"
COLON = ":"
DOUBLE-COLON = "::"
QUOTE = %x22
Expand Down
3 changes: 3 additions & 0 deletions src/librustdoc/html/render/search_index.rs
Original file line number Diff line number Diff line change
Expand Up @@ -566,6 +566,9 @@ fn get_index_type_id(
// The type parameters are converted to generics in `simplify_fn_type`
clean::Slice(_) => Some(RenderTypeId::Primitive(clean::PrimitiveType::Slice)),
clean::Array(_, _) => Some(RenderTypeId::Primitive(clean::PrimitiveType::Array)),
clean::Tuple(ref n) if n.is_empty() => {
Some(RenderTypeId::Primitive(clean::PrimitiveType::Unit))
}
clean::Tuple(_) => Some(RenderTypeId::Primitive(clean::PrimitiveType::Tuple)),
clean::QPath(ref data) => {
if data.self_type.is_self_type()
Expand Down
100 changes: 71 additions & 29 deletions src/librustdoc/html/static/js/search.js
Original file line number Diff line number Diff line change
Expand Up @@ -260,6 +260,18 @@ function initSearch(rawSearchIndex) {
* Special type name IDs for searching by both array and slice (`[]` syntax).
*/
let typeNameIdOfArrayOrSlice;
/**
* Special type name IDs for searching by tuple.
*/
let typeNameIdOfTuple;
/**
* Special type name IDs for searching by unit.
*/
let typeNameIdOfUnit;
/**
* Special type name IDs for searching by both tuple and unit (`()` syntax).
*/
let typeNameIdOfTupleOrUnit;

/**
* Add an item to the type Name->ID map, or, if one already exists, use it.
Expand Down Expand Up @@ -295,11 +307,7 @@ function initSearch(rawSearchIndex) {
}

function isEndCharacter(c) {
return "=,>-]".indexOf(c) !== -1;
}

function isErrorCharacter(c) {
return "()".indexOf(c) !== -1;
return "=,>-])".indexOf(c) !== -1;
}

function itemTypeFromName(typename) {
Expand Down Expand Up @@ -585,8 +593,6 @@ function initSearch(rawSearchIndex) {
throw ["Unexpected ", "!", ": it can only be at the end of an ident"];
}
foundExclamation = parserState.pos;
} else if (isErrorCharacter(c)) {
throw ["Unexpected ", c];
} else if (isPathSeparator(c)) {
if (c === ":") {
if (!isPathStart(parserState)) {
Expand Down Expand Up @@ -616,11 +622,14 @@ function initSearch(rawSearchIndex) {
}
} else if (
c === "[" ||
c === "(" ||
isEndCharacter(c) ||
isSpecialStartCharacter(c) ||
isSeparatorCharacter(c)
) {
break;
} else if (parserState.pos > 0) {
throw ["Unexpected ", c, " after ", parserState.userQuery[parserState.pos - 1]];
} else {
throw ["Unexpected ", c];
}
Expand Down Expand Up @@ -661,43 +670,56 @@ function initSearch(rawSearchIndex) {
skipWhitespace(parserState);
let start = parserState.pos;
let end;
if (parserState.userQuery[parserState.pos] === "[") {
if ("[(".indexOf(parserState.userQuery[parserState.pos]) !== -1) {
let endChar = ")";
let name = "()";
let friendlyName = "tuple";

if (parserState.userQuery[parserState.pos] === "[") {
endChar = "]";
name = "[]";
friendlyName = "slice";
}
parserState.pos += 1;
getItemsBefore(query, parserState, generics, "]");
const { foundSeparator } = getItemsBefore(query, parserState, generics, endChar);
const typeFilter = parserState.typeFilter;
const isInBinding = parserState.isInBinding;
if (typeFilter !== null && typeFilter !== "primitive") {
throw [
"Invalid search type: primitive ",
"[]",
name,
" and ",
typeFilter,
" both specified",
];
}
parserState.typeFilter = null;
parserState.isInBinding = null;
parserState.totalElems += 1;
if (isInGenerics) {
parserState.genericsElems += 1;
}
for (const gen of generics) {
if (gen.bindingName !== null) {
throw ["Type parameter ", "=", " cannot be within slice ", "[]"];
throw ["Type parameter ", "=", ` cannot be within ${friendlyName} `, name];
}
}
elems.push({
name: "[]",
id: null,
fullPath: ["[]"],
pathWithoutLast: [],
pathLast: "[]",
normalizedPathLast: "[]",
generics,
typeFilter: "primitive",
bindingName: isInBinding,
bindings: new Map(),
});
if (name === "()" && !foundSeparator && generics.length === 1 && typeFilter === null) {
elems.push(generics[0]);
} else {
parserState.totalElems += 1;
if (isInGenerics) {
parserState.genericsElems += 1;
}
elems.push({
name: name,
id: null,
fullPath: [name],
pathWithoutLast: [],
pathLast: name,
normalizedPathLast: name,
generics,
bindings: new Map(),
typeFilter: "primitive",
bindingName: isInBinding,
});
}
} else {
const isStringElem = parserState.userQuery[start] === "\"";
// We handle the strings on their own mostly to make code easier to follow.
Expand Down Expand Up @@ -770,9 +792,11 @@ function initSearch(rawSearchIndex) {
* @param {Array<QueryElement>} elems - This is where the new {QueryElement} will be added.
* @param {string} endChar - This function will stop when it'll encounter this
* character.
* @returns {{foundSeparator: bool}}
*/
function getItemsBefore(query, parserState, elems, endChar) {
let foundStopChar = true;
let foundSeparator = false;
let start = parserState.pos;

// If this is a generic, keep the outer item's type filter around.
Expand All @@ -786,6 +810,8 @@ function initSearch(rawSearchIndex) {
extra = "<";
} else if (endChar === "]") {
extra = "[";
} else if (endChar === ")") {
extra = "(";
} else if (endChar === "") {
extra = "->";
} else {
Expand All @@ -802,6 +828,7 @@ function initSearch(rawSearchIndex) {
} else if (isSeparatorCharacter(c)) {
parserState.pos += 1;
foundStopChar = true;
foundSeparator = true;
continue;
} else if (c === ":" && isPathStart(parserState)) {
throw ["Unexpected ", "::", ": paths cannot start with ", "::"];
Expand Down Expand Up @@ -879,6 +906,8 @@ function initSearch(rawSearchIndex) {

parserState.typeFilter = oldTypeFilter;
parserState.isInBinding = oldIsInBinding;

return { foundSeparator };
}

/**
Expand Down Expand Up @@ -926,6 +955,8 @@ function initSearch(rawSearchIndex) {
break;
}
throw ["Unexpected ", c, " (did you mean ", "->", "?)"];
} else if (parserState.pos > 0) {
throw ["Unexpected ", c, " after ", parserState.userQuery[parserState.pos - 1]];
}
throw ["Unexpected ", c];
} else if (c === ":" && !isPathStart(parserState)) {
Expand Down Expand Up @@ -1599,6 +1630,11 @@ function initSearch(rawSearchIndex) {
) {
// [] matches primitive:array or primitive:slice
// if it matches, then we're fine, and this is an appropriate match candidate
} else if (queryElem.id === typeNameIdOfTupleOrUnit &&
(fnType.id === typeNameIdOfTuple || fnType.id === typeNameIdOfUnit)
) {
// () matches primitive:tuple or primitive:unit
// if it matches, then we're fine, and this is an appropriate match candidate
} else if (fnType.id !== queryElem.id || queryElem.id === null) {
return false;
}
Expand Down Expand Up @@ -1792,7 +1828,7 @@ function initSearch(rawSearchIndex) {
if (row.id > 0 && elem.id > 0 && elem.pathWithoutLast.length === 0 &&
typePassesFilter(elem.typeFilter, row.ty) && elem.generics.length === 0 &&
// special case
elem.id !== typeNameIdOfArrayOrSlice
elem.id !== typeNameIdOfArrayOrSlice && elem.id !== typeNameIdOfTupleOrUnit
) {
return row.id === elem.id || checkIfInList(
row.generics,
Expand Down Expand Up @@ -2886,12 +2922,15 @@ ${item.displayPath}<span class="${type}">${name}</span>\
*/
function buildFunctionTypeFingerprint(type, output, fps) {
let input = type.id;
// All forms of `[]` get collapsed down to one thing in the bloom filter.
// All forms of `[]`/`()` get collapsed down to one thing in the bloom filter.
// Differentiating between arrays and slices, if the user asks for it, is
// still done in the matching algorithm.
if (input === typeNameIdOfArray || input === typeNameIdOfSlice) {
input = typeNameIdOfArrayOrSlice;
}
if (input === typeNameIdOfTuple || input === typeNameIdOfUnit) {
input = typeNameIdOfTupleOrUnit;
}
// http://burtleburtle.net/bob/hash/integer.html
// ~~ is toInt32. It's used before adding, so
// the number stays in safe integer range.
Expand Down Expand Up @@ -2991,7 +3030,10 @@ ${item.displayPath}<span class="${type}">${name}</span>\
// that can be searched using `[]` syntax.
typeNameIdOfArray = buildTypeMapIndex("array");
typeNameIdOfSlice = buildTypeMapIndex("slice");
typeNameIdOfTuple = buildTypeMapIndex("tuple");
typeNameIdOfUnit = buildTypeMapIndex("unit");
typeNameIdOfArrayOrSlice = buildTypeMapIndex("[]");
typeNameIdOfTupleOrUnit = buildTypeMapIndex("()");

// Function type fingerprints are 128-bit bloom filters that are used to
// estimate the distance between function and query.
Expand Down
17 changes: 4 additions & 13 deletions tests/rustdoc-js-std/parser-errors.js
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@ const PARSED = [
original: "-> *",
returned: [],
userQuery: "-> *",
error: "Unexpected `*`",
error: "Unexpected `*` after ` `",
},
{
query: 'a<"P">',
Expand Down Expand Up @@ -107,23 +107,14 @@ const PARSED = [
userQuery: "a<::a>",
error: "Unexpected `::`: paths cannot start with `::`",
},
{
query: "((a))",
elems: [],
foundElems: 0,
original: "((a))",
returned: [],
userQuery: "((a))",
error: "Unexpected `(`",
},
{
query: "(p -> p",
elems: [],
foundElems: 0,
original: "(p -> p",
returned: [],
userQuery: "(p -> p",
error: "Unexpected `(`",
error: "Unexpected `-` after `(`",
},
{
query: "::a::b",
Expand Down Expand Up @@ -204,7 +195,7 @@ const PARSED = [
original: "a (b:",
returned: [],
userQuery: "a (b:",
error: "Unexpected `(`",
error: "Expected `,`, `:` or `->`, found `(`",
},
{
query: "_:",
Expand Down Expand Up @@ -249,7 +240,7 @@ const PARSED = [
original: "ab'",
returned: [],
userQuery: "ab'",
error: "Unexpected `'`",
error: "Unexpected `'` after `b`",
},
{
query: "a->",
Expand Down
18 changes: 18 additions & 0 deletions tests/rustdoc-js-std/parser-slice-array.js
Original file line number Diff line number Diff line change
Expand Up @@ -266,6 +266,24 @@ const PARSED = [
userQuery: "]",
error: "Unexpected `]`",
},
{
query: '[a<b>',
elems: [],
foundElems: 0,
original: "[a<b>",
returned: [],
userQuery: "[a<b>",
error: "Unclosed `[`",
},
{
query: 'a<b>]',
elems: [],
foundElems: 0,
original: "a<b>]",
returned: [],
userQuery: "a<b>]",
error: "Unexpected `]` after `>`",
},
{
query: 'primitive:[u8]',
elems: [
Expand Down
Loading

0 comments on commit 516bbf2

Please sign in to comment.