Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Guess diagnostics #1941

Closed
wants to merge 8 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
258 changes: 258 additions & 0 deletions text/0000-guesses.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,258 @@
- Feature Name: guess_diagnostic
- Start Date: 2017-02-23
- RFC PR: (leave this empty)
- Rust Issue: (leave this empty)

# Summary
[summary]: #summary

Add a new kind of compiler diagnostic ("guess"), to distinguish "suggestions"
from code snippets that are found by a heuristic or have multiple possible
solutions. Change all current "suggestions" which are not guaranteed to be correct
to "guesses" or make them not emit a suggestion in the corner cases that would
yield incorrect suggestions.

# Motivation
[motivation]: #motivation

TLDR: `rustfix` should be the semantic companion to `rustfmt` (which is purely syntactical) and automatically change code to be more idiomatic.

Clippy (and some compiler builtin lints) produce "suggestions", which are code
snippets that can be copy pasted over a part of the linted code to replace,
enhance or remove the detected suboptimal piece of code. In some cases there is
a clearly correct solution, like removing one ampersand on the rhs of the following
`let` binding (due to it being automatically dereferenced by the compiler anyway):

```rust
let x: &i32 = &&5;
```

This would be a situation where a "suggestion" would be used, since there is no
possible ambiguity or disadvantage to having a tool (e.g. `rustfix`) automatically
remove the ampersand.

When there is no clear solution (e.g. when the user references a type `Iter`,
but there is no such type in the scope) heuristics can supply "guesses" (e.g.
right now, the compiler supplies an unstructured list of types called `Iter`).

# Detailed design
[design]: #detailed-design

## Diagnostics frontend API changes

The compiler diagnostics API is extended with the following methods:

```rust
pub fn span_guesses<I>(&mut self, sp: Span, msg: &str, guesses: I) -> &mut Self where I: IntoIterator<Item = String>;
pub fn span_guess(&mut self, sp: Span, msg: &str, guess: String) -> &mut Self {
self.span_guesses(sp, msg, ::std::iter::once(guess))
}
```

`span_guess` is just a convenience method which forwards to `span_guesses`.

`span_guesses` takes an iterator of guesses to be presented to the user.
The span `sp` is the region of code which will be replaced by one of the code snippets
given by the `guesses` iterator. The span can be zero bytes long, if the text is supposed
to be inserted (e.g. adding a new `use xyz::abc;` at the top of the file) instead of
replacing existing code.

If multispan replacements (replacing multiple separate spans at once, e.g. when modifying the
pattern in a for loop together with the object being iterated over) are desired,
one should modify the `Diagnostic` object manually. This might be inconvenient, but
multispan replacements are very rare (occurring exactly
[twice in clippy](https://github.com/Manishearth/rust-clippy/search?utf8=%E2%9C%93&q=multispan_sugg))
and therefore the API should be figured out once/if they become more common.

## Changes in diagnostic API usage

### Replace `suggestions` with `guesses`

All current calls to `span_suggestion` are replaced with `span_guess` if
their replacement is obtained by a heuristic. A new type of tests is added: "suggestion". Tests in the "suggestion" directory must be failing
with an error (or denied lint) and contain a suggestion (not required to
be emitted by the error or lint failing the test). Then the suggestion is
applied, and the test must now compile and run successfully.

### Change `help` messages containing code snippets to `guesses`

Whenever a `span_help` is passed a snippet or hand built expression,
it is changed into a `span_guess` or `span_guesses` if multiple helps
are generated in some form of a loop. If `span_guesses` is used, all
arbitrary limits on the number of displayed items is removed. This limit
is instead enforced by the command line diagnostic backend.

## JSON diagnostics backend API changes

The JSON object gets a new field `guesses` which is an array of
`Guess` objects. Every `Guess` object contains an array of span +
snippet pairs stating which part of the code should be replaced
with what replacment. There is no need for all guesses to replace
the same piece of code or even require the same number of replacements.

## Command line diagnostics backend API changes

Currently suggestions are inserted directly into the error structure as a "help" sub-diagnostic
of the main error, with a special `Option` field set for the replacement text.
This means that backends like the existing command line backend and JSON backend will need to
process the main error's sub-diagnostics to figure out how to treat the suggestion and extract
that information back out of the sub-diagnostic.

The backend API is changed to add a `code_hints` field to the main diagnostic, which contains
an enum with a `Suggestion` and a `Guesses` variant. The actual backend will then destructure
these variants and apply them as they see fit best. An implementation of this backend change
exists [in a rust PR](https://github.com/rust-lang/rust/pull/39973) (does not implement guesses yet).

The command line backend will print guesses as a sequence of "help" messages, just like in the following example

```
error[E0412]: type name `Iter` is undefined or not in scope
--> <anon>:4:12
|
4 | let t: Iter;
| ^^^^ undefined or not in scope
|
= help: you can import several candidates into scope (`use ...;`):
= help: `std::collections::binary_heap::Iter`
= help: `std::collections::btree_map::Iter`
= help: `std::collections::btree_set::Iter`
= help: `std::collections::hash_map::Iter`
= help: and 8 other candidates
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's up with all these lines starting with "help:" anyway? Probably not part of this RFC but this does not look helpful to me. Starting the first line with a (highlighted) "help" should suffice.

```

With the only change being that the `use ...;` is directly inserted into the help messages as shown below.
This has precedent in the diagnostics for unimported traits:

```
= help: items from traits can only be used if the trait is in scope; the following trait is implemented but not in scope, perhaps add a `use` for it:
= help: candidate #1: `use std::borrow::Borrow;`
```

For the import guesses this would look like

```
error[E0412]: type name `Iter` is undefined or not in scope
--> <anon>:4:12
|
4 | let t: Iter;
| ^^^^ undefined or not in scope
|
= help: you can import several candidates into scope:
= help: `use std::collections::binary_heap::Iter;`
= help: `use std::collections::btree_map::Iter;`
= help: `use std::collections::btree_set::Iter;`
= help: `use std::collections::hash_map::Iter;`
= help: and 8 other candidates
```

The actual span where the guess should be inserted is not shown in the command line diagnostics and has
no influence on the display of the guess.

### Special case: Only one guess

In case there is only a single guess, there are multiple variants on what can happen.

* replacement contains `\n`
* display in an extra `help`
* No `note` attached to the main diagnostic
* span of the replacement is the same as the main diagnostic
* attach the replacement string to the guess message
```
42 | abc
| ^^^ did you mean `abcd`
```
* span of the replacement is part of the main diagnostic's span
* attach the replacement string to the guess message, even if it's technically wrong
```
42 | Opition<u8>
| ^^^^^^^^^^^ did you mean `Option`
* span of the replacement contains the main diagnostic's span
* expand the displayed span to the entire span which should be replaced
```
42 | a.I
| ^^^ did you mean `a::I`
```
* `note` already attched to the main diagnostic
* Attach the guess below the note according to the no-note rules (as if the note were the main diagnostic)
```
5 | ignore(|z| if false { &y } else { z });
| ^^^ - `y` is borrowed here
| |
| may outlive borrowed value `y`
| did you mean `move |z|`
```

# How We Teach This
[how-we-teach-this]: #how-we-teach-this

There is concern for the teachability of all the different diagnostic levels of rustc. The full current list is

* fatal
* Bug (panic and abort)
* Error (abort at the next convenient stage, but continue with reporting more diagnostics)
* Warning (print and continue)
* sub-diagnostic
* Help (print in its own line and continue)
* Note (print inline in the textual output right after the `^^^^^` under the span)
* code-hints
* Suggestion (print like a help)

This list would get extended by Guesses under the code hint category

The complexity is not in the diagnostic levels themselves, but in choosing the correct level for a situation. This is fairly straight-forward for errors vs warnings, but in the sub-diagnostic and code-hint levels, it is less clear.

The following chart should clarify the decision process for subdiagnostics

![chart](https://cloud.githubusercontent.com/assets/332036/23554529/be228b76-0025-11e7-99e7-627d60f5a328.png)
Copy link
Member

@killercup killercup Mar 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! For sake of keeping this self contained you should either reproduce the state machine in prose, or include the chart as ASCII, though. I spent a few minutes to do the latter with asciiflow.com, feel free to copy this:

                             +--------------------------+
                             | Does the message contain |
                    +--------+ snippets derived         +--------------+
                    |        | from real code?          |              |
                Yes |        +--------------------------+              | No
                    |                                                  |
        +-----------v--------------+                          +--------v----------+
    +---+ Are there multiple       +---+                      | Is the message    |
    |   | snippets that may apply? |   |                      | reasonably short? |
    |   +--------------------------+   |                      +---+-----------+---+
Yes |                                  | No                       |           |
    |                                  |                      Yes |           | No
    |              +-------------------v---------+                |           |
    |              | Are there situations where  |            +---v--+     +--v---+
    |              | the snippet is plain wrong? |            | Note |     | Help |
    |              +---+-----------------------+-+            +------+     +------+
    |                  |                       |
    |              Yes |                       | No
    |                  |                       |
    |    +-------------v----------------+    +-v----------+
    |    | Can you detect those         |    | Suggestion |
    |    | situations programmatically? |    +------------+
    |    +------+-----------------+-----+
    |           |                 |
    |           | No          Yes |
    |           |                 |
 +--v----+      |        +--------v------------------------+
 | Guess <------+        | Suggestion where correct,       |
 +-------+               | Guess or no code hint otherwise |
                         +---------------------------------+

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dot code is in the commit message: aed4d38

I'll always push the dot code together with changes in the image.

You can use an online tool like http://sandbox.kidstrythisathome.com/erdos/ to generate new images and edit the graph.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, nice, I hadn't seen that. Maybe put it in an addendum section?


# Drawbacks
[drawbacks]: #drawbacks

## The new "guess" diagnostic category does not add any benefit

Citing @nrc in [the original pull request](https://github.com/rust-lang/rust/pull/39458#issuecomment-277898885)

> My work flow for 'automatic application' is that the user opts in to doing this
> in an IDE or rustfix tool, possibly with some further input.
> I don't think there is a need or expectation that after such an application
> the resulting code is guaranteed to be logically correct or even to compile.
> So, I think a suggestion with placeholders would be fine as would a suggestion with multiple choices.
> In other words, I'd rather make suggestions more flexible than add another category of error.

## JSON and plain text error messages might differ

Plain text will output many simple guesses as notes on the relevant span (e.g. "did you mean `abcd`")
instead of showing them as full suggestions like:

```
42 | abc
| ^^^ unknown identifier
help: did you mean `abcd`?
|
42 | abcd
```

It would be confusing for users to get semantically differend kinds of messages depending on how they view the errors.


# Alternatives
[alternatives]: #alternatives

## Don't add a new category, just make everything a suggestion

This will make it impossible for automatic semantic code rewriting.

## Don't add a new category, but allow adding multiple suggestions

This allows changing "help" messages like "did you mean ..." to suggestions, if there are multiple things that apply.

## Add a new "replacement" category

Denotes the automatically applicable code hints, while suggestions stay as the heuristic category.
This is just nomenclature, but eases the transition, since nothing needs to be changed in existing `span_suggestion` calls.

# Unresolved questions
[unresolved]: #unresolved-questions

1. I could not come up with a nice API for multiple guesses each with multiple spans
2. Also apply all single-guess rules to suggestions to reduce error verbosity? This can be done later.