Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Suggestion: provide better report of function argument mismatch #65853

Open
Quantumplation opened this issue Oct 26, 2019 · 11 comments
Open

Suggestion: provide better report of function argument mismatch #65853

Quantumplation opened this issue Oct 26, 2019 · 11 comments
Assignees
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-suggestion-diagnostics Area: Suggestions generated by the compiler applied by `cargo fix`. C-enhancement Category: An issue proposing an enhancement or a PR with one. D-papercut Diagnostics: An error or lint that needs small tweaks. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@Quantumplation
Copy link
Contributor

Quantumplation commented Oct 26, 2019

While discussing #64915 with @estebank, I asked if it would make sense to also provide a more detailed report about how closures mismatched. For example, introspecting on the types to detect when types are in the wrong order, or when arguments are mismatched.

He pointed out that we don't even do that for regular function calls right now. For example:

https://play.rust-lang.org/?version=stable&mode=debug&edition=2018&gist=8288d94517c964071278be6833560f70

The above snippet could, in theory, provide a more detailed report which suggested that the arguments might be swapped in the first call, or that the second argument is missing in the second call. (Things like longest common subsequence could help here).

@estebank said he's been thinking about something like this for a while, and offered to provide some mentorship if I wanted to tackle this as a slightly bigger contribution, since it'd be easier than closure comparison off the bat.

This issue has been assigned to @Quantumplation via this comment.

@estebank estebank added A-diagnostics Area: Messages for errors, warnings, and lints A-suggestion-diagnostics Area: Suggestions generated by the compiler applied by `cargo fix`. D-papercut Diagnostics: An error or lint that needs small tweaks. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 26, 2019
@estebank
Copy link
Contributor

I believe that the only place you'll have to look at to implement this is

/// Generic function that factors out common logic from function calls,
/// method calls and overloaded operators.
fn check_argument_types(
&self,
sp: Span,
expr: &'tcx hir::Expr,
fn_inputs: &[Ty<'tcx>],
expected_arg_tys: &[Ty<'tcx>],
args: &'tcx [hir::Expr],
c_variadic: bool,
tuple_arguments: TupleArgumentsFlag,
def_span: Option<Span>,
) {
let tcx = self.tcx;
// Grab the argument types, supplying fresh type variables
// if the wrong number of arguments were supplied
let supplied_arg_count = if tuple_arguments == DontTupleArguments {
args.len()
} else {
1
};
// All the input types from the fn signature must outlive the call
// so as to validate implied bounds.
for (fn_input_ty, arg_expr) in fn_inputs.iter().zip(args.iter()) {
self.register_wf_obligation(fn_input_ty, arg_expr.span, traits::MiscObligation);
}
let expected_arg_count = fn_inputs.len();
let param_count_error = |expected_count: usize,
arg_count: usize,
error_code: &str,
c_variadic: bool,
sugg_unit: bool| {
let mut err = tcx.sess.struct_span_err_with_code(sp,
&format!("this function takes {}{} but {} {} supplied",
if c_variadic { "at least " } else { "" },
potentially_plural_count(expected_count, "parameter"),
potentially_plural_count(arg_count, "parameter"),
if arg_count == 1 {"was"} else {"were"}),
DiagnosticId::Error(error_code.to_owned()));
if let Some(def_s) = def_span.map(|sp| tcx.sess.source_map().def_span(sp)) {
err.span_label(def_s, "defined here");
}
if sugg_unit {
let sugg_span = tcx.sess.source_map().end_point(expr.span);
// remove closing `)` from the span
let sugg_span = sugg_span.shrink_to_lo();
err.span_suggestion(
sugg_span,
"expected the unit value `()`; create it with empty parentheses",
String::from("()"),
Applicability::MachineApplicable);
} else {
err.span_label(sp, format!("expected {}{}",
if c_variadic { "at least " } else { "" },
potentially_plural_count(expected_count, "parameter")));
}
err.emit();
};
let mut expected_arg_tys = expected_arg_tys.to_vec();
let formal_tys = if tuple_arguments == TupleArguments {
let tuple_type = self.structurally_resolved_type(sp, fn_inputs[0]);
match tuple_type.kind {
ty::Tuple(arg_types) if arg_types.len() != args.len() => {
param_count_error(arg_types.len(), args.len(), "E0057", false, false);
expected_arg_tys = vec![];
self.err_args(args.len())
}
ty::Tuple(arg_types) => {
expected_arg_tys = match expected_arg_tys.get(0) {
Some(&ty) => match ty.kind {
ty::Tuple(ref tys) => tys.iter().map(|k| k.expect_ty()).collect(),
_ => vec![],
},
None => vec![],
};
arg_types.iter().map(|k| k.expect_ty()).collect()
}
_ => {
span_err!(tcx.sess, sp, E0059,
"cannot use call notation; the first type parameter \
for the function trait is neither a tuple nor unit");
expected_arg_tys = vec![];
self.err_args(args.len())
}
}
} else if expected_arg_count == supplied_arg_count {
fn_inputs.to_vec()
} else if c_variadic {
if supplied_arg_count >= expected_arg_count {
fn_inputs.to_vec()
} else {
param_count_error(expected_arg_count, supplied_arg_count, "E0060", true, false);
expected_arg_tys = vec![];
self.err_args(supplied_arg_count)
}
} else {
// is the missing argument of type `()`?
let sugg_unit = if expected_arg_tys.len() == 1 && supplied_arg_count == 0 {
self.resolve_vars_if_possible(&expected_arg_tys[0]).is_unit()
} else if fn_inputs.len() == 1 && supplied_arg_count == 0 {
self.resolve_vars_if_possible(&fn_inputs[0]).is_unit()
} else {
false
};
param_count_error(expected_arg_count, supplied_arg_count, "E0061", false, sugg_unit);
expected_arg_tys = vec![];
self.err_args(supplied_arg_count)
};
debug!("check_argument_types: formal_tys={:?}",
formal_tys.iter().map(|t| self.ty_to_string(*t)).collect::<Vec<String>>());
// If there is no expectation, expect formal_tys.
let expected_arg_tys = if !expected_arg_tys.is_empty() {
expected_arg_tys
} else {
formal_tys.clone()
};
let mut final_arg_types: Vec<(usize, Ty<'_>)> = vec![];
// Check the arguments.
// We do this in a pretty awful way: first we type-check any arguments
// that are not closures, then we type-check the closures. This is so
// that we have more information about the types of arguments when we
// type-check the functions. This isn't really the right way to do this.
for &check_closures in &[false, true] {
debug!("check_closures={}", check_closures);
// More awful hacks: before we check argument types, try to do
// an "opportunistic" vtable resolution of any trait bounds on
// the call. This helps coercions.
if check_closures {
self.select_obligations_where_possible(false, |errors| {
self.point_at_type_arg_instead_of_call_if_possible(errors, expr);
self.point_at_arg_instead_of_call_if_possible(
errors,
&final_arg_types[..],
sp,
&args,
);
})
}
// For C-variadic functions, we don't have a declared type for all of
// the arguments hence we only do our usual type checking with
// the arguments who's types we do know.
let t = if c_variadic {
expected_arg_count
} else if tuple_arguments == TupleArguments {
args.len()
} else {
supplied_arg_count
};
for (i, arg) in args.iter().take(t).enumerate() {
// Warn only for the first loop (the "no closures" one).
// Closure arguments themselves can't be diverging, but
// a previous argument can, e.g., `foo(panic!(), || {})`.
if !check_closures {
self.warn_if_unreachable(arg.hir_id, arg.span, "expression");
}
let is_closure = match arg.kind {
ExprKind::Closure(..) => true,
_ => false
};
if is_closure != check_closures {
continue;
}
debug!("checking the argument");
let formal_ty = formal_tys[i];
// The special-cased logic below has three functions:
// 1. Provide as good of an expected type as possible.
let expected = Expectation::rvalue_hint(self, expected_arg_tys[i]);
let checked_ty = self.check_expr_with_expectation(&arg, expected);
// 2. Coerce to the most detailed type that could be coerced
// to, which is `expected_ty` if `rvalue_hint` returns an
// `ExpectHasType(expected_ty)`, or the `formal_ty` otherwise.
let coerce_ty = expected.only_has_type(self).unwrap_or(formal_ty);
// We're processing function arguments so we definitely want to use
// two-phase borrows.
self.demand_coerce(&arg, checked_ty, coerce_ty, AllowTwoPhase::Yes);
final_arg_types.push((i, coerce_ty));
// 3. Relate the expected type and the formal one,
// if the expected type was used for the coercion.
self.demand_suptype(arg.span, formal_ty, coerce_ty);
}
}
// We also need to make sure we at least write the ty of the other
// arguments which we skipped above.
if c_variadic {
fn variadic_error<'tcx>(s: &Session, span: Span, t: Ty<'tcx>, cast_ty: &str) {
use crate::structured_errors::{VariadicError, StructuredDiagnostic};
VariadicError::new(s, span, t, cast_ty).diagnostic().emit();
}
for arg in args.iter().skip(expected_arg_count) {
let arg_ty = self.check_expr(&arg);
// There are a few types which get autopromoted when passed via varargs
// in C but we just error out instead and require explicit casts.
let arg_ty = self.structurally_resolved_type(arg.span, arg_ty);
match arg_ty.kind {
ty::Float(ast::FloatTy::F32) => {
variadic_error(tcx.sess, arg.span, arg_ty, "c_double");
}
ty::Int(ast::IntTy::I8) | ty::Int(ast::IntTy::I16) | ty::Bool => {
variadic_error(tcx.sess, arg.span, arg_ty, "c_int");
}
ty::Uint(ast::UintTy::U8) | ty::Uint(ast::UintTy::U16) => {
variadic_error(tcx.sess, arg.span, arg_ty, "c_uint");
}
ty::FnDef(..) => {
let ptr_ty = self.tcx.mk_fn_ptr(arg_ty.fn_sig(self.tcx));
let ptr_ty = self.resolve_vars_if_possible(&ptr_ty);
variadic_error(tcx.sess, arg.span, arg_ty, &ptr_ty.to_string());
}
_ => {}
}
}
}
}

particularly,

// is the missing argument of type `()`?
let sugg_unit = if expected_arg_tys.len() == 1 && supplied_arg_count == 0 {
self.resolve_vars_if_possible(&expected_arg_tys[0]).is_unit()
} else if fn_inputs.len() == 1 && supplied_arg_count == 0 {
self.resolve_vars_if_possible(&fn_inputs[0]).is_unit()
} else {
false
};
param_count_error(expected_arg_count, supplied_arg_count, "E0061", false, sugg_unit);

@Quantumplation
Copy link
Contributor Author

@rustbot claim

I'll give this a shot! Might be too ambitious for me though. :)

@rustbot rustbot self-assigned this Oct 26, 2019
@workingjubilee
Copy link
Member

workingjubilee commented Oct 26, 2019

I have a related remark that I had built up while reviewing related diagnostic commits/issues that have been floating around:

E0631 uses a concept of "found/expected", as does E0271, "found/expected", but these errors have been reported as confusing and backwards. So it seems wise to head off all confusion in the future, and it seems to me that the implied "directionality" could be read as flowing one way or another depending on perspective, and might in fact change if a compiler optimization happens to adjust the orders of things.

E0281, which is the error code that E0631 retired, used an expression more along the lines of "Z has Y, but that requires A according to B", which seems logically more time agnostic and resilient to refactoring (especially because it describes more closely the "this atom is being asked to be both Hydrogen and Helium" problem). The only problem as-such is that diagnostics probably are best if they are kept relatively terse and simple, but explicitness is still preferable.

@Quantumplation
Copy link
Contributor Author

@workingjubilee I had noticed the same. Especially confusing is that in the code for #64915, it seems variable names got swapped around (in multiple places!), so it got really confusing to trace through what happened...

@Quantumplation
Copy link
Contributor Author

Some notes for myself, while I begin to understand the code and think through the problem. Please feel free to chime in if you notice something I'm missing or misinterpreting!

Closures

I think

ty::Tuple(arg_types) if arg_types.len() != args.len() => {
param_count_error(arg_types.len(), args.len(), "E0057", false, false);
expected_arg_tys = vec![];
self.err_args(args.len())
}

is where we handle closure cases (this tuple_arguments call seems to be set to TupleArguments when for closures and overloaded functions), so could probably benefit from the same enhancement.

Variadic Functions

Similarly,

} else {
param_count_error(expected_arg_count, supplied_arg_count, "E0060", true, false);
expected_arg_tys = vec![];
self.err_args(supplied_arg_count)
}

handles external variadic c functions, and if we have a description of the types, we could suggest what's missing.

formal_tys

param_count_error(expected_arg_count, supplied_arg_count, "E0061", false, sugg_unit);

Here, we report the mismatch arg count error, and then here:

expected_arg_tys = vec![];
self.err_args(supplied_arg_count)

we assign a bunch of err_args to the formal_tys, which we use below:

let formal_ty = formal_tys[i];
// The special-cased logic below has three functions:
// 1. Provide as good of an expected type as possible.
let expected = Expectation::rvalue_hint(self, expected_arg_tys[i]);
let checked_ty = self.check_expr_with_expectation(&arg, expected);
// 2. Coerce to the most detailed type that could be coerced
// to, which is `expected_ty` if `rvalue_hint` returns an
// `ExpectHasType(expected_ty)`, or the `formal_ty` otherwise.
let coerce_ty = expected.only_has_type(self).unwrap_or(formal_ty);
// We're processing function arguments so we definitely want to use
// two-phase borrows.
self.demand_coerce(&arg, checked_ty, coerce_ty, AllowTwoPhase::Yes);
final_arg_types.push((i, coerce_ty));
// 3. Relate the expected type and the formal one,
// if the expected type was used for the coercion.
self.demand_suptype(arg.span, formal_ty, coerce_ty);

to actually check the types.

I'm not sure whether I should

  • Just modify the mismatch arg count error stuff, and in that case do similar type comparison logic with adjacent parameters to probe for missing arguments
  • Remove the mismatch arg count stuff entirely, and make the type comparison do the neighbor-probing

The first is likely overall faster (we're only doing the extra work when counts don't match, meaning we've definitely skipped or added extra args) but incomplete (if we've skipped and argument and added one extra, or swapped two arguments). Solving this problem in general and trying to figure out exact user intent is probably overkill, but would certainly be cool. Thoughts, @estebank?

Algorithm for detecting skips

These are the simplest cases we could try to detect:

function inputs:   A B C
      arguments:   A B C    // Typecheck!
      arguments:   A D C    // D is incorrect!
      arguments:   A C      // Forgot B!
      arguments:   A B D C  // D is extra!
      arguments:   A C B    // B and C are swapped!

Some more complex cases we might want to handle (assume AB could satisfy A or B):

function inputs:   A  B  C
      arguments:   A  BC B     // B and C are swapped!

(I'll add more to the above tables as I think of them

The really tricky part is trying to detect combinations of these in general, and I think starts getting into the domain of diminishing returns.

What I propose, at least for a first pass to introduce this probing, is to assume that at most one of these mistakes have been made, and if we can't find such a mistake that makes the rest of the types line up, fall back to the error reporting we have now. As always, need someone with more experience/context to help guide my decision making here.

@estebank
Copy link
Contributor

Closures

One thing to note: the AST keeps closures as a function with a single argument which is a tuple. A closure |()| {} would be encoded in the AST as a callable with a single argument which is a tuple of a tuple, while || {} would be encoded as a callable with a single argument which is an empty tuple. That is part of what you're seeing there.

we assign a bunch of err_args to the formal_tys, which we use below:

As you can see we use tcx.types.err, which is a special type similar to ! and () in that it has special meaning but users have no way of creating this in rust code. We use this to signal parts of the AST or the type tree that have had problems. When doing typechecking, for example saying that foo and bar have to have the same type, of either one of them is a TyErr we take a note of it, make the resulting type TyErr and do not emit an error, because whatever created a TyErr has already emitted one. This way, TyErr is "viral" (in that it infects everything it touches) and has a "calming" effect (in that knock down errors caused by an earlier error are not emitted). This later behavior is useful for cases like the following:

let foo = <Something that causes a type error>; // We emit the original error that the user can fix
bar(foo); // We hide a type error that will be fixed if `foo` is fixed.

Marking all the arguments TyErr just hide all possible errors from arguments. This was to address similar concerns to this ticket: if the first argument of a 10 parameter function (don't write these :P) is missing, then you'd get 10 errors (1 for the mismatched argument count and 9 for type errors).

Also, there's an fn similar to demand_coerce (which generates a new type from two types in an expression, the one known so far and the expected one and stores it in the type tables) called can_coerce which will perform the same validation, but is side effect free and is suitable for error recovery when trying to check if different synthetic expressions would have been valid, like removing or adding a borrow.

thoughts?

For the two cases (incorrect order and missing args) I think that a reasonable approach would be:

  • If there are multiple fn parameters with the same type, keep the current behavior
  • Do the current type checking in order, and if there's a failure try to skip arguments until one succeeds, including the ones that have previously failed, or you reach the end. (This can devolve to O(n^2) depending on how its written, which should be fine if it only happens in error recovery.)
    • If all args have been found to match some param, then emit a single error suggesting the resorting.
    • Otherwise, current output
    • The same approach can work for the case of missing or extra params.

I think you have a good base to work out in your summary, it's likely that we'll find things in the PR to change to make sure that we don't affect compilation times on the happy path, but it shouldn't be too hard.

@workingjubilee
Copy link
Member

Only favoring one suggestion off the table you offer seems like it would make mathematical sense, I think, because

  1. all suggestions are merely probable answers (though confidence in an answer might exceed 99.9%, eventually the suggestion will falter) but
  2. due to simple rules of probability, it is more likely that one suggestion will be correct, rather than two suggestions both being correct. Even if two errors seems likely, the combinatorial logic weighs against it.
  3. So because we are only reporting one, suggestions should be made in the order the compiler (engineer) has the most confidence that the suggestion is applicable at all, which can weigh more on mere evidence.

@Quantumplation
Copy link
Contributor Author

Quantumplation commented Oct 27, 2019

@estebank

Thanks for the detailed writeup! Hopefully my verbose style won't be too big a drain on you as I start to learn more about the compiler. ;)

re: Closures, that's what I figured. Out of curiosity, why are closures stored this way?

re: err types, also in line with what I figured was happening.

re: algorithm, I wasn't too concerned about a high asymptotic implementation here, for two reasons. First, as you mention, it's error recovery so it doesn't impact the happy path, and two, the number of elements we're working with here is really small. Even in contrived cases, we're talking dozens of parameters to compare, not thousands. I'm glad you're also not too concerned.

If we were only worried about missing arguments, longest common subsequence solves this problem pretty well (same algorithm that powers textual diffs and such). However, I'm not sure how to adapt it to consider swaps as well, so I'll do some sketching and thinking on that over the next few days.

@workingjubilee My comment about assuming 1 error was less about making one suggestion, and more about assuming that there was only one mistake, and falling back if we couldn't find it. So, in these examples:

function inputs:   A B C
      arguments:   B        // A and C are missing!
function inputs:   A B C D
      arguments:   B A D    // A and B are swapped, and C is missing.  Or is A missing, and the third parameter is the wrong type?

Since (by your first point) we're not trying to figure out the "right" answer, only the "useful" answer, probing for just simple mistakes gives us good net value without huge investment in complexity.

@Quantumplation
Copy link
Contributor Author

Algorithm

Been working through a POC of the algorithm I'll use ( in a language I'm more familiar with, still not fluent enough in Rust to make it my scratch pad ;) ).

https://glot.io/snippets/fhbpy4kp93

Essentially, you're filling in a matrix of whether various inputs can be coerced to the relevant parameters. A row or column of all falses means an input that can't be satisfies or an argument that can't satisfy anything.

It still doesn't look for swap detection, and is still funky around duplicated types, but I wanted to share my work in progress.

@JohnTitor JohnTitor added the C-enhancement Category: An issue proposing an enhancement or a PR with one. label Jan 12, 2020
@workingjubilee
Copy link
Member

So like

fn grid<T>(w: usize, h: usize) -> Vec<Vec<T>> {
    let mut v = vec![];
    for _ in 0..w {
        v.push(Vec::with_capacity(h));
    }
    v
}

// this next fn definitely doesn't work because it doesn't have a T yet
fn is_easygoing(par: T, arg: T) -> bool {
   par == arg
}

Hmmm....

@Quantumplation
Copy link
Contributor Author

This fell off my radar for a while because the world got complicated, but I'm about to open a PR :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-diagnostics Area: Messages for errors, warnings, and lints A-suggestion-diagnostics Area: Suggestions generated by the compiler applied by `cargo fix`. C-enhancement Category: An issue proposing an enhancement or a PR with one. D-papercut Diagnostics: An error or lint that needs small tweaks. E-mentor Call for participation: This issue has a mentor. Use #t-compiler/help on Zulip for discussion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants