Account for 'duplicate' closure regions in borrowck diagnostics #67911

Aaron1011 · 2020-01-06T01:49:49Z

When we process closures/generators, we create a new NLL inference variable
each time we encounter an early-bound region (e.g. "'a") in the substs
of the closure. These region variables are then treated as
universal regions when we perform region inference for the closure.

However, we may encounter the same region multiple times, such
as when the closure references multiple upvars that are bound
by the same early-bound lifetime. For example:

fn foo<'a>(x: &'a u8, y: &'a u8) -> u8 { (|| *x + *y)() }

This results in the creation of multiple 'duplicate' region variables,
which all correspond to the same early-bound region. During
type-checking of the closure, we don't really care - any constraints
involving these regions will get propagated back up to the enclosing
function, which is then responsible for checking said constraints
using the 'real' regions.
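
As a toy illustration of how the duplicates arise (not rustc's actual code), the sketch below hands out a fresh variable for every occurrence of an early-bound region name and records which variables each name produced. `Vid` and `fresh_vars_per_occurrence` are hypothetical names invented for this sketch; for `foo` above, the occurrences would be the two `'a`s coming from the upvars `x` and `y`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for rustc's `RegionVid`.
type Vid = u32;

/// Assigns a fresh variable to every occurrence of an early-bound region
/// name and records, per name, all variables created for it.
fn fresh_vars_per_occurrence<'n>(occurrences: &[&'n str]) -> HashMap<&'n str, Vec<Vid>> {
    let mut by_region: HashMap<&'n str, Vec<Vid>> = HashMap::new();
    for (next_vid, name) in occurrences.iter().enumerate() {
        // Each occurrence gets its own variable, even if the name repeats.
        by_region.entry(*name).or_default().push(next_vid as Vid);
    }
    by_region
}
```

Calling `fresh_vars_per_occurrence(&["'a", "'a"])` yields a single entry for `'a` holding two distinct variables, which is the 'duplicate' situation described above.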

Unfortunately, this presents a problem for diagnostic code, which may
run in the context of the closure. In order to display a good error
message, we need to map arbitrary region inference variables (which may
not correspond to anything meaningful to the user) into a 'nicer' region
variable that can be displayed to the user (e.g. a universally bound
region, written by the user). To accomplish this, we repeatedly
compute an 'upper bound' of the region variable, stopping once
we hit a universally bound region, or are unable to make progress.

During the processing of a closure, we may determine that a region
variable needs to outlive multiple universal regions. In a closure
context, some of these universal regions may actually be 'the same'
region - that is, they correspond to the same early-bound region.
If this is the case, we will end up trying to compute an upper bound
using these region variables, which will fail (we don't know about
any relationship between them).

However, we don't actually need to find an upper bound involving these
duplicate regions - since they're all actually "the same" region, we can
just pick an arbitrary region variable from a given "duplicate set" (all
region variables that correspond to a given early-bound region).

By doing so, we can generate a more precise diagnostic, since we will be
able to print a message involving a particular early-bound region (and
the variables using it), instead of falling back to a more generic error
message.
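
The idea can be sketched with a toy model; this is an illustration under simplifying assumptions, not the code in this PR. Regions are plain integers, `outlives` is a hypothetical set of known outlives facts, and `dups` is a hypothetical map from a region variable to its duplicate set. The names `naive_upper_bound` and `dedup_upper_bound` are invented for the sketch, and "upper bound" is simplified to "a member of the set known to outlive all the others".

```rust
use std::collections::{BTreeSet, HashMap, HashSet};

/// Hypothetical stand-in for rustc's `RegionVid`.
type Vid = u32;

/// Returns a region from `regions` that is known to outlive every other
/// region in `regions`, or `None` if no such region is known.
/// `outlives` holds pairs `(a, b)` meaning "`a` outlives `b`".
fn naive_upper_bound(regions: &BTreeSet<Vid>, outlives: &HashSet<(Vid, Vid)>) -> Option<Vid> {
    regions.iter().copied().find(|&candidate| {
        regions
            .iter()
            .all(|&other| candidate == other || outlives.contains(&(candidate, other)))
    })
}

/// Collapses every region to the smallest member of its duplicate set
/// (an arbitrary but deterministic choice) before computing the bound.
/// Simplification: outlives facts are not rewritten to representatives.
fn dedup_upper_bound(
    regions: &BTreeSet<Vid>,
    outlives: &HashSet<(Vid, Vid)>,
    dups: &HashMap<Vid, BTreeSet<Vid>>,
) -> Option<Vid> {
    let representatives: BTreeSet<Vid> = regions
        .iter()
        .map(|&r| {
            dups.get(&r)
                .and_then(|set| set.iter().next().copied())
                .map_or(r, |smallest| smallest.min(r))
        })
        .collect();
    naive_upper_bound(&representatives, outlives)
}
```

With two duplicate variables and no outlives facts between them, `naive_upper_bound` finds nothing, while `dedup_upper_bound` collapses both to one representative and returns it. Regions that are not duplicates of each other still yield `None`.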

Fixes rust-lang#67765

rust-highfive · 2020-01-06T01:49:53Z

r? @matthewjasper

(rust_highfive has picked a reviewer for you, use r? to override)

Aaron1011 · 2020-01-06T01:51:26Z

Note that I'm using a FxHashMap<RegionVid, FxHashSet<RegionVid>> to store the duplicate regions, which isn't necessarily very efficient. However, we only construct it once (at the start of region checking for a given body), and only access it if an error has occurred.
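
A rough sketch of what such a map could look like, using `std` collections as stand-ins for `FxHashMap`/`FxHashSet` and `u32` for `RegionVid`. The function name `build_dup_regions`, the per-region input map, and the choice to skip regions that occur only once are assumptions made for illustration, not details taken from the PR.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for rustc's `RegionVid`.
type Vid = u32;

/// Given, for each early-bound region, the variables created for it,
/// builds a map from every duplicated variable to its full duplicate set.
fn build_dup_regions(by_region: &HashMap<&str, Vec<Vid>>) -> HashMap<Vid, HashSet<Vid>> {
    let mut dup_regions: HashMap<Vid, HashSet<Vid>> = HashMap::new();
    for vids in by_region.values() {
        // A region that occurred only once has no duplicates to record.
        if vids.len() < 2 {
            continue;
        }
        let set: HashSet<Vid> = vids.iter().copied().collect();
        for &vid in vids {
            // Each member stores its own copy of the set: simple to query,
            // but quadratic in the size of the set.
            dup_regions.insert(vid, set.clone());
        }
    }
    dup_regions
}
```

Storing a copy of the set per member is what makes the layout inefficient in space; the trade-off is a single lookup when a diagnostic needs the duplicates of a given variable.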

Centril · 2020-01-06T02:23:19Z

src/librustc_mir/borrow_check/universal_regions.rs

@@ -462,13 +503,14 @@ impl<'cx, 'tcx> UniversalRegionsBuilder<'cx, 'tcx> {
            defining_ty,
            unnormalized_output_ty,
            unnormalized_input_tys,
-            yield_ty: yield_ty,
+            yield_ty,
+            diagnostic_dup_regions: dup_regions,


Suggested change

diagnostic_dup_regions: dup_regions,

diagnostic_dup_regions,

Would be clearer to just use diagnostic_dup_regions locally as well as it clarifies the purpose (and what it isn't for) immediately. Same idea in fn defining_ty below.

Centril · 2020-01-06T02:26:25Z

src/librustc_mir/borrow_check/universal_regions.rs

+    /// regions we've seen so far. Before we compute an upper bound,
+    /// we check if the region appears in our duplicates set - if so,
+    /// we skip it.
+    pub diagnostic_dup_regions: FxHashMap<RegionVid, FxHashSet<RegionVid>>,


Suggested change

pub diagnostic_dup_regions: FxHashMap<RegionVid, FxHashSet<RegionVid>>,

pub diagnostic_dup_regions: RVarMapToEarlyBound,

(+ a type alias to use elsewhere)
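
A minimal sketch of how the alias and the field-init shorthand from the previous suggestion could fit together. Only the names `RVarMapToEarlyBound` and `diagnostic_dup_regions` come from this review; `UniversalRegionsSketch`, `build_sketch`, and the use of `std` collections with `u32` in place of `FxHashMap`, `FxHashSet`, and `RegionVid` are assumptions for illustration.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for rustc's `RegionVid`.
pub type RegionVid = u32;

/// Alias name taken from the review suggestion; `std` collections stand in
/// for `FxHashMap` and `FxHashSet`.
pub type RVarMapToEarlyBound = HashMap<RegionVid, HashSet<RegionVid>>;

/// Reduced, hypothetical stand-in for the struct carrying the field.
pub struct UniversalRegionsSketch {
    pub diagnostic_dup_regions: RVarMapToEarlyBound,
}

/// Builds the struct from `(variable, duplicate)` pairs.
pub fn build_sketch(pairs: &[(RegionVid, RegionVid)]) -> UniversalRegionsSketch {
    // Naming the local after the field states its purpose at the point of
    // construction and allows the field-init shorthand below.
    let mut diagnostic_dup_regions = RVarMapToEarlyBound::new();
    for &(var, dup) in pairs {
        diagnostic_dup_regions.entry(var).or_default().insert(dup);
    }
    UniversalRegionsSketch { diagnostic_dup_regions }
}
```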

Centril · 2020-01-06T02:32:31Z

src/librustc_mir/borrow_check/universal_regions.rs

+            dup_regions_map.entry(region).or_insert_with(|| Vec::new()).push(*new_vid);
+            new_region
+        });
+        let mut dup_regions: FxHashMap<RegionVid, FxHashSet<RegionVid>> = Default::default();


A comment here would be good.

Centril · 2020-01-06T02:34:21Z

@bors try @rust-timer queue

rust-timer · 2020-01-06T02:34:22Z

Awaiting bors try build completion

bors · 2020-01-06T02:34:34Z

⌛ Trying commit ced109f with merge b8bd9785eccbb7b71ae79dc8bf35260885607fb6...

bors · 2020-01-06T05:18:20Z

☀️ Try build successful - checks-azure
Build commit: b8bd9785eccbb7b71ae79dc8bf35260885607fb6

rust-timer · 2020-01-06T05:18:22Z

Queued b8bd9785eccbb7b71ae79dc8bf35260885607fb6 with parent 0731573, future comparison URL.

matthewjasper · 2020-01-06T12:39:12Z

Wg-grammar is essentially a stress test for this code so the regression there isn't unexpected.

cc @nikomatsakis

nikomatsakis · 2020-01-06T15:05:25Z

Hmm. The reason for the current approach is because I did not want to rely on any information about regions coming from the "ordinary typeck". As we move forward with completely removing the old region check information, I would ideally like to only have "erased regions" present in the type info that the borrow checker starts with.

@matthewjasper -- we should probably sync up a bit on those plans, as I saw some PRs in this direction.

Aaron1011 · 2020-01-07T00:49:35Z

> As we move forward with completely removing the old region check information, I would ideally like to only have "erased regions" present in the type info that the borrow checker starts with.

I think this approach is still valid after such a change. Assuming that closure-specific errors are still reported in a closure context, we need some way of knowing which regions are actually 'the same', meaning that we should only use at most one of them when computing an upper bound.

If the borrow checker starts out with all regions being erased (e.g. the defining_ty method is gone), I think we would want to pass in the 'duplicate region' information from whatever place erases the regions.

nikomatsakis · 2020-01-07T18:10:49Z

@Aaron1011 the point is that we would never have that information in the first place, because the regions would all have been erased during type check itself.

Aaron1011 · 2020-01-13T20:30:49Z

> @Aaron1011 the point is that we would never have that information in the first place, because the regions would all have been erased during type check itself.

Wouldn't type-check still have this information before it gets erased? My changes only affect the diagnostic code (nothing changes when compilation is successful), so it should be possible to pass in the "extra" information needed without affecting the rest of the borrow checker.

nikomatsakis · 2020-01-14T11:00:35Z

@Aaron1011 Type check would not have the information, if all goes to plan, no. It would do all of its computations with fully erased regions, so it would immediately "lose track" of where each reference comes from.

matthewjasper · 2020-01-19T16:53:50Z

I think that the issue here is that explain_why_borrow_contains_point shouldn't be using to_error_region_vid and should be calling something that takes an arbitrary region (with the lowest RegionVid) from the appropriate SCC. That should also improve the error in cases where there is no relation known between the lifetimes:

fn f<'a, 'b>(x: i32) -> (&'a i32, &'b i32) {
    let y = &x;
    (y, y)
}
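
One way to read "an arbitrary region (with the lowest RegionVid) from the appropriate SCC" is sketched below. This is a toy model, not rustc's API: `scc_of` is a hypothetical table mapping each region variable (by index) to the index of its strongly connected component, and `lowest_region_in_scc` is a made-up name.

```rust
/// Returns the lowest-numbered region variable that shares an SCC with
/// `region`, or `None` if `region` is out of range.
/// `scc_of[v]` is the (hypothetical) SCC index of region variable `v`.
fn lowest_region_in_scc(scc_of: &[usize], region: usize) -> Option<usize> {
    let target = *scc_of.get(region)?;
    // `position` scans from index 0, so the first match is the lowest one.
    scc_of.iter().position(|&scc| scc == target)
}
```

Because the choice depends only on SCC membership, it does not require any known outlives relation between the regions involved.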

Aaron1011 · 2020-01-19T17:03:42Z

> I think that the issue here is that explain_why_borrow_contains_point shouldn't be using to_error_region_vid and should be calling something that takes an arbitrary region (with the lowest RegionVid) from the appropriate SCC.

It looks like to_error_region_vid (and to_error_region) are used in several other places. Won't we run into the same problem of subpar diagnostics in closures, since we'll be unable to compute an upper bound despite one existing?

matthewjasper · 2020-01-19T17:56:23Z

I think most of those are being called on universal regions, so they don't have this problem. There's type outlives errors:

fn g<'a, T: 'a>(t: &T) -> &'a i32 {
    &0
}

fn f<'a, 'b, T>(x: T) -> (&'a i32, &'b i32) { // compare with returning (&'a i32, &'a i32) 
    let y = g(&x);
    (y, y)
}

but to_error_region can't return 'a + 'b, so it probably shouldn't be used at all here.

Aaron1011 · 2020-01-20T21:01:28Z

> I think most of those are being called on universal regions, so they don't have this problem.

I thought the whole point of to_error_region is that it does something for non-universal regions. If all of the callers always have universal regions, it could just be removed.

As long as to_error_region exists, I think it should properly account for 'duplicate' closure regions.

matthewjasper · 2020-01-27T20:20:40Z

to_error_region probably should be removed.

Dylan-DPC-zz · 2020-03-07T02:20:07Z

@Aaron1011 any updates?

nikomatsakis · 2020-03-09T17:33:19Z

Note that @matthewjasper is actively making some of the changes to remove old region check that I was talking about, so I guess we can revisit this error once those PRs have landed (not sure where we are in the process just now).

Aaron1011 · 2020-05-26T03:56:49Z

Several of @matthewjasper's PRs have landed, but to_error_region_vid still exists. Should it still be removed?

nikomatsakis · 2020-06-08T15:37:10Z

I'm not sure! Quite possibly :)

Aaron1011 · 2020-06-27T18:03:29Z

Closing in favor of #73806, which does not depend on HIR typeck region inference.

rust-highfive assigned matthewjasper Jan 6, 2020

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 6, 2020

Centril reviewed Jan 6, 2020

JohnCSimon added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Feb 10, 2020

joelpalmer added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Mar 30, 2020

Dylan-DPC-zz added the S-blocked Status: Blocked on something else such as an RFC or other implementation work. label Mar 30, 2020

Dylan-DPC-zz removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Mar 30, 2020

Aaron1011 closed this Jun 27, 2020

Aaron1011 mentioned this pull request Jun 27, 2020

Contrived type-outlives code does not compile #73808

Open
