-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove Spans from HIR -- 1/N -- Span collection #72878
Conversation
@bors try @rust-timer queue |
Awaiting bors try build completion |
⌛ Trying commit 547e5b78d962249e8b4f7152b5878a84eee00acf with merge 93a501ef44530cdc593a276b6b29f34bf28ad520... |
☀️ Try build successful - checks-azure |
Queued 93a501ef44530cdc593a276b6b29f34bf28ad520 with parent b85e3fe, future comparison URL. |
@@ -195,7 +200,7 @@ impl<'a, 'hir> NodeCollector<'a, 'hir> { | |||
(self.map, svh) | |||
} | |||
|
|||
fn insert_entry(&mut self, id: HirId, entry: Entry<'hir>, hash: Fingerprint) { | |||
fn insert_entry(&mut self, span: Span, id: HirId, entry: Entry<'hir>, hash: Fingerprint) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm. Why does the span
come before the id
in this API? The id
is the "key" in the map, right? So I would have the span
come at some point after it in the parameter list, just like the other "payloads"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kept the API from the insert
function. This span parameter is only used to assert the consistency of the span map. It will be removed later.
src/librustc_ast_lowering/lib.rs
Outdated
@@ -648,14 +668,14 @@ impl<'a, 'hir> LoweringContext<'a, 'hir> { | |||
}) | |||
} | |||
|
|||
fn next_id(&mut self) -> hir::HirId { | |||
fn next_id(&mut self, span: Span) -> hir::HirId { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am trying to decide whether it is not intuitive for fn next_id
to take a span
as a parameter.
I think I understand the intent you have here: Every call to next_id
is matched up with some portion of the HIR that is being generated via the lowering of the AST, and each such portion of the HIR is meant to always have a span associated with it. So in that sense, it makes sense to pass along the span
so that you can enforce the invariant that all HIR entries have a span.
But at the same time, when I read the invocations written as self.next_id(span)
, my own intuitive reading is "this is generating an id
from the span somehow" (which is certainly not what is happening).
I'll keep musing on this. In the meantime, I would suggest you add a doc-comment to fn next_id
saying what the span
parameter is. (I also played with the idea of suggesting renaming the method, but I wasn't really happy with any of the alternatives I came up with...)
I've looked at the first two commits, and they lead me to one overall question: The end goal here is to refactoring the data structure to remove a field from a set of structures and replace that field with a mapping elsewhere. Normally I'm all for trying to work incrementally and making each commit as small as possible. In this case, it seems like you have made one PR dedicated to adding the necessary mapping, but not removing the But when it comes to something like this, where it might make sense, even in terms of simplifying the review, to do both the removal and the addition in one step? This would presumably make it trivial to confirm "yes, we clearly have added all the instances of the mapping where we need it, because every spot where the field access was removed, there is a corresponding access to the new mapping." ? Or does that make the PR so huge as to make the review process unmanageable? Another reason to do both in one PR is that it will produce more realistic end numbers for the perf runs. Right now, this PR's version of the compiler is going to pay in memory usage to maintain both the separate mapping of HirID to Span, as well as the spans embedded in the HIR itself. So that means some amount of wasted memory due to that redundancy, and that presumably will cause some amount of a speed hit that will go away later when the embedded span fields are removed. It would be potentially better to couple both those changes into one PR so that we get a more accurate idea of the real end effect on performance. |
(The question in my previous comment is important enough, in terms of it implying a big revision to this PR, that I'm pausing my review to wait for @cjgillot to answer the question. Thus I changed the S-waiting tags accordingly.) |
During the development of this PR, I found easiest to do the collection first, put some assertions I did not change the behaviour, and then proceed with each HIR type one by one. The current end state is at #72015. That PR is already quite large (+3200, -2700). Furthermore, it does not remove all spans yet: HIR types that do not have a HirId are in progress. The goal splitting is also to manage the effect on performance. I expect the reduction in invalidation to happen very late. In the mean time, using an exterior map gated behind a query will create slowdown. Those intermediate slowdowns deserve investigation. Otherwise, I would just be pushing the problem into another data structure. Does this make sense to you @pnkfelix? How would you rather proceed? |
Finished benchmarking try commit 93a501ef44530cdc593a276b6b29f34bf28ad520, comparison URL. |
@cjgillot okay, I'll trust your judgement about how to structure the changes. |
src/librustc_ast_lowering/item.rs
Outdated
), | ||
VariantData::Unit(id) => hir::VariantData::Unit(self.lower_node_id(id, DUMMY_SP)), | ||
VariantData::Unit(id) => hir::VariantData::Unit(self.lower_node_id(id, span)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These fixes on lines 722 and 724 seem important, or at least worth considering trying to land on their own in a separate PR?
That is, it seems like this could cause an interesting change to the diagnostic output in some cases, and it would be good to vet that change independently of the others changes in this PR, which I had hoped would be a refactoring....
(I understand the temptation to fold fixes like this in as a drive-by change, especially since I'm not sure how best to make this change without making the other changes to e.g. thread the span
argument around. But I'm worried about coupling it with such a big change overall, especially one that we might choose to revert later on as we gather more experience with the change...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The behaviour does not change here. This choice of span for the VariantData is the one already implemented in hir::map::NodeCollector::visit_variant
.
src/librustc_ast_lowering/lib.rs
Outdated
_ => {} | ||
}; | ||
let stored_span = self.spans.entry(hir_id).or_insert(span); | ||
if *stored_span == DUMMY_SP { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not keep using is_dummy()
here? (And !is_dummy()
below?)
I don't have a preference personally, but I also wonder when I see a change from an existing code style like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(this particular instance of comparison with DUMMY_SP
was removed in a follow-up commit. But the question still stands over all, since the PR in general does inject more comparisons with DUMMY_SP
, for better or for worse.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In some cases, the HIR owners are assigned a HirId before the AST traversal. When this HirId is created, we do not have access to the proper Span. I use DUMMY_SP as a placeholder for delayed initialisation. I will add a comment about that.
Okay, I've left my feedback. I think this all seems relatively fine; I'm not 100% sure if the ordering of the changes is ideal (e.g. would it be better to switch to an The biggest issue I have is that I was expecting this to be a pure-refactoring, but there are places where dummy spans are being replaced with spans that are newly threaded through. I personally advise that we try to avoid coupling large refactorings with drive-by enduser-visible improvements like that. I would prefer to land the improvements in a separate PR. (Of course, with the bors turn-around time the way it is, separate PR's may end up coupled together anyway in a rollup. But I'd prefer that decision be made at rollup time, rather than at PR authorship time, unless there is some strong reason that the drive-by improvements be coupled with the refactoring.) ((And its also possible that I am wrong that the drive-by improvements are end-user visible? The fact that none of the ui tests needed updated is a hint that the drive-by improvements don't end up in any diagnostics? Or it could just be a sign of weakness in our ui testing...)) |
I re-shuffled the commits to avoid the back and forth with the modifications. The only differences are a few comments. |
2840a46
to
90d802e
Compare
Collect spans directly into an IndexVec.
c7a22f5
to
7b9999c
Compare
@bors try @rust-timer queue |
Awaiting bors try build completion. |
⌛ Trying commit 7b9999c with merge 434d20b9fd160c5b91331a8761accd9640a75aec... |
☀️ Try build successful - checks-actions |
Queued 434d20b9fd160c5b91331a8761accd9640a75aec with parent 8fec6c7, future comparison URL. @rustbot label: +S-waiting-on-perf |
Finished benchmarking try commit (434d20b9fd160c5b91331a8761accd9640a75aec): comparison url. Benchmarking this pull request likely means that it is perf-sensitive, so we're automatically marking it as not fit for rolling up. Please note that if the perf results are neutral, you should likely undo the rollup=never given below by specifying Importantly, though, if the results of this run are non-neutral do not roll this PR up -- it will mask other regressions or improvements in the roll up. @bors rollup=never |
Regressions of up to 5.6%. |
@cjgillot waiting on conflicts to be resolved |
First step in rust-lang/compiler-team#294
Split out of #72015
This PR collects the spans in the AST when lowering it to HIR.
This collection constructs a
HirId -> Span
map as eachHirId
is assigned.This map is rendered accessible to the world through a query,
and replaces the implementation of
tcx.hir().span(...)
.This PR needs a perf run.
r? @pnkfelix