Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add spans to injected coverage counters, extract with CoverageData query #73684

Merged
merged 1 commit into from
Jul 2, 2020

Conversation

richkadel
Copy link
Contributor

This is the next iteration on the Rust Coverage implementation, and follows PR #73488

@tmandry @wesleywiser

I came up with an approach for coverage spans, pushing them through the Call terminator as additional args so they can be extracted by the CoverageData query.

I'm using an IndexVec to store them in CoverageData such that there can be only one per index (even if parts of the MIR get duplicated during optimization).

If this approach works for you, I can quickly expand on this to build a separate IndexVec for counter expressions, using a separate call that will be ignored during code generation, but from which I can extract the counter expression values.

Let me know your thoughts. Thanks!

r? @tmandry

Rust compiler MCP rust-lang/compiler-team#278
Relevant issue: #34701 - Implement support for LLVMs code coverage instrumentation

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jun 24, 2020
@wesleywiser
Copy link
Member

Sorry, this notification fell off my radar. I will schedule some time tonight and tomorrow to review this.

@richkadel
Copy link
Contributor Author

no worries @wesleywiser ... it actually works out because I'm about to push an update to this PR today

src/librustc_middle/mir/mod.rs Outdated Show resolved Hide resolved
src/librustc_mir/transform/inline.rs Outdated Show resolved Hide resolved
src/librustc_mir/transform/instrument_coverage.rs Outdated Show resolved Hide resolved
src/librustc_mir/transform/instrument_coverage.rs Outdated Show resolved Hide resolved
@richkadel
Copy link
Contributor Author

@wesleywiser @tmandry - I just uploaded an iteration on this PR that shows how I may pull out all of the injected counterinfo (renamed from CoverageData, to be more consistent with debuginfo) in order to generate the coverage map in LLVM IR.

BUT as I was looking at the debug output, I realized there's another way to do this that might be simpler and faster. Please chime in if you agree or disagree with this:

When generating the LLVM intrinsic calls for each counter (src/librustc_codegen_llvm/intrinsic.rs), I now have the caller_instance, counter index, and span right here, and I also have the CodegenCx (via self.cx(), as the Builder).

Instead of expanding the query to extract the spans from the MIR, I could call a mutable function on the Builder to insert the counter info into the CodegenCx (in a coverageinfo-specific field). Then, on coverageinfo::finalize(cx), I can generate the coverage map directly from the cx.

This has the added benefit of ensuring a direct correlation between counters injected and their inclusion in the coverage map.

It also means the query for coverageinfo can be reverted back to holding only the hash and num_counters again, and returned by copy, rather than by reference.

Thoughts?

@tmandry
Copy link
Member

tmandry commented Jun 26, 2020

Instead of expanding the query to extract the spans from the MIR, I could call a mutable function on the Builder to insert the counter info into the CodegenCx (in a coverageinfo-specific field). Then, on coverageinfo::finalize(cx), I can generate the coverage map directly from the cx.

Makes sense to me. That would sidestep the issue of figuring out which functions we need to generate a coverage map for (which is the set of functions we codegen). The analogy to debuginfo gives it precedent as well.

@bors
Copy link
Contributor

bors commented Jun 26, 2020

☔ The latest upstream changes (presumably #73513) made this pull request unmergeable. Please resolve the merge conflicts.

@wesleywiser
Copy link
Member

Thoughts?

That seems reasonable to me as well so I would be interested to see a version of the code that uses that approach!

@richkadel
Copy link
Contributor Author

That seems reasonable to me as well so I would be interested to see a version of the code that uses that approach!

Awesome! I just finished making the changes. I'm updating to the current baseline and will push a new patch momentarily.

richkadel added a commit to richkadel/rust that referenced this pull request Jun 27, 2020
Moved from `query coverageinfo` (renamed from `query coverage_data`),
as discussed in the PR at:

rust-lang#73684 (comment)
@richkadel richkadel force-pushed the llvm-coverage-map-gen-2 branch from e59fb32 to 3cfc4b3 Compare June 27, 2020 19:16
@richkadel
Copy link
Contributor Author

@wesleywiser - @tmandry is out of office for the next week. (He may have told you.) Unless you're doing the same, I hope you don't mind being the primary reviewer, and pushing this to bors when ready.

I'm happy with the current version of this PR, so please feel free to review when you have time.

Thanks!

@richkadel
Copy link
Contributor Author

@wesleywiser - I was curious why I didn't need to add the new Rust intrinsics added in this PR to https://github.com/richkadel/rust/blob/cf6208643a7def50bc2cdd9a6539604afa555737/src/librustc_mir/interpret/intrinsics.rs#L413

I had to add the case for count_code_region here:

            }
            // FIXME(#73156): Handle source code coverage in const eval
            sym::count_code_region => (),
            _ => return Ok(false),

because this function caused the compiler to break without it.

But since it's working for the new intrinsics, without this change, I tried commenting out this change, and found that the current implementation still works without it (now).

I suspect that the problem occurred because of where the pass was executing at one time, and the current location seems to avoid the check and failure.

So I should be consistent here for all of the intrinsics, and it looks like I have the option of a) removing the check for count_code_region, or adding the new intrinsics to the case along side it.

Let me know what you think.

@richkadel
Copy link
Contributor Author

So I should be consistent here for all of the intrinsics, and it looks like I have the option of a) removing the check for count_code_region, or (b) adding the new intrinsics to the case along side it.

Let me know what you think.

I'm actually going to ahead with option (a), adding the new intrinsics, since they are "known". If emulate_intrinsic() is ever called with these intrinsics, we know it won't break (and the FIXME reference is there for future support for const evaluations of coverage calls, if needed).

@richkadel
Copy link
Contributor Author

@wesleywiser Please note the additional commit I just uploaded. I realized something after thinking about it overnight. How the coverage intrinsic calls should be handled (whether to process them internally or pass them onto the backend or both) should be fully orchestrated by the backend-specific code, but I had half of it in rustc_codegen_ssa (in block.rs) and half in rustc_codegen_llvm. I fixed this with the change in the new commit.

@@ -82,6 +84,53 @@ fn get_simple_intrinsic(cx: &CodegenCx<'ll, '_>, name: &str) -> Option<&'ll Valu
}

impl IntrinsicCallMethods<'tcx> for Builder<'a, 'll, 'tcx> {
fn is_codegen_intrinsic(
Copy link
Member

@bjorn3 bjorn3 Jun 28, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not add the code to codegen_intrinsic_call? You could just codegen nothing there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to extract the values from the const u32 operands, and for the non-codegened intrinsics, I didn't like the unnecessary overhead of converting them to LLVM values (OperandRefs) and back again for no reason. So is_codegen_intrinsic takes the rustc_middle::mir::Operands directly, before the args are converted to the backend.

Plus, for the intrinsics that are not passed through to codegen_intrinsic_call, this also avoids all of the other steps between the call to is_codegen_intrinsic() and codegen_intrinsic_call() in block.rs, and all of the prepwork in the implementation of codegen_intrinsic_call() prior to match name {. That's a lot of code that I can avoid calling by implementing it this way.

src/librustc_codegen_ssa/coverageinfo/map.rs Outdated Show resolved Hide resolved
src/librustc_codegen_llvm/coverageinfo/mod.rs Outdated Show resolved Hide resolved
src/librustc_codegen_llvm/coverageinfo/mod.rs Outdated Show resolved Hide resolved
src/librustc_codegen_ssa/coverageinfo/map.rs Outdated Show resolved Hide resolved
src/librustc_codegen_ssa/coverageinfo/map.rs Outdated Show resolved Hide resolved
src/librustc_codegen_ssa/mir/block.rs Show resolved Hide resolved
src/librustc_codegen_ssa/traits/intrinsic.rs Outdated Show resolved Hide resolved
src/librustc_middle/mir/coverage/mod.rs Show resolved Hide resolved
Comment on lines +413 to +416
sym::count_code_region
| sym::coverage_counter_add
| sym::coverage_counter_subtract
| sym::coverage_unreachable => (),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Heads up @rust-lang/wg-const-eval, this PR is adding some new stub intrinsics which are used by the LLVM code coverage feature @richkadel is adding.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also cc @rust-lang/lang. This PR adds a few new intrinsics which are being used by the LLVM code coverage feature @richkadel is adding. Currently, these intrinsics are no-ops when evaluated at compile time (see #73156 for possible future extensions where these could measure code coverage when run by the const eval machinery).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These intrinsics seem like they're an implementation detail at this point, is that correct?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a fair characterization. The intrinsics are injected via a MIR pass enabled by the -Zinstrument-coverage flag.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK 👍 then I think it's all fine from the lang team perspective.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Niko!

src/librustc_codegen_ssa/traits/coverageinfo.rs Outdated Show resolved Hide resolved
@richkadel
Copy link
Contributor Author

All feedback has been addressed to this point. Thanks!

@richkadel
Copy link
Contributor Author

r? @wesleywiser
(re-assigning review to Wesley as agreed, since @tmandry is out of office this week)

@rust-highfive rust-highfive assigned wesleywiser and unassigned tmandry Jun 29, 2020
@wesleywiser
Copy link
Member

Would you mind squashing the commits together?

@richkadel richkadel force-pushed the llvm-coverage-map-gen-2 branch from 54eb05b to 8426c2a Compare June 29, 2020 03:37
@richkadel richkadel force-pushed the llvm-coverage-map-gen-2 branch from 8426c2a to 02bb58e Compare June 29, 2020 03:40
@richkadel
Copy link
Contributor Author

@wesleywiser I think I got this working. (I needed to update to the latest baseline first, and the squash worked without conflicts this time.) But it still shows my baseline syncs as separate commits.

Is this what you need? If not, please advise. Thanks!

@wesleywiser
Copy link
Member

@richkadel This looks mostly right to me although there's still 3 merge commits in addition to the actual changes.

What I usually do is this (assuming you have the remote origin which points to this repo and fork which points to your fork of the repo):

$ git fetch origin # grab the latest upstream changes
$ # assuming you're already on the latest commit of your branch with changes
$ git rebase -i origin/master # take the commits in your branch and replay them on top of origin/master
$ # this is the point where you can edit the git todo file and squash the commits together
$ git push fork -f # If you don't have a matching branch set up, then use `git push fork -f HEAD:{branch_name}`

@bjorn3
Copy link
Member

bjorn3 commented Jun 29, 2020

git push fork -f

I prefer git push fork --force-with-lease. It prevents a push when the destination has new commits that you haven't fetched yet. This is a little safeguard.

added regions with counter expressions and counters.

Added codegen_llvm/coverageinfo mod for upcoming coverage map

Move coverage region collection to CodegenCx finalization

Moved from `query coverageinfo` (renamed from `query coverage_data`),
as discussed in the PR at:

rust-lang#73684 (comment)

Address merge conflict in MIR instrument_coverage test

The MIR test output format changed for int types.

moved debug messages out of block.rs

This makes the block.rs calls to add coverage mapping data to the
CodegenCx much more concise and readable.

move coverage intrinsic handling into llvm impl

I realized that having half of the coverage intrinsic handling in
`rustc_codegen_ssa` and half in `rustc_codegen_llvm` meant that any
non-llvm backend would be bound to the same decisions about how the
coverage-related MIR terminators should be handled.

To fix this, I moved the non-codegen portion of coverage intrinsic
handling into its own trait, and implemented it in `rustc_codegen_llvm`
alongside `codegen_intrinsic_call`.

I also added the (required?) stubs for the new intrinsics to
`IntrepretCx::emulate_intrinsic()`, to ensure calls to this function do
not fail if called with these new but known intrinsics.

address PR Feedback on 28 June 2020 2:48pm PDT
@richkadel richkadel force-pushed the llvm-coverage-map-gen-2 branch from 02bb58e to 5239a68 Compare June 29, 2020 19:32
@richkadel
Copy link
Contributor Author

richkadel commented Jun 29, 2020

Looks like I got it this time, but it was more painful than that (mostly because it was more trial and error to figure it out). I tried your commands Wesley but they had no effect. I still ended up with the additional merge commits showing up.

Here's what I ended up having to do:

$ git rebase --rebase-merges -i HEAD~2

Without --rebase-merges none of the merge commits showed up in the todo list, which is why they were not removed when I squashed everything else last night. And note, HEAD~1, HEAD, origin/master, etc. did not give me a usable todo list.

Then in the todo list, I had to make two changes:

Starting from something like:

pick <hash> Description of last commit since my most recent pull to my remote master
label some-label

reset onto
merge -C <hash> Description of my last merge commit
pick <hash> Description of the only commit I want to keep

I changed it to:

pick <hash> Description of last commit since my most recent pull to my remote master
label some-label

reset some-label
pick <hash> Description of the only commit I want to keep

I saved this and exited vim and a few seconds later I was able to push the PR with a single commit.

@richkadel
Copy link
Contributor Author

For future reference, how do you sync your fork with the latest version of the rust-lang/rust repo, and then update your development branch for an ongoing pull request without creating these merge commits?

What I was doing:

  1. Create a PR in my fork to merge the changes from rust-lang/rust:master into richkadel/rust:master and merge it.

  2. On my workstation:

$ git checkout master
$ git fetch
$ git pull
$ git checkout <dev-branch-for-current-rust-lang-PR>
$ git rebase master
$ git push -f   # will try --force-with-lease next time

This creates the merge commit in the PR, which required the steps above.

I'm guessing this is not how others do it, if this is not a common workaround for removing merge commits?

@wesleywiser
Copy link
Member

wesleywiser commented Jun 29, 2020

@richkadel I have a similar workflow but I have pull.rebase set to true so that git pull always does git pull --rebase which does not introduce merge commits (see git-pull). Without the merge commits to deal with, you can simplify your workflow a bit:

$ git checkout <dev-branch-for-current-rust-lang-PR>
$ git fetch origin # make the latest upstream changes available in your local repo
$ git rebase origin/master # Rebase your changes on top of upstream. NOTE: origin/branch means use branch as it is on origin, not your local repo. Therefore, you don't have to checkout master and pull it before rebasing. 
$ git push -f   # will try --force-with-lease next time

@richkadel
Copy link
Contributor Author

@wesleywiser The fully-squashed 1 commit version passed all checks.

Good to merge?

@wesleywiser
Copy link
Member

Yes, this looks great!

@bors r+ rollup

@bors
Copy link
Contributor

bors commented Jun 30, 2020

📌 Commit 5239a68 has been approved by wesleywiser

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 30, 2020
Manishearth added a commit to Manishearth/rust that referenced this pull request Jun 30, 2020
…r=wesleywiser

add spans to injected coverage counters, extract with CoverageData query

This is the next iteration on the Rust Coverage implementation, and follows PR rust-lang#73488

@tmandry @wesleywiser

I came up with an approach for coverage spans, pushing them through the Call terminator as additional args so they can be extracted by the CoverageData query.

I'm using an IndexVec to store them in CoverageData such that there can be only one per index (even if parts of the MIR get duplicated during optimization).

If this approach works for you, I can quickly expand on this to build a separate IndexVec for counter expressions, using a separate call that will be ignored during code generation, but from which I can extract the counter expression values.

Let me know your thoughts. Thanks!

r? @tmandry

Rust compiler MCP rust-lang/compiler-team#278
Relevant issue: rust-lang#34701 - Implement support for LLVMs code coverage instrumentation
Manishearth added a commit to Manishearth/rust that referenced this pull request Jun 30, 2020
…r=wesleywiser

add spans to injected coverage counters, extract with CoverageData query

This is the next iteration on the Rust Coverage implementation, and follows PR rust-lang#73488

@tmandry @wesleywiser

I came up with an approach for coverage spans, pushing them through the Call terminator as additional args so they can be extracted by the CoverageData query.

I'm using an IndexVec to store them in CoverageData such that there can be only one per index (even if parts of the MIR get duplicated during optimization).

If this approach works for you, I can quickly expand on this to build a separate IndexVec for counter expressions, using a separate call that will be ignored during code generation, but from which I can extract the counter expression values.

Let me know your thoughts. Thanks!

r? @tmandry

Rust compiler MCP rust-lang/compiler-team#278
Relevant issue: rust-lang#34701 - Implement support for LLVMs code coverage instrumentation
bors added a commit to rust-lang-ci/rust that referenced this pull request Jul 2, 2020
…arth

Rollup of 10 pull requests

Successful merges:

 - rust-lang#73414 (Implement `slice_strip` feature)
 - rust-lang#73564 (linker: Create GNU_EH_FRAME header by default when producing ELFs)
 - rust-lang#73622 (Deny unsafe ops in unsafe fns in libcore)
 - rust-lang#73684 (add spans to injected coverage counters, extract with CoverageData query)
 - rust-lang#73812 (ast_pretty: Pass some token streams and trees by reference)
 - rust-lang#73853 (Add newline to rustc MultiSpan docs)
 - rust-lang#73883 (Compile rustdoc less often.)
 - rust-lang#73885 (Fix wasm32 being broken due to a NodeJS version bump)
 - rust-lang#73903 (Changes required for rustc/cargo to build for iOS targets)
 - rust-lang#73938 (Optimise fast path of checked_ops with `unlikely`)

Failed merges:

r? @ghost
@bors bors merged commit dc762ce into rust-lang:master Jul 2, 2020
indirect pushed a commit to indirect/rust that referenced this pull request Jul 4, 2020
added regions with counter expressions and counters.

Added codegen_llvm/coverageinfo mod for upcoming coverage map

Move coverage region collection to CodegenCx finalization

Moved from `query coverageinfo` (renamed from `query coverage_data`),
as discussed in the PR at:

rust-lang#73684 (comment)

Address merge conflict in MIR instrument_coverage test

The MIR test output format changed for int types.

moved debug messages out of block.rs

This makes the block.rs calls to add coverage mapping data to the
CodegenCx much more concise and readable.

move coverage intrinsic handling into llvm impl

I realized that having half of the coverage intrinsic handling in
`rustc_codegen_ssa` and half in `rustc_codegen_llvm` meant that any
non-llvm backend would be bound to the same decisions about how the
coverage-related MIR terminators should be handled.

To fix this, I moved the non-codegen portion of coverage intrinsic
handling into its own trait, and implemented it in `rustc_codegen_llvm`
alongside `codegen_intrinsic_call`.

I also added the (required?) stubs for the new intrinsics to
`IntrepretCx::emulate_intrinsic()`, to ensure calls to this function do
not fail if called with these new but known intrinsics.

address PR Feedback on 28 June 2020 2:48pm PDT
@cuviper cuviper added this to the 1.46 milestone May 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants