Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement & use metadata on arguments in LLVM #50156

Closed
3 tasks
hanna-kruppe opened this issue Apr 22, 2018 · 8 comments · Fixed by #128371
Closed
3 tasks

Implement & use metadata on arguments in LLVM #50156

hanna-kruppe opened this issue Apr 22, 2018 · 8 comments · Fixed by #128371
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-llvm Working group: LLVM backend code generation

Comments

@hanna-kruppe
Copy link
Contributor

hanna-kruppe commented Apr 22, 2018

We use metadata attached to LLVM instructions to tell LLVM passes about the range of possible integers that a value can take on. This allows additional optimizations. For example, when loading from a &NonZeroU32, the LLVM IR looks like this:

  %2 = load i32, i32* %1, align 4, !range !33

where !33 refers to a module-level entry like this (usually located near the end of the module source code):

!33 = !{i32 1, i32 0}

This tells LLVM that it's UB if the load returns anything that is not in the range ">= 1, < 0" (with wrap-around, i.e. it includes all values except 0). This allows x.get() != 0 (where x: &NonZeroU32) to be optimized to true, for example.

Unfortunately LLVM does not have a way to attach this information to function arguments, which leads to missed optimizations (e.g. #49572, #49420 (comment)). Adding this capability has been discussed before on llvm-dev and received positively, but was never actually implemented. If we want to use this in Rust, we're likely going to have to implement (& upstream) it outselves.

This issue tracks:

  • Implementing metadata-on-arguments in LLVM (mentoring instructions)
  • Upstreaming / backporting that LLVM patch
  • Emitting range metadata on arguments from rustc

cc @nox @eddyb

@nox
Copy link
Contributor

nox commented Apr 22, 2018

Nice! Thanks for filing this and being willing to mentor its implementation.

@hanna-kruppe
Copy link
Contributor Author

cc #50157 which is much easier but strongly related and complementary.

@hanna-kruppe
Copy link
Contributor Author

hanna-kruppe commented Apr 22, 2018

Mentoring instructions for the LLVM changes:

Currently metadata can be attached to instructions, and attributes can be attached to both function arguments and parameters, as well as the return value. For metadata-on-arguments we'll draw inspiration from the APIs for both. The implementation can be simpler than both of those, though you may still want to look through it. For that, and in general for browsing LLVM code, I recommend https://code.woboq.org/llvm/llvm/ which offers go-to-definition, go-to-declaration, find-all-uses, etc.

We don't need metadata on return values (you can already attach metadata to the call instruction, and this indeed already works for range metadata today). I also don't see a need for metadata on the values passed to a call (e.g. there also isn't range metadata on stores). So we restrict outselves to the formal parameters of the function definition -- confusingly called Argument in the LLVM code base. I think we should add an optional collection of metadata to the Function object (one per parameter), and both Function and Argument would gain APIs for adding, querying, and removing metadata from individual parameters.

The APIs would be similar to what Instruction offers for metadata, though pared down -- we don't need anything related to debug metadata, for example. However, like with attributes, there should probably be APIs on both Function (taking an ArgNo integer to identify the argument) and on Argument (delegating to Function and pass in their own ArgNo). Take a look at how it's done for attributes).

Metadata on instructions is implemented in the superclass Instruction and it's fairly complicated to avoid wasting storage for the majority of instructions that don't have metadata. The actual metadata is stored out-of-line in the LLVMContext and Instruction only stores one bit indicating whether there's any metadata. We don't need that, we can just have an optional pointer to an array with one MDAttachmentMap per parameter, mirroring the Argument *Arguments; member. See the Instruction methods for examples of how to use the MDAttachmentMap.

Once that is done, passes need to be updated to query the new metadata when analyzing Arguments. I haven't looked at this part in detail yet, but going through the existing uses of LLVMContext::MD_range will almost certainly suffice. But anyone who wants to dive into this should ping me so I can help or take over.

@kennytm kennytm added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-enhancement Category: An issue proposing an enhancement or a PR with one. WG-llvm Working group: LLVM backend code generation labels Apr 24, 2018
@nox
Copy link
Contributor

nox commented May 6, 2018

@rkruppe Should the Function class have a new MDAttachmentMap * field and handle memory allocation of the array itself, or should that go in a separate type?

@hanna-kruppe
Copy link
Contributor Author

I think in analogy with Function::Arguments it's fine to open-code that in Function.

@eddyb
Copy link
Member

eddyb commented May 6, 2018

Would this change mean that attributes like nonnull would get replaced with metadata?
Actually, don't we have to use !nonnull for pointers and !range for integers?
So nonnull would become !nonnull (how? this should be backwards-compatible).

@hanna-kruppe
Copy link
Contributor Author

Yes, I believe this would obsolete some attributes on formal parameters. For argument values at specific call sites you'd still use attributes, but I don't think callsite annotations are useful for nonnull specifically. So yeah, nonnull probably becomes obsolete with this change.

@hanna-kruppe
Copy link
Contributor Author

Reviewing the list of attributes in the Language Reference, metadata-on-parameters could also obsolete dereferencable(<n>) and dereferencable_or_null(<n>) if corresponding metadata is added.

Other optimization hints that could conceivably become metadata are noalias, nocapture and returned, but these are very function-parameter-specific so turning them into metadata that can be applied to other values may not make sense. There's !noalias metadata but it's not obvious to me whether the noalias attribute can be exactly expressed with that metadata.

@jonas-schievink jonas-schievink added the T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. label Mar 18, 2020
aykevl added a commit to tinygo-org/tinygo that referenced this issue May 27, 2020
I ran into an issue where I did a method call on a nil interface and it
resulted in a HardFault. Luckily I quickly realized what was going on so
I could fix it, but I think undefined behavior is definitely the wrong
behavior in this case. This commit therefore changes such calls to cause
a nil panic instead of introducing undefined behavior.

This does have a code size impact. It's relatively minor, much lower
than I expected. When comparing the before and after of the drivers
smoke tests (probably the most representative sample available), I found
that most did not change at all and those that did change, normally not
more than 100 bytes (16 or 32 byte changes are typical).

Right now the pattern is the following:

    switch typecode {
    case 1:
        call method 1
    case 2:
        call method 2
    default:
        nil panic
    }

I also tried the following (in the hope that it would be easier to
optimize), but it didn't really result in a code size reduction:

    switch typecode {
    case 1:
        call method 1
    case 2:
        call method 2
    case 0:
        nil panic
    default:
        unreachable
    }

Some code got smaller, while other code (the majority) got bigger. Maybe
this can be improved once range[1] is finally allowed[2] on function
parameters, but it'll probably take a while before that is implemented.

[1]: https://llvm.org/docs/LangRef.html#range-metadata
[2]: rust-lang/rust#50156
deadprogram pushed a commit to tinygo-org/tinygo that referenced this issue May 28, 2020
I ran into an issue where I did a method call on a nil interface and it
resulted in a HardFault. Luckily I quickly realized what was going on so
I could fix it, but I think undefined behavior is definitely the wrong
behavior in this case. This commit therefore changes such calls to cause
a nil panic instead of introducing undefined behavior.

This does have a code size impact. It's relatively minor, much lower
than I expected. When comparing the before and after of the drivers
smoke tests (probably the most representative sample available), I found
that most did not change at all and those that did change, normally not
more than 100 bytes (16 or 32 byte changes are typical).

Right now the pattern is the following:

    switch typecode {
    case 1:
        call method 1
    case 2:
        call method 2
    default:
        nil panic
    }

I also tried the following (in the hope that it would be easier to
optimize), but it didn't really result in a code size reduction:

    switch typecode {
    case 1:
        call method 1
    case 2:
        call method 2
    case 0:
        nil panic
    default:
        unreachable
    }

Some code got smaller, while other code (the majority) got bigger. Maybe
this can be improved once range[1] is finally allowed[2] on function
parameters, but it'll probably take a while before that is implemented.

[1]: https://llvm.org/docs/LangRef.html#range-metadata
[2]: rust-lang/rust#50156
@workingjubilee workingjubilee added the C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such label Oct 8, 2023
bors added a commit to rust-lang-ci/rust that referenced this issue Jul 31, 2024
Add range attribute to scalar function results and arguments

as LLVM 19 adds the range attribute this starts to use it for better optimization.
hade been interesting to see a perf run with the rust-lang#127513

closes rust-lang#50156
cc rust-lang#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization.

r​? `@nikic`
bors added a commit to rust-lang-ci/rust that referenced this issue Aug 7, 2024
Add range attribute to scalar function results and arguments

as LLVM 19 adds the range attribute this starts to use it for better optimization.
hade been interesting to see a perf run with the rust-lang#127513

closes rust-lang#50156
cc rust-lang#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization.

r​? `@nikic`
bors added a commit to rust-lang-ci/rust that referenced this issue Aug 9, 2024
Add range attribute to scalar function results and arguments

as LLVM 19 adds the range attribute this starts to use it for better optimization.
hade been interesting to see a perf run with the rust-lang#127513

closes rust-lang#50156
cc rust-lang#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization.

r​? `@nikic`
bors added a commit to rust-lang-ci/rust that referenced this issue Aug 9, 2024
Add range attribute to scalar function results and arguments

as LLVM 19 adds the range attribute this starts to use it for better optimization.
hade been interesting to see a perf run with the rust-lang#127513

closes rust-lang#50156
cc rust-lang#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization.

try-job: i686-gnu
bors added a commit to rust-lang-ci/rust that referenced this issue Aug 10, 2024
Add range attribute to scalar function results and arguments

as LLVM 19 adds the range attribute this starts to use it for better optimization.
hade been interesting to see a perf run with the rust-lang#127513

closes rust-lang#50156
cc rust-lang#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization.
bors added a commit to rust-lang-ci/rust that referenced this issue Aug 12, 2024
Add range attribute to scalar function results and arguments

as LLVM 19 adds the range attribute this starts to use it for better optimization.
hade been interesting to see a perf run with the rust-lang#127513

closes rust-lang#50156
cc rust-lang#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization.
@bors bors closed this as completed in e08b80c Aug 12, 2024
lnicola pushed a commit to lnicola/rust-analyzer that referenced this issue Aug 13, 2024
Add range attribute to scalar function results and arguments

as LLVM 19 adds the range attribute this starts to use it for better optimization.
hade been interesting to see a perf run with the rust-lang/rust#127513

closes rust-lang/rust#50156
cc rust-lang/rust#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization.
RalfJung pushed a commit to RalfJung/miri that referenced this issue Aug 14, 2024
Add range attribute to scalar function results and arguments

as LLVM 19 adds the range attribute this starts to use it for better optimization.
hade been interesting to see a perf run with the rust-lang/rust#127513

closes rust-lang/rust#50156
cc rust-lang/rust#49572 shall be fixed but not possible to see as there is asserts that already trigger the optimization.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-enhancement Category: An issue proposing an enhancement or a PR with one. C-optimization Category: An issue highlighting optimization opportunities or PRs implementing such T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. WG-llvm Working group: LLVM backend code generation
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants