Autodiff Upstreaming - single commit #129175

Draft
wants to merge 1 commit into
base: master

ZuseZ4
Contributor

@ZuseZ4 ZuseZ4 commented Aug 17, 2024

Tracking issue:

This PR/Commit is just here to show the whole picture. I'll also push individual PRs that are easier to review, starting with the backend which due to bootstrapping is the part I am the least confident about.

The documentation currently is hosted at https://enzyme.mit.edu/index.fcgi/rust/. Please see especially the Installation chapter for build instructions (For rustc devs these shouldn't be surprising).

The tests are currently hosted at https://github.com/EnzymeAD/rustbook. To simplify reviewing I would suggest we first focus on upstreaming the code, and then I'll move over the tests once that's done.

A few of the lines were contributed by other people who helped me with my fork. Since this work touches multiple parts of rustc, it was occasionally easier to reimplement the changes than to rebase them. Once individual PRs are approved I'll look up the history and restore author information where needed.

The first individual PR for the backend (builds on its own) is here: #129176
The frontend PR is here: #129458

@rustbot
Collaborator

rustbot commented Aug 17, 2024

r? @michaelwoerister

rustbot has assigned @michaelwoerister.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot
Collaborator

rustbot commented Aug 17, 2024

⚠️ Warning ⚠️

  • These commits modify submodules.

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Aug 17, 2024
@jackh726
Member

This is cool. Over the medium to long-term, though, I think we should work to make the interface in the compiler agnostic to whatever backend/tool is used. I know that's a big ask, and I think it's valuable to land this work in the meantime. But, I don't think we want to commit ourselves to Enzyme or to e.g. supporting autodiff in bootstrap.

@michaelwoerister michaelwoerister removed their assignment Aug 19, 2024
@michaelwoerister michaelwoerister removed the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Aug 19, 2024
@ZuseZ4
Contributor Author

ZuseZ4 commented Aug 19, 2024

Agreed, both my Autodiff and GPU offloading work could be handled in a crate, given better access to compiler internals. Java's reflection proposal even includes Autodiff as an example (though they still don't handle all necessary features), and in Julia land, Enzyme.jl is a package that just has a dependency on their compiler package GPUCompiler.jl (terrible name, independent of GPUs these days). The equivalent of my rustc-gpu work is KA.jl (KernelAbstractions), which also works as a normal package for Julia. I talked with Niko about which compiler internals I would need access to.
That being said, I have developed this fork for 3 years and it becomes exhausting to keep an out-of-tree fork up to date, which is why I'm really happy that these got approved as experimental nightly features.

@jieyouxu jieyouxu added the S-experimental Status: Ongoing experiment that does not require reviewing and won't be merged in its current state. label Aug 23, 2024
bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 6, 2024
…san68

Autodiff Upstreaming - enzyme backend

Tracking issue: rust-lang#124509

Part of rust-lang#129175

This PR should allow building Enzyme from source on Tier 1 targets (when also building LLVM), except MSVC.
It's only a small fraction (~200 lines) of the whole upstream PR, but due to bootstrapping and the number of configurations in which rustc can be built, I assume that this will be the hardest to merge, so I'm starting with it.
Happy to hear what changes are required to be able to upstream this code.

**Content:**
It contains a new configure flag `--enable-llvm-enzyme`, and will build the new Enzyme submodule when it is set.

**Discussion:**
Apparently Rust CI isn't able to clone repositories outside the rust-lang org? At least I'm seeing this error in CI:
```
git@github.com: Permission denied (publickey).
fatal: Could not read from remote repository.

Please make sure you have the correct access rights
and the repository exists.
```
Does that mean we would need to mirror github.com/EnzymeAD/Enzyme in rust-lang, until LLVM upgrades Enzyme from an Incubator project to something that ships as part of the monorepo?

Tracking:

- rust-lang#124509
@jieyouxu jieyouxu added the F-autodiff `#![feature(autodiff)]` label Sep 7, 2024
lnicola pushed a commit to lnicola/rust-analyzer that referenced this pull request Sep 25, 2024
Autodiff Upstreaming - enzyme backend

bors added a commit to rust-lang-ci/rust that referenced this pull request Sep 30, 2024
add has_enzyme/needs-enzyme to the test infra

This unblocks merging the Enzyme / Autodiff frontend.
For the full implementation, see: rust-lang#129175

We don't want to run tests that require Enzyme / Autodiff support when we build rustc without the required features.

It correctly filtered out a test which started with `//@ needs-enzyme`.
```
running 80 tests
i...............................................................................

test result: ok. 79 passed; 0 failed; 1 ignored; 0 measured; 0 filtered out; finished in 380.41ms
```
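
As a rough illustration of the filtering described above, a harness can scan a test file for `//@` directives and skip the test when `needs-enzyme` is present but the compiler was built without Enzyme. The sketch below is hypothetical: `TestOutcome`, `decide`, and the `has_enzyme` parameter are names invented for this example and are not the actual test-infra implementation.

```rust
/// Outcome of the directive check for a single test file.
#[derive(Debug, PartialEq, Eq)]
pub enum TestOutcome {
    Run,
    Ignore,
}

/// Hypothetical helper: ignore a test that carries a `//@ needs-enzyme`
/// directive when the compiler under test was built without Enzyme.
pub fn decide(test_source: &str, has_enzyme: bool) -> TestOutcome {
    let needs_enzyme = test_source
        .lines()
        .map(str::trim)
        // Directives are comment lines of the form `//@ <name>`.
        .filter_map(|line| line.strip_prefix("//@"))
        .any(|directive| directive.trim() == "needs-enzyme");

    if needs_enzyme && !has_enzyme {
        TestOutcome::Ignore
    } else {
        TestOutcome::Run
    }
}
```

With this shape, a test without the directive always runs, and a test with it runs only when `has_enzyme` is true.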

Tracking:

- rust-lang#124509

r? jieyouxu
RalfJung pushed a commit to RalfJung/miri that referenced this pull request Oct 3, 2024
add has_enzyme/needs-enzyme to the test infra

bors added a commit to rust-lang-ci/rust that referenced this pull request Oct 13, 2024
Autodiff Upstreaming - enzyme frontend

This is an upstream PR for the `autodiff` rustc_builtin_macro that is part of the autodiff feature.

For the full implementation, see: rust-lang#129175

**Content:**
It contains a new `#[autodiff(<args>)]` rustc_builtin_macro, as well as a `#[rustc_autodiff]` builtin attribute.
The autodiff macro is applied on function `f` and will expand to a second function `df` (name given by user).
It will add a dummy body to `df` to make sure it type-checks. The body will later be replaced by Enzyme on the LLVM-IR level, so we don't really care about its content. Most of the changes (700 of 1.2k) are in `compiler/rustc_builtin_macros/src/autodiff.rs`, which expands the macro. Nothing except expansion is implemented for now.
I have a fallback implementation for the relevant functions in case rustc is built without autodiff support. The default for now will be off, although we want to flip it to on for nightly later (once everything has landed). For the sake of CI, I have flipped the defaults; I'll revert this before merging.
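
One common way to provide such a fallback is conditional compilation: two modules with the same interface, selected by a `cfg` flag. The sketch below only illustrates that pattern. The cfg name `llvm_enzyme_sketch`, the module name, and `differentiate` are made up for this example and are not taken from the PR.

```rust
/// Real bindings: compiled only when the (hypothetical) cfg is set.
#[cfg(llvm_enzyme_sketch)]
mod enzyme_ffi {
    pub const AVAILABLE: bool = true;

    pub fn differentiate(name: &str) -> Result<String, String> {
        // A real build would call into Enzyme here.
        Ok(format!("d_{name}"))
    }
}

/// Fallback with the same interface. Nothing in here references Enzyme,
/// so nothing needs to be linked.
#[cfg(not(llvm_enzyme_sketch))]
mod enzyme_ffi {
    pub const AVAILABLE: bool = false;

    pub fn differentiate(name: &str) -> Result<String, String> {
        Err(format!(
            "cannot differentiate `{name}`: built without autodiff support"
        ))
    }
}

pub use enzyme_ffi::{differentiate, AVAILABLE};
```

Callers only see `differentiate` and `AVAILABLE`, so the rest of the code compiles unchanged in both configurations.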

**Dummy function Body:**
The first line is an `inline_asm` nop to make inlining less likely (I have additional checks to prevent this in the middle end of rustc). If `f` gets inlined too early, we can't pass it to Enzyme and thus can't differentiate it.
If `df` gets inlined too early, the call site will just compute this dummy code instead of the derivatives, which is a correctness issue. The following `black_box` lines make sure that none of the input arguments gets optimized away before we replace the body.

**Motivation:**
The user-facing autodiff macro can verify the user input. I then write it as args to the rustc_attribute, so from there on I know that these values should be sensible. A rustc_attribute also turned out to be quite nice for attaching this information to the corresponding function and carrying it to the backend.
This is also just an experiment; I expect to adjust the user-facing autodiff macro based on user feedback to improve usability.

As a simple example of what this will do, we can see this expansion:
From:
```
#[autodiff(df, Reverse, Duplicated, Const, Active)]
pub fn f1(x: &[f64], y: f64) -> f64 {
    unimplemented!()
}
```
to
```
#[rustc_autodiff]
#[inline(never)]
pub fn f1(x: &[f64], y: f64) -> f64 {
    ::core::panicking::panic("not implemented")
}
#[rustc_autodiff(Reverse, Duplicated, Const, Active,)]
#[inline(never)]
pub fn df(x: &[f64], dx: &mut [f64], y: f64, dret: f64) -> f64 {
    unsafe { asm!("NOP"); };
    ::core::hint::black_box(f1(x, y));
    ::core::hint::black_box((dx, dret));
    ::core::hint::black_box(f1(x, y))
}
```
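
To make the `df` signature in the expansion above more concrete, here is a hand-written function with the same shape for one made-up choice of `f1`. It is a sketch only: the body of `f1` is an assumption (the original uses `unimplemented!()`), `df_by_hand` is a hypothetical name, and the code is written by hand rather than generated by Enzyme. It assumes that the `Duplicated` argument `x` is paired with a shadow slice `dx` that receives the gradient by accumulation, that `y` is treated as a constant, and that `dret` seeds the return value.

```rust
/// Example primal function (an assumption): f1(x, y) = y * sum(x_i^2).
pub fn f1(x: &[f64], y: f64) -> f64 {
    y * x.iter().map(|xi| xi * xi).sum::<f64>()
}

/// Hand-written stand-in for the generated `df`.
/// No length check is done here; `zip` simply stops at the shorter slice.
pub fn df_by_hand(x: &[f64], dx: &mut [f64], y: f64, dret: f64) -> f64 {
    for (dxi, xi) in dx.iter_mut().zip(x) {
        // d f1 / d x_i = 2 * y * x_i, scaled by the seed `dret`.
        *dxi += dret * 2.0 * y * xi;
    }
    // Return the primal result, matching the `-> f64` in the expansion.
    f1(x, y)
}
```

For `x = [1.0, 2.0, 3.0]`, `y = 2.0`, a zeroed `dx`, and `dret = 1.0`, this returns `28.0` and leaves `dx` as `[4.0, 8.0, 12.0]`.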
I will add a few more tests once I have figured out why rustc rebuilds every time I touch a test.

Tracking:

- rust-lang#124509
bors added a commit to rust-lang-ci/rust that referenced this pull request Oct 14, 2024
Autodiff Upstreaming - enzyme frontend

bors added a commit to rust-lang-ci/rust that referenced this pull request Oct 15, 2024
Autodiff Upstreaming - enzyme frontend


try-job: dist-x86_64-msvc
github-actions bot pushed a commit to rust-lang/miri that referenced this pull request Oct 17, 2024
Autodiff Upstreaming - enzyme frontend


try-job: dist-x86_64-msvc
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Dec 12, 2024
Autodiff Upstreaming - rustc_codegen_llvm changes

Now that the autodiff/Enzyme backend is merged, this is an upstream PR for the `rustc_codegen_llvm` changes.
It also includes small changes to three files under `compiler/rustc_ast`, which overlap with my frontend PR (rust-lang#129458).
Here I only include minimal definitions of structs and enums to be able to build this backend code.
The same goes for minimal changes to `compiler/rustc_codegen_ssa`, the majority of changes there will be in another PR, once either this or the frontend gets merged.

We currently have 68 files left to merge, 19 in the frontend PR, 21 (+3 from the frontend) in this PR, and then ~30 in the middle-end.

This PR is large because it includes two of my three large files (~800 loc each). I could also first only upstream enzyme_ffi.rs, but I think people might want to see some use of these bindings in the same PR?

To already highlight the things which reviewers might want to discuss:

1) `enzyme_ffi.rs`: I do have a fallback module to make sure that we don't link rustc against Enzyme when we build rustc without autodiff support.

2) `add_panic_msg_to_global` was a pain to write and I currently can't even use it. Enzyme writes gradients into shadow memory. Pass in one float scalar? We'll allocate and return an extra float telling you how this float affected the output. Pass in a slice of floats? We'll let you allocate the vector and pass in a mutable reference to a float slice; we'll then write the gradient into that slice. It should be at least as large as your original slice, so we check that and panic if not. Currently we panic silently, but I already generate a nicer panic message with this function. I just don't know how to print it to the user yet. I discussed this with a few rustc devs and the best we could come up with (for now) was to look for mangled panic calls in the IR and pick one, which works surprisingly reliably. If someone knows a good way to clean this up and print the panic message I'm all in; otherwise I can remove the code that writes the nicer panic message and keep the silent panic, since it's enough for soundness, especially since this PR is already a bit larger.

3) `SanitizeHWAddress`: When differentiating C++, Enzyme can use TBAA to "understand" enums/unions, but for Rust we don't have this information. LLVM might do speculative loads which (without TBAA) confuse Enzyme, so we disable those with this attribute. This attribute is only set during the first opt run before Enzyme differentiates code. We then remove it again once we are done with autodiff and run the opt pipeline a second time. Since enums are everywhere in Rust, support for them is crucial, but if this looks too cursed I can remove these ~100 lines and keep them in my fork for now; we can then discuss them separately to make this PR simpler?

4) Duplicated llvm-opt runs: Differentiating already optimized code (and being able to do additional optimizations on the fly, e.g. for GPU code) is _the_ reason why Enzyme is so fast, so the compile time is acceptable for autodiff users:  https://enzyme.mit.edu/talks/Publications/ (There are also algorithmic issues in Enzyme core which are more serious than running opt twice).

5) I assume that if we merge these minimal cg_ssa changes here already, I also need to fix the other backends (GCC and Cranelift) to have dummy implementations, correct?

6) *I'm happy to split this PR up further if reviewers have recommendations on how to.*
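
The shadow-length check from point 2 can be pictured in plain Rust as follows. This is a minimal sketch, not the PR's code: the PR performs the check in generated code, while `check_shadow_len` and `assert_shadow_len` are hypothetical helpers that only show the condition and one possible shape of a more descriptive panic message.

```rust
/// Hypothetical check: the shadow slice must be at least as long as the
/// primal slice so that every gradient entry has somewhere to go.
pub fn check_shadow_len(
    arg: &str,
    primal_len: usize,
    shadow_len: usize,
) -> Result<(), String> {
    if shadow_len >= primal_len {
        Ok(())
    } else {
        Err(format!(
            "shadow for `{arg}` has length {shadow_len}, but the primal has length {primal_len}"
        ))
    }
}

/// Panicking wrapper, so a too-short shadow fails with a message
/// instead of silently.
pub fn assert_shadow_len(arg: &str, primal: &[f64], shadow: &[f64]) {
    if let Err(msg) = check_shadow_len(arg, primal.len(), shadow.len()) {
        panic!("{msg}");
    }
}
```

Keeping the check separate from the panic makes the message easy to test without unwinding.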

For the full implementation, see: rust-lang#129175

Tracking:

- rust-lang#124509
compiler-errors added a commit to compiler-errors/rust that referenced this pull request Dec 13, 2024
Autodiff Upstreaming - rustc_codegen_llvm changes

Zalathar added a commit to Zalathar/rust that referenced this pull request Dec 13, 2024
Autodiff Upstreaming - rustc_codegen_llvm changes

Now that the autodiff/Enzyme backend is merged, this is an upstream PR for the `rustc_codegen_llvm` changes.
It also includes small changes to three files under `compiler/rustc_ast`, which overlap with my frontend PR (rust-lang#129458).
Here I only include minimal definitions of structs and enums to be able to build this backend code.
The same goes for minimal changes to `compiler/rustc_codegen_ssa`, the majority of changes there will be in another PR, once either this or the frontend gets merged.

We currently have 68 files left to merge, 19 in the frontend PR, 21 (+3 from the frontend) in this PR, and then ~30 in the middle-end.

This PR is large because it includes two of my three large files (~800 loc each). I could also first only upstream enzyme_ffi.rs, but I think people might want to see some use of these bindings in the same PR?

To highlight up front the things which reviewers might want to discuss:

1) `enzyme_ffi.rs`: I do have a fallback module to make sure that we don't link rustc against Enzyme when we build rustc without autodiff support.

2) `add_panic_msg_to_global` was a pain to write and I currently can't even use it. Enzyme writes gradients into shadow memory. Pass in one float scalar? We'll allocate and return an extra float telling you how this float affected the output. Pass in a slice of floats? We'll let you allocate the vector and pass in a mutable reference to a float slice; we'll then write the gradient into that slice. It should be at least as large as your original slice, so we check that and panic if not. Currently we panic silently, but I already generate a nicer panic message with this function. I just don't know how to print it to the user yet. I discussed this with a few rustc devs and the best we could come up with (for now) was to look for mangled panic calls in the IR and pick one, which works surprisingly reliably. If someone knows a good way to clean this up and print the panic message I'm all in. Otherwise I can remove the code that writes the nicer panic message and keep the silent panic, since it's enough for soundness, especially since this PR is already on the larger side.

3) `SanitizeHWAddress`: When differentiating C++, Enzyme can use TBAA to "understand" enums/unions, but for Rust we don't have this information. LLVM might do speculative loads which (without TBAA) confuse Enzyme, so we disable those with this attribute. The attribute is only set during the first opt run, before Enzyme differentiates code. We remove it again once we are done with autodiff and run the opt pipeline a second time. Since enums are everywhere in Rust, support for them is crucial, but if this looks too cursed I can remove these ~100 lines and keep them in my fork for now. We could then discuss them separately to keep this PR simpler.

4) Duplicated llvm-opt runs: Differentiating already optimized code (and being able to do additional optimizations on the fly, e.g. for GPU code) is _the_ reason why Enzyme is so fast, so the compile time is acceptable for autodiff users: https://enzyme.mit.edu/talks/Publications/ (There are also algorithmic issues in Enzyme core which are more serious than running opt twice.)

5) I assume that if we merge these minimal cg_ssa changes here already, I also need to fix the other backends (GCC and Cranelift) to have dummy implementations, correct?

6) *I'm happy to split this PR up further if reviewers have recommendations on how to do so.*
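
To make the shadow-memory convention from point 2 concrete, here is a hand-written sketch of what a derivative following that convention could look like. It does not use Enzyme or `#![feature(autodiff)]`; the functions `sum_of_squares`, `d_sum_of_squares` and `d_square` are hypothetical and written by hand purely for illustration, and the `seed` parameter (the incoming derivative of the output) is an assumption of this sketch.

```rust
/// Primal function: f(x) = sum of x_i^2.
pub fn sum_of_squares(x: &[f64]) -> f64 {
    x.iter().map(|v| v * v).sum()
}

/// Slice case: the caller allocates the shadow `dx` and passes it in mutably.
/// The gradient is accumulated into `dx` (an assumption of this sketch; with a
/// zeroed shadow this is the same as writing the gradient). Returns the primal.
pub fn d_sum_of_squares(x: &[f64], dx: &mut [f64], seed: f64) -> f64 {
    // The shadow must be at least as large as the primal slice,
    // otherwise we panic with a descriptive message.
    assert!(
        dx.len() >= x.len(),
        "shadow slice has length {} but the primal slice has length {}",
        dx.len(),
        x.len()
    );
    for (xi, dxi) in x.iter().zip(dx.iter_mut()) {
        // d/dx_i of x_i^2 is 2 * x_i, scaled by the incoming seed.
        *dxi += 2.0 * xi * seed;
    }
    sum_of_squares(x)
}

/// Scalar case: no shadow argument; the gradient comes back as an extra float.
/// Returns (primal value, gradient with respect to x).
pub fn d_square(x: f64, seed: f64) -> (f64, f64) {
    (x * x, 2.0 * x * seed)
}
```

Calling `d_sum_of_squares(&[1.0, 2.0, 3.0], &mut dx, 1.0)` with a zeroed `dx` of length 3 or more fills the first three entries with the gradient and leaves the rest untouched, while a shorter `dx` triggers the panic with the message from the `assert!`.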

For the full implementation, see: rust-lang#129175

Tracking:

- rust-lang#124509
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 13, 2024
Autodiff Upstreaming - rustc_codegen_llvm changes
Labels
F-autodiff `#![feature(autodiff)]` S-experimental Status: Ongoing experiment that does not require reviewing and won't be merged in its current state. T-bootstrap Relevant to the bootstrap subteam: Rust's build system (x.py and src/bootstrap) T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-libs Relevant to the library team, which will review and decide on the PR/issue.