Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NVPTX: Avoid PassMode::Direct for args in C abi #117671

Merged
merged 1 commit into from
May 28, 2024

Conversation

kjetilkjeka
Copy link
Contributor

Fixes #117480

I must admit that I'm confused about PassMode altogether, is there a good sum-up threads for this anywhere? I'm especially confused about how "indirect" and "byval" goes together. To me it seems like "indirect" basically means "use a indirection through a pointer", while "byval" basically means "do not use indirection through a pointer".

The return used to keep PassMode::Direct for small aggregates. It turns out that make_indirect messes up the tests and one way to fix it is to keep PassMode::Direct for all aggregates. I have mostly seen this PassMode mentioned for args. Is it also a problem for returns? When experimenting with byval as an alternative i ran into this assert

I have added tests for the same kind of types that is already tested for the "ptx-kernel" abi. The tests cannot be enabled until something like #117458 is completed and merged.

CC: @RalfJung since you seem to be the expert on this and have already helped me out tremendously

CC: @RDambrosio016 in case this influence your work on rustc_codegen_nvvm

@rustbot label +O-NVPTX

@rustbot
Copy link
Collaborator

rustbot commented Nov 7, 2023

r? @compiler-errors

(rustbot has picked a reviewer for you, use r? to override)

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Nov 7, 2023
@kjetilkjeka
Copy link
Contributor Author

@rustbot label +O-NVPTX

@rustbot rustbot added the O-NVPTX Target: the NVPTX LLVM backend for running rust on GPUs, https://llvm.org/docs/NVPTXUsage.html label Nov 7, 2023
@RalfJung
Copy link
Member

RalfJung commented Nov 7, 2023

I must admit that I'm confused about PassMode altogether, is there a good sum-up threads for this anywhere?

Sadly, no. The best we have are the API docs. I improved them a bit recently, but we are still lacking some proper "big-picture" docs for this. largely this is because while the system was designed with some big-picture ideas ~5 years ago, those were not written down at the time and since then it's been hotfixed and patched without regard for a coherent story. The entire system needs an overhaul. See rust-lang/compiler-team#672: the compiler team accepted the proposal to overhaul this, but we're still looking for a volunteer to do the work.

CC: @RalfJung since you seem to be the expert on this and have already helped me out tremendously

😂 I knew nothing about this until around 3 months ago, and spent way too much time on it since.^^ I still don't feel like an expert, since all I know is reverse engineered from looking at parts of rustc. There are many things about ABI I do not know. But at this point I think I know at least how rustc deals with ABI, and I know some necessary (but not sufficient) conditions for ABI adjustments to make sense.

@RalfJung
Copy link
Member

RalfJung commented Nov 7, 2023

Sadly, for this PR, I can't really tell if it makes any sense. But if it helps fix the assertions and also still generates working code for your tests, then it's at least an improvement...

@bjorn3
Copy link
Member

bjorn3 commented Nov 8, 2023

I think you will want to use PassMode::Cast for aggregate types.

@kjetilkjeka
Copy link
Contributor Author

kjetilkjeka commented Nov 8, 2023

I did experiment with casting to Uniform with Reg size equal to alignment and size equal to size. This caused miscompilation when passing types like

#[repr(C)]
pub struct SingleU8 {
    f: u8,
}

The problem was that since it was passed as a single register of size 8bit it passed it like a primitive type u8 instead of an aggregate. In ptx these primitives are padded while aggregates are not.

@bjorn3
Copy link
Member

bjorn3 commented Nov 9, 2023

In ptx these primitives are padded while aggregates are not.

😱

@compiler-errors
Copy link
Member

Sorry, too busy to review this 😰

r? compiler

@rustbot rustbot assigned cjgillot and unassigned compiler-errors Nov 16, 2023
@cjgillot
Copy link
Contributor

r? compiler

@rustbot rustbot assigned davidtwco and unassigned cjgillot Nov 19, 2023
@bors
Copy link
Contributor

bors commented Nov 19, 2023

☔ The latest upstream changes (presumably #117500) made this pull request unmergeable. Please resolve the merge conflicts.

Copy link
Member

@davidtwco davidtwco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for the delay in getting to this - we don't have an expert in NVPTX on the team, so I'll take a stab at reviewing this so it isn't stuck forever. It seems reasonable to me.

Does this mean that NVPTX can be re-enabled in tests/ui/abi/compatibility.rs now? If so, could you do that in this PR? Otherwise let me know and I'll approve this.

@kjetilkjeka
Copy link
Contributor Author

When rebasing there are unfortunately regressions on some of the tests related to returning structs. I will look into re-enabling the compatibility test when I have investigated these regressions.

If there's any silver lining I guess it is that it highlights the importance of #117458 and getting the nvptx tests up and running again in CI.

@RalfJung
Copy link
Member

RalfJung commented Dec 7, 2023

Does this mean that NVPTX can be re-enabled in tests/ui/abi/compatibility.rs now? If so, could you do that in this PR? Otherwise let me know and I'll approve this.

Yeah it should be re-enabled, otherwise the claim that it fixes the issue is not accurate.

@apiraino
Copy link
Contributor

Checking status of this PR. @kjetilkjeka is this PR waiting on #117458? Switching the review flag for a rebase, thanks.

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jan 25, 2024
@kjetilkjeka
Copy link
Contributor Author

Yes, this is waiting for #117458

It turned out to be more difficult than anticipated to get correct. I will of course spend the time to figure it out, but when I manage to get it right I would hate for it to break again. Therefore I would love to wait until the nvptx tests are enabled in CI so I can write tests and be confident that things will keep working.

@apiraino apiraino added the S-blocked Status: Marked as blocked ❌ on something else such as an RFC or other implementation work. label Jan 30, 2024
@Dylan-DPC Dylan-DPC removed the S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. label Feb 17, 2024
@OwenTrokeBillard
Copy link

Is this still blocked now that #117458 is merged?

@apiraino
Copy link
Contributor

@kjetilkjeka when you have a sec. can you look at this patch again? Thanks

@rustbot author

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-blocked Status: Marked as blocked ❌ on something else such as an RFC or other implementation work. labels Apr 24, 2024
@kjetilkjeka
Copy link
Contributor Author

It's unblocked now 🎆 . I will take a look at this at the end of this week or early next week.

@kjetilkjeka
Copy link
Contributor Author

kjetilkjeka commented May 10, 2024

After looking into this my conclusions is that it's impossible to both avoid PassMode::Direct for the C ABI and generate correct ptx.

An example that illustrates this is returning a type in in this way

#[repr(C)]
pub struct SingleU8 {
    f: u8,
}

#[no_mangle]
pub unsafe extern "C" fn f_single_u8_ret() -> SingleU8 {
    SingleU8 { f: 42 }
}

Which should give the following signature in ptx.

.visible .func  (.param .align 1 .b8 func_retval0[1]) f_single_u8_ret()

The only two PassModes that give the correct signature is Direct and Indirect{on_stack: true}. For arguments it's possible to use Indirect{on_stack: true} but that is not the case for returns.

PassMode::Cast casted to an uniform does the correct thing for many types, but unfortunately not for single element structs (it passes like it was the underlying type, not the struct itself).

The solution is probably to have something like the CAbiType described in this issue and a PassMode::CAbiType(CAbiType) that can pass like a concrete C type. But everything related to this ABI compatibility things is unfortunately too complex for me to handle.

That means that this PR avoids the use of PassMode::Direct in argument position. But unfortunately doesn't manage to solve it in return position. I have added tests for both return and argument position, but the test for return position is ignored.

Copy link
Member

@davidtwco davidtwco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like an improvement even if it doesn't fix everything so LGTM.

@davidtwco
Copy link
Member

@bors r+ rollup

@bors
Copy link
Contributor

bors commented May 28, 2024

📌 Commit ead02ba has been approved by davidtwco

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels May 28, 2024
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request May 28, 2024
…ct, r=davidtwco

NVPTX: Avoid PassMode::Direct for args in C abi

Fixes rust-lang#117480

I must admit that I'm confused about `PassMode` altogether, is there a good sum-up threads for this anywhere? I'm especially confused about how "indirect" and "byval" goes together. To me it seems like "indirect" basically means "use a indirection through a pointer", while "byval" basically means "do not use indirection through a pointer".

The return used to keep `PassMode::Direct` for small aggregates. It turns out that `make_indirect` messes up the tests and one way to fix it is to keep `PassMode::Direct` for all aggregates. I have mostly seen this PassMode mentioned for args. Is it also a problem for returns? When experimenting with `byval` as an alternative i ran into [this assert](https://github.com/rust-lang/rust/blob/61a3eea8043cc1c7a09c2adda884e27ffa8a1172/compiler/rustc_codegen_llvm/src/abi.rs#L463C22-L463C22)

I have added tests for the same kind of types that is already tested for the "ptx-kernel" abi. The tests cannot be enabled until something like rust-lang#117458 is completed and merged.

CC: `@RalfJung` since you seem to be the expert on this and have already helped me out tremendously

CC: `@RDambrosio016` in case this influence your work on `rustc_codegen_nvvm`

`@rustbot` label +O-NVPTX
bors added a commit to rust-lang-ci/rust that referenced this pull request May 28, 2024
…iaskrgr

Rollup of 6 pull requests

Successful merges:

 - rust-lang#117671 (NVPTX: Avoid PassMode::Direct for args in C abi)
 - rust-lang#124251 (Add an intrinsic for `ptr::metadata`)
 - rust-lang#125573 (Migrate `run-make/allow-warnings-cmdline-stability` to `rmake.rs`)
 - rust-lang#125590 (Add a "Setup Python" action for github-hosted runners and remove unnecessary `CUSTOM_MINGW` environment variable)
 - rust-lang#125598 (Make `ProofTreeBuilder` actually generic over `Interner`)
 - rust-lang#125637 (rustfmt fixes)

r? `@ghost`
`@rustbot` modify labels: rollup
bors added a commit to rust-lang-ci/rust that referenced this pull request May 28, 2024
…iaskrgr

Rollup of 5 pull requests

Successful merges:

 - rust-lang#117671 (NVPTX: Avoid PassMode::Direct for args in C abi)
 - rust-lang#125573 (Migrate `run-make/allow-warnings-cmdline-stability` to `rmake.rs`)
 - rust-lang#125590 (Add a "Setup Python" action for github-hosted runners and remove unnecessary `CUSTOM_MINGW` environment variable)
 - rust-lang#125598 (Make `ProofTreeBuilder` actually generic over `Interner`)
 - rust-lang#125637 (rustfmt fixes)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 713c852 into rust-lang:master May 28, 2024
6 checks passed
@rustbot rustbot added this to the 1.80.0 milestone May 28, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request May 28, 2024
Rollup merge of rust-lang#117671 - kjetilkjeka:nvptx_c_abi_avoid_direct, r=davidtwco

NVPTX: Avoid PassMode::Direct for args in C abi

Fixes rust-lang#117480

I must admit that I'm confused about `PassMode` altogether, is there a good sum-up threads for this anywhere? I'm especially confused about how "indirect" and "byval" goes together. To me it seems like "indirect" basically means "use a indirection through a pointer", while "byval" basically means "do not use indirection through a pointer".

The return used to keep `PassMode::Direct` for small aggregates. It turns out that `make_indirect` messes up the tests and one way to fix it is to keep `PassMode::Direct` for all aggregates. I have mostly seen this PassMode mentioned for args. Is it also a problem for returns? When experimenting with `byval` as an alternative i ran into [this assert](https://github.com/rust-lang/rust/blob/61a3eea8043cc1c7a09c2adda884e27ffa8a1172/compiler/rustc_codegen_llvm/src/abi.rs#L463C22-L463C22)

I have added tests for the same kind of types that is already tested for the "ptx-kernel" abi. The tests cannot be enabled until something like rust-lang#117458 is completed and merged.

CC: ``@RalfJung`` since you seem to be the expert on this and have already helped me out tremendously

CC: ``@RDambrosio016`` in case this influence your work on `rustc_codegen_nvvm`

``@rustbot`` label +O-NVPTX
@RalfJung
Copy link
Member

Unfortunately this did not fully fix the problem, see #117480 (comment).

//@ assembly-output: ptx-linker
//@ compile-flags: --crate-type cdylib -C target-cpu=sm_86 -Z unstable-options -Clinker-flavor=llbc
//@ only-nvptx64
//@ ignore-nvptx64
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this have to be ignored? You are still using PassMode::Direct for the return value, so I thought that would still generate the right ABI?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unfortunately slightly broken (from before this PR) since there never were any tests for this.

I will enable these tests in the follow up PR we talked about that I hopefully will be able to create today

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Enabled in #125980

workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Jun 12, 2024
…ssmode, r=davidtwco

Nvptx remove direct passmode

This PR does what should have been done in rust-lang#117671. That is fully avoid using the `PassMode::Direct` for `extern "C" fn` for `nvptx64-nvidia-cuda` and enable the compatibility test. `@RalfJung` [pointed me in the right direction](rust-lang#117480 (comment)) for solving this issue.

There are still some ABI bugs after this PR is merged. These ABI tests are created based on what is actually correct, and since they continue passing with even more of them enabled things are improving. I don't have the time to tackle all the remaining issues right now, but I think getting these improvements merged is very valuable in themselves and plan to tackle more of them long term.

This also doesn't remove the use of `PassMode::Direct` for `extern "ptx-kernel" fn`. This was also not trivial to make work. And since the ABI is hidden behind an unstable feature it's less urgent.

I don't know if it's correct to request `@RalfJung` as a reviewer (due to team structures), but he helped me a lot to figure out this stuff. If that's not appropriate then `@davidtwco` would be a good candidate since he know about this topic from rust-lang#117671

r​? `@RalfJung`
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Jun 12, 2024
Rollup merge of rust-lang#125980 - kjetilkjeka:nvptx_remove_direct_passmode, r=davidtwco

Nvptx remove direct passmode

This PR does what should have been done in rust-lang#117671. That is fully avoid using the `PassMode::Direct` for `extern "C" fn` for `nvptx64-nvidia-cuda` and enable the compatibility test. `@RalfJung` [pointed me in the right direction](rust-lang#117480 (comment)) for solving this issue.

There are still some ABI bugs after this PR is merged. These ABI tests are created based on what is actually correct, and since they continue passing with even more of them enabled things are improving. I don't have the time to tackle all the remaining issues right now, but I think getting these improvements merged is very valuable in themselves and plan to tackle more of them long term.

This also doesn't remove the use of `PassMode::Direct` for `extern "ptx-kernel" fn`. This was also not trivial to make work. And since the ABI is hidden behind an unstable feature it's less urgent.

I don't know if it's correct to request `@RalfJung` as a reviewer (due to team structures), but he helped me a lot to figure out this stuff. If that's not appropriate then `@davidtwco` would be a good candidate since he know about this topic from rust-lang#117671

r​? `@RalfJung`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-NVPTX Target: the NVPTX LLVM backend for running rust on GPUs, https://llvm.org/docs/NVPTXUsage.html S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

nvptx abi adjustment may keep PassMode::Direct for small aggregates in conv extern "C"