
Fix MPI tests on Apple Silicon #1922

Open
sloede opened this issue Apr 29, 2024 · 10 comments
Labels: bug, parallelization, testing

Comments

sloede (Member) commented Apr 29, 2024

Not everything works on ARM:

LoadError: User-defined reduction operators are currently not supported on non-Intel architectures.
See JuliaParallel/MPI.jl#404 for more details.

see https://github.com/trixi-framework/Trixi.jl/actions/runs/8865649949/job/24342153523?pr=1562#step:7:1124

We need to update our CI settings to reflect this - can we still use older versions/architectures for some macOS tests?

Originally posted by @ranocha in #1562 (comment)

sloede added the bug label Apr 29, 2024
sloede (Member, Author) commented Apr 29, 2024

See also #1838.

ranocha added the testing and parallelization labels Apr 29, 2024
benegee (Contributor) commented Aug 21, 2024

I just came across this error when trying to run Trixi on a Grace Hopper node.

In my case it was caused by the MPI reductions for stepsize computation:

dt = MPI.Allreduce!(Ref(dt), min, mpi_comm())[]

A workaround is to use Base.min instead of Trixi's

@inline min(args...) = @fastmath min(args...)

But of course this forfeits the performance gain of the fastmath version.

Maybe we could select the min implementation depending on the current architecture?
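
For context, here is a minimal, self-contained sketch of the failure mode and the Base.min workaround (the local fastmin below merely stands in for Trixi's overridden min, and MPI.COMM_WORLD replaces Trixi's mpi_comm(); an illustration, not Trixi code):

using MPI

MPI.Init()
comm = MPI.COMM_WORLD

# Stand-in for Trixi's overridden min: any function other than Base.min forces
# MPI.jl to create a user-defined reduction operator, which errors on non-Intel
# architectures (see JuliaParallel/MPI.jl#404).
@inline fastmin(args...) = @fastmath min(args...)

dt = 1.0 / (MPI.Comm_rank(comm) + 1)  # rank-local candidate time step

# Fails on aarch64, because a custom operator would be needed for fastmin:
# dt = MPI.Allreduce!(Ref(dt), fastmin, comm)[]

# Works everywhere: MPI.jl maps Base.min to the built-in MPI.MIN operation.
dt = MPI.Allreduce!(Ref(dt), Base.min, comm)[]

println("rank ", MPI.Comm_rank(comm), ": global dt = ", dt)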

ranocha (Member) commented Aug 21, 2024

Could we use Base.min for the MPI reduction explicitly? I would not expect significant performance gains there with our version.

benegee (Contributor) commented Aug 22, 2024

Yes, this works! Also, not too many lines are affected: 2393f83
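
For illustration, the kind of change involved might look roughly like this (a sketch based on the Allreduce call quoted earlier, not the actual diff in 2393f83):

# Before: the unqualified min resolves to Trixi's @fastmath override, so MPI.jl
# has to build a user-defined operator (unsupported on non-Intel architectures):
# dt = MPI.Allreduce!(Ref(dt), min, mpi_comm())[]

# After: Base.min is mapped by MPI.jl to the built-in MPI.MIN operation:
dt = MPI.Allreduce!(Ref(dt), Base.min, mpi_comm())[]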

ranocha (Member) commented Aug 22, 2024

Could you please add some comments explaining why we need the Base. prefix there and make a PR?

ranocha (Member) commented Aug 27, 2024

Ping @benegee - would be a nice PR

benegee (Contributor) commented Sep 3, 2024

@vchuravy miraculously just created JuliaParallel/MPI.jl#871

This could be used to fix the follow-up issues that appeared in #2054.

vchuravy (Member) commented Sep 4, 2024

I think my PR is ready for a first test in a user package.

benegee (Contributor) commented Sep 12, 2024

Intermediate status

I split this into smaller issues and PRs:

  • LoopVectorization error in AMR on ARM #2075 is an entirely new error, which we encountered during our tests when switching to aarch64. So far I have no idea how to fix it.

  • Use Base.min / Base.max in MPI reductions #2054 (which solves my initial problem with the custom min and max operations on ARM) could be merged, in my opinion. Changing Trixi.min to Base.min is only a mild change and is not performance-critical here.

  • Use new MPI custom ops in mpi reduce #2066 is the first PR to resolve the remaining issues with SVectors in MPI reductions for the integrated quantities in the analysis callbacks. This PR is based on @vchuravy's magical new macro add macro to create custom Ops also on aarch64 JuliaParallel/MPI.jl#871. I cannot really judge this approach, but to me it seems very elegant: it requires only a few changes in our code, and it would be easy to define more complex custom reduction operators analogously in the future. It could also be used to solve the min/max issue above. Of course, the PR would have to be accepted into MPI.jl first.

  • Reinterpret SVector as pointer in mpi reduce #2067 is the second PR to resolve the SVector issue. It is based on @ranocha's and @vchuravy's suggestion to reinterpret the data as Ptr{Float64}. To me this seems like the C approach: it just works and does not require any additional tricks. The code currently looks clumsy because of the distinction between scalars and vectors, but some Julian wizard could probably come up with a single expression here. (A rough sketch of the underlying idea follows after this list.)
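
As a rough illustration of the idea behind the second approach (a hedged sketch only: it copies into a plain Vector{Float64} via collect instead of the Ptr{Float64} reinterpretation discussed in #2067, and the values are placeholders):

using MPI, StaticArrays

MPI.Init()
comm = MPI.COMM_WORLD

local_integrals = SVector(1.0, 2.0, 3.0)  # placeholder per-rank integrated quantities

# Hand MPI a plain Float64 buffer so that the built-in MPI.SUM applies and no
# custom reduction operator has to be created on aarch64.
buffer = collect(local_integrals)
MPI.Allreduce!(buffer, +, comm)
global_integrals = SVector{3, Float64}(buffer)

println("rank ", MPI.Comm_rank(comm), ": ", global_integrals)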

I could use some advice here on how to proceed. @vchuravy @ranocha @sloede

ranocha (Member) commented Sep 12, 2024

Please ping me for a review of the PR solving your current issue. We can discuss the rest later.
It would be good to hear from @vchuravy what he thinks about his MPI.jl PR to decide how to proceed with the other parts.
