openmp for near2far calculation #868

Merged
merged 6 commits into from
May 14, 2019

Conversation

stevengj

@stevengj stevengj commented May 9, 2019

This adds an experimental --with-openmp flag to the configure script that turns on OpenMP compilation, and uses it for the near2far calculation for part 1 of #861 (only parallelizing if you are computing for multiple frequencies, however).

To use it, ./configure --with-openmp and set the OMP_NUM_THREADS environment variable before running Meep (or Python).

In the long run, it would be nice to use more comprehensive OpenMP parallelism (#228).

Although this parallelizes over the different frequencies because that was simplest, we could alternatively parallelize over spatial points in the near or far field.
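As a rough illustration of the frequency-loop approach described above, here is a minimal sketch. It assumes `HAVE_OPENMP` is the macro defined by configure when `--with-openmp` is given; `farfield_at` and `farfields` are hypothetical stand-ins and not Meep's actual functions. Each iteration writes only its own output slot, so the loop needs no locking.

```cpp
#include <complex>
#include <vector>

// Hypothetical stand-in for the per-frequency far-field work (not Meep's API).
std::complex<double> farfield_at(double freq) {
  const double pi = 3.14159265358979323846;
  return std::exp(std::complex<double>(0.0, 2.0 * pi * freq));
}

// Computes one value per frequency; iterations are independent of each other.
std::vector<std::complex<double>> farfields(double freq_min, double dfreq, int nfreq) {
  std::vector<std::complex<double>> out(nfreq);
#ifdef HAVE_OPENMP  // assumed to be defined by configure --with-openmp
#pragma omp parallel for
#endif
  for (int i = 0; i < nfreq; ++i) {
    double freq = freq_min + i * dfreq;
    out[i] = farfield_at(freq);  // each thread writes a distinct element
  }
  return out;
}
```

With a single frequency (`nfreq == 1`) there is only one iteration, so this form of parallelism gives no benefit, which matches the caveat in the description.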

double freq = freq_min + i * dfreq;
#ifdef HAVE_OPENMP
# pragma omp parallel for
#endif

For some reason, I am unable to compile with OpenMP via --with-openmp on my local machine using this HAVE_OPENMP macro: the build succeeds, but the examples/binary_grating_n2f.py test uses just a single thread/process for the get_farfields calculation regardless of the value of OMP_NUM_THREADS. To fix this, I had to revert to using just #pragma omp parallel for (i.e., removing the #ifdef and #endif lines) and CXX="g++ -fopenmp". Perhaps this is related to the OpenMP-related changes to configure.ac in this PR?


Whoops, I forgot to add OPENMP_CXXFLAGS to CXXFLAGS … try it again.
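One way to check whether the OpenMP flags actually reached the compiler is a small diagnostic like the sketch below. It relies on the standard `_OPENMP` macro, which compilers define only when OpenMP is enabled (for example with `-fopenmp` in g++), and on `omp_get_max_threads()` from `<omp.h>`. The function names `compiled_with_openmp` and `max_openmp_threads` are hypothetical helpers for illustration and are not part of Meep.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// True only if this translation unit was compiled with OpenMP enabled.
bool compiled_with_openmp() {
#ifdef _OPENMP
  return true;
#else
  return false;
#endif
}

// Upper bound on threads a parallel region may use; 1 when OpenMP is off.
// With OpenMP on, this normally reflects OMP_NUM_THREADS if it is set.
int max_openmp_threads() {
#ifdef _OPENMP
  return omp_get_max_threads();
#else
  return 1;
#endif
}
```

If `compiled_with_openmp()` returns false in a build configured with `--with-openmp`, the pragmas are being ignored and the loop runs on one thread whatever `OMP_NUM_THREADS` says, which is consistent with the symptom reported above.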

@oskooi

oskooi commented May 10, 2019

This is working and giving linear speedup, as demonstrated previously in #861 and again as shown below for both serial and MPI (np=2) runs on a single, shared-memory, 8-core (Kaby Lake, 4.2 GHz) machine. The time for the get_farfields calculations was obtained directly from the "getting farfields" category of the Field time usage statistics following #856.

[Figure: n2f_openmp_semilogy_2]

[Figure: n2f_openmp_mpi_np2_semilogy]

As part of this PR, it would be good to also provide documentation in two places: (1) for the --with-openmp configure flag in Build from Source/Optional Dependencies and (2) for get_farfields and output_farfields in Python User Interface/Near to Far Field Spectra.

The related tutorial example Near to Far Field Spectra/Diffraction Spectrum of a Finite Binary Grating will be updated in a separate PR with these results.

@stevengj stevengj merged commit 6987396 into master May 14, 2019
@stevengj stevengj deleted the omp_near2far branch May 14, 2019 19:03
bencbartlett pushed a commit to bencbartlett/meep that referenced this pull request Sep 9, 2021
* try openmp for near2far loop

* add --with-openmp configure flag

* use --with and not --enable

* whoops need to add OPENMP_CXXFLAGS to CXXFLAGS

* docs

* typo