speeding up near2far #861

Closed
2 tasks done
stevengj opened this issue May 7, 2019 · 4 comments

Comments

@stevengj
Collaborator

stevengj commented May 7, 2019

Two options:

For the first option, it would be nice to try it out. Note that you will need to compile with -fopenmp.

@stevengj
Collaborator Author

stevengj commented May 7, 2019

I added a branch omp_near2far with the changed loop (871b4ca). You will need to manually add the -fopenmp flag by configuring with

./configure CXX="g++ -fopenmp"

or similar.

Then you set the number of threads with the environment variable OMP_NUM_THREADS. For example:

OMP_NUM_THREADS=4 meep somefile.ctl
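For readers unfamiliar with this kind of change, here is a minimal sketch of what parallelizing such a loop with OpenMP can look like. It is not the code from the omp_near2far branch: the function far_fields, its parameters, and the toy phase factor are all hypothetical stand-ins chosen for illustration. The only point is that when each iteration writes to its own output element, a single pragma is enough to split the work across threads.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a near-to-far kernel (NOT Meep's actual loop):
// every far-field point accumulates a contribution from every source point.
std::vector<std::complex<double>> far_fields(
    const std::vector<double> &src_pos,                // source positions (1d, illustrative)
    const std::vector<std::complex<double>> &src_amp,  // source amplitudes
    const std::vector<double> &far_pos,                // far-field positions
    double k)                                          // wavenumber
{
  assert(src_pos.size() == src_amp.size());
  std::vector<std::complex<double>> out(far_pos.size());
  const long n = static_cast<long>(far_pos.size());

  // Each iteration writes only out[i], so iterations are independent and can
  // be distributed across threads. Without -fopenmp the pragma is ignored and
  // the loop simply runs serially.
  #pragma omp parallel for schedule(static)
  for (long i = 0; i < n; ++i) {
    const std::size_t ii = static_cast<std::size_t>(i);
    std::complex<double> sum(0.0, 0.0);
    for (std::size_t j = 0; j < src_pos.size(); ++j) {
      const double r = std::abs(far_pos[ii] - src_pos[j]);
      sum += src_amp[j] * std::polar(1.0, k * r);  // toy phase factor exp(i k r)
    }
    out[ii] = sum;
  }
  return out;
}
```

Built with g++ -fopenmp, the number of threads used by the loop is then controlled by OMP_NUM_THREADS, as in the command above.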

@oskooi
Collaborator

oskooi commented May 8, 2019

This seems to be working and gives a linear speedup, as shown in the figure below. The test times the get_farfields calculation in python/examples/binary_grating_n2f.py, with 21 frequencies and 500 points, for 5 values of OMP_NUM_THREADS (1-5) in two cases: (1) a unit cell with add_near2far using nperiods=20 and (2) a supercell with 41 unit cells and nperiods=1 (i.e., the default: no tiling). Note: since the total time for get_farfields is not yet directly available, pending #856, the values were computed by subtracting from the "Elapsed run time" all the components in "Field time usage" (e.g., "time stepping", "communicating", etc.).

[Figure: n2f_openmp_semilogy]
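The subtraction described in the note above, and the speedup it feeds into, can be written out as a small sketch. The helper names residual_time, speedup, and efficiency are hypothetical and are not part of Meep; the inputs would be the numbers read off the "Elapsed run time" and "Field time usage" output.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// Time not accounted for by the listed components, e.g. elapsed run time
// minus "time stepping", "communicating", etc. (hypothetical helper).
double residual_time(double elapsed, const std::vector<double> &components) {
  const double accounted =
      std::accumulate(components.begin(), components.end(), 0.0);
  assert(accounted <= elapsed);
  return elapsed - accounted;
}

// Speedup of an n-thread run relative to the 1-thread run.
double speedup(double t_one_thread, double t_n_threads) {
  assert(t_n_threads > 0.0);
  return t_one_thread / t_n_threads;
}

// Parallel efficiency: 1.0 corresponds to ideal linear speedup.
double efficiency(double t_one_thread, double t_n_threads, int n_threads) {
  assert(n_threads > 0);
  return speedup(t_one_thread, t_n_threads) / n_threads;
}
```

A linear speedup in this sense means efficiency stays close to 1.0 as OMP_NUM_THREADS increases.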

The only unusual thing was that it worked for the serial version (i.e., CXX="g++ -fopenmp") but not with the OpenMPI compiler (i.e., CXX="mpic++ -fopenmp").

@stevengj
Collaborator Author

stevengj commented May 9, 2019

If you use MPI, you need to set MPICXX="mpic++ -fopenmp" because of this line

@stevengj
Collaborator Author

stevengj commented May 9, 2019

Actually, the Fortran Amos routines are just for complex arguments, which we don't need. For real arguments and integer order, Julia is using the POSIX Bessel functions, which we should use too.
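As a rough illustration of the POSIX route, the functions jn and yn declared in math.h cover integer order and real argument. The wrapper names bessel_j and bessel_y below are hypothetical, and the sketch assumes a POSIX-like libm (e.g. glibc or macOS); it makes no claim about how Meep ends up calling these.

```cpp
#include <cassert>
#include <cmath>
#include <math.h>  // POSIX declares j0, j1, jn, y0, y1, yn here

// Bessel function of the first kind J_n(x), integer order, real argument.
double bessel_j(int n, double x) { return ::jn(n, x); }

// Bessel function of the second kind Y_n(x); only defined for x > 0.
double bessel_y(int n, double x) {
  assert(x > 0.0);
  return ::yn(n, x);
}
```

A quick sanity check is the recurrence J_{n-1}(x) + J_{n+1}(x) = (2n/x) J_n(x), which these wrappers should satisfy to rounding error.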
