Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enable single precision floating points for fields arrays #1544

Merged
merged 14 commits into from
Apr 14, 2021

Conversation

oskooi
Copy link
Collaborator

@oskooi oskooi commented Apr 1, 2021

Currently, all fields arrays (time domain and DFT) are double precision floating point. However, in the past Meep used to support single precision floating point for the fields arrays for reducing memory storage and improving runtime performance (due to the reduced memory bandwidth consumption). Unfortunately, this feature had bit rotted although its artifact still exists in src/meep.hpp:

meep/src/meep.hpp

Lines 31 to 43 in 0b460cd

/* We use the type realnum for large arrays, e.g. the fields.
For local variables and small arrays, we use double precision,
but for things like the fields we can often get away with
single precision (since the errors are not dominated by roundoff).
However, we will default to using double-precision for large
arrays, as the factor of two in memory and the moderate increase
in speed currently don't seem worth the loss of precision. */
#define MEEP_SINGLE 0 // 1 for single precision, 0 for double
#if MEEP_SINGLE
typedef float realnum;
#else
typedef double realnum;
#endif

This PR restores this feature by replacing numerous instances of double, complex<double> with realnum, complex<realnum>. Both the time-domain and the DFT fields arrays are set to the realnum type.

The PR compiles and all tests pass with MEEP_SINGLE set to 0 in src/meep.hpp (the default). This just means that this PR does not break anything.

When this feature is enabled, all but two of the nineteen C++ tests are passing (which is expected since some tests are comparing results against hard-coded values obtained using double precision):

PASS: aniso_disp
PASS: bench
PASS: bragg_transmission
PASS: convergence_cyl_waveguide
PASS: cylindrical
PASS: flux
PASS: harmonics
PASS: integrate
PASS: known_results
FAIL: near2far
PASS: one_dimensional
PASS: physical
PASS: stress_tensor
FAIL: symmetry
PASS: three_d
PASS: two_dimensional
PASS: 2D_convergence
PASS: h5test
PASS: pml
============================================================================
# TOTAL: 19
# PASS:  17
# SKIP:  0
# XFAIL: 0
# FAIL:  2
# XPASS: 0
# ERROR: 0

This PR also causes several of the Python tests to fail:

FAIL: tests/3rd_harm_1d.py
PASS: tests/absorber_1d.py
FAIL: tests/adjoint_solver.py
PASS: tests/antenna_radiation.py
PASS: tests/array_metadata.py
PASS: tests/bend_flux.py
PASS: tests/binary_grating.py
FAIL: tests/cavity_arrayslice.py
FAIL: tests/cavity_farfield.py
PASS: tests/chunk_layout.py
FAIL: tests/chunks.py
PASS: tests/cyl_ellipsoid.py
PASS: tests/dft_energy.py
PASS: tests/dft_fields.py
PASS: tests/diffracted_planewave.py
FAIL: tests/dispersive_eigenmode.py
FAIL: tests/divide_mpi_processes.py

I haven't yet performed any benchmarking to verify that this PR actually speeds up the time stepping.

Importantly, we'll need to investigate what effect reducing the floating point precision has on the adjoint solver since its unit test is failing.

@oskooi
Copy link
Collaborator Author

oskooi commented Apr 2, 2021

I modified the DFT fields to be always stored using double precision floating point. This way, it's only the precision of the time-domain fields that can be modified using the preprocessor flag.

@oskooi
Copy link
Collaborator Author

oskooi commented Apr 5, 2021

In order to ensure that the unit tests for the adjoint solver in meep/python/tests/adjoint_solver.py pass when using single-precision floating point for the fields arrays, the MaterialGrid related parameters (i.e., weights, beta, eta, etc.) in meepgeom.hpp/meepgeom.cpp/material_data.hpp have all been converted to double. Some of these parameters had type realnum which was causing the tests to fail.

Also, the API for the HDF5 routines write_chunk/read_chunk in h5file.cpp which are used to save/load the DFT fields has been modified to include an additional boolean parameter single_precision. This is because the DFT fields are now always stored using double precision and thus need to be written to disk as such independent of the value of realnum.

Note that all material (ε) related quantities of the structure_chunk class including chi1inv, condinv, chi3, etc. and susceptibility class including sigma and multilevel_susceptibility class including Gamma, N0, etc. which are multiplied by the fields arrays are stored as realnum type.

All tests are still passing when the macro MEEP_SINGLE is 0.

@stevengj
Copy link
Collaborator

stevengj commented Apr 7, 2021

If only for the sake of utilizing SIMD as much as possible, we should try to make every quantity in the inner loops of step_curl etcetera a meep::realnum, including e.g. dtdx and siginv.

(Everything else, e.g. material grids, DFTs, etcetera, we can leave as double. That simplifies the code, and in the case of DFTs may help accuracy since they are accumulated over many timesteps.)

src/meep.hpp Show resolved Hide resolved
@stevengj
Copy link
Collaborator

stevengj commented Apr 7, 2021

Eventually, it would be good to be able to switch precision at runtime, by making fields<T>, fields_chunk<T>, structure<T>, and structure_chunk<T> template classes parameterized by the realnum type T.

@stevengj
Copy link
Collaborator

stevengj commented Apr 7, 2021

Would be good to plot error vs. resolution (for a convergence test similar to the subpixel averaging papers) and reflection vs. PML thickness (similar to the PML papers), in both double and single precision. They should match, until you get to very high resolutions / long lengths where the single-precision results will become limited by roundoff errors. It would be nice to see at what point that occurs.

@oskooi
Copy link
Collaborator Author

oskooi commented Apr 8, 2021

The quantities of the inner loops of step_curl have been changed to type meep::realnum for consistency.

Also, the h5file class member functions write/read/write_chunk/read_chunk use single precision by default.

src/meep.hpp Outdated Show resolved Hide resolved
@oskooi
Copy link
Collaborator Author

oskooi commented Apr 11, 2021

The following is a plot of the error in a resonant mode of a ring resonator vs. resolution showing the two cases of single vs. double precision. (This is the same example from the subpixel smoothing tutorial.)

ring_resonator_float_vs_double

The two sets of results are nearly identical up to the fifth decimal digit at the largest resolution of 700. The resolution=700 run took 8.5 hours using 14 MPI processes for the double precision case and so I didn't bother extending the resolution range even further to identify the point at which the single and double precision begin to diverge. The main point here is that there is no loss in accuracy when using single precision.

In terms of benchmarking the runtime performance, I used the 3d OLED example which includes PMLs, DFT flux monitors, and Lorentzian susceptibilities. The timestepping rate for the single precision using 20 MPI processes was ~50% that of double precision.

For consistency, the remaining functions not addressed in this PR which should also be modified to return realnum rather than the current double are: get_array_slice, get_field, get_chi1inv, etc. Since these functions do not affect the performance as they are not generally called at each timestep over a large collection of grid points, they can be addressed in a separate PR (if necessary).

bencbartlett pushed a commit to bencbartlett/meep that referenced this pull request Sep 9, 2021
)

* remove cdouble type

* replace double,complex<double> with realnum,complex<realnum>

* always store DFT fields using double precision floating point

* fixes

* store all floating point parameters of MaterialGrid as doubles

* specify HDF5 read/write type format using single_precision parameter

* read_chunk/write_chunk in structure_dump.cpp use correct precision

* switch dft_ldos arrays to type complex<double> from complex<realnum>

* convert PML arrays sig,kap,siginv to realnum from double

* use single precision by default for h5file member functions write/read and related

* create two versions of read_chunk/write_chunk with input arrays of type float and double

* convert gyrotropic_susceptibility parameters to realnum

* minor formatting fixes

* update comment for single precision usage in meep.hpp
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants