Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

disable default decimation for nonlinear materials #2413

Merged
merged 4 commits into from
Feb 26, 2023
Merged

Conversation

stevengj
Copy link
Collaborator

@stevengj stevengj commented Feb 25, 2023

See #2411. Fixes #2408.

Copy link
Collaborator

@oskooi oskooi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

After this PR is merged, I will update the docs, #2411, and the related unit tests where we manually disabled DFT decimation since this is no longer necessary.

@oskooi
Copy link
Collaborator

oskooi commented Feb 26, 2023

For some reason, this PR is causing python/tests/test_adjoint_solver.py to stall both on CI and on my local workstation:

*** TESTING DFT OBJECTIVE ***
Starting forward run...
-----------
Initializing structure...
Splitting into 2 chunks by voxels
time for choose_chunkdivision = 9.9726e-05 s
Working in 2D dimensions.
Computational cell is 5 x 5 x 0 with resolution 30
     block, center = (0,0,0)
          size (1e+20,1,1e+20)
          axes (1,0,0), (0,1,0), (0,0,1)
          dielectric constant epsilon diagonal = (12,12,12)
     block, center = (0,0,0)
          size (1.5,1.5,0)
          axes (1,0,0), (0,1,0), (0,0,1)
time for set_epsilon = 0.0698151 s
-----------
MPB solved for frequency_1(2.2349,0,0) = 0.657635 after 15 iters
MPB solved for frequency_1(2.19109,0,0) = 0.645164 after 8 iters
MPB solved for frequency_1(2.19108,0,0) = 0.645161 after 1 iters
run 0 finished at t = 430.8333333333333 (25850 timesteps)
Starting adjoint run...
WARNING vol mismatch:, original_vol 2162, looped vol_sum 1081 
WARNING vol mismatch:, original_vol 2162, looped vol_sum 2185 
WARNING vol mismatch:, original_vol 2209, looped vol_sum 1128 

The simulation simply hangs at this step. An important clue is that the deadlock only occurs when this test is run with 2 or more chunks. The serial run is unaffected.

@stevengj stevengj merged commit a6a1192 into master Feb 26, 2023
@stevengj stevengj deleted the decimate_nonlinear branch February 26, 2023 22:01
@@ -204,10 +204,15 @@ dft_chunk *fields::add_dft(component c, const volume &where, const double *freq,
double freq_max = 0;
for (size_t i = 0; i < Nfreq; ++i)
freq_max = std::max(freq_max, std::abs(freq[i]));
if ((freq_max > 0) && (src_freq_max > 0))
if ((freq_max > 0) && (src_freq_max > 0) && !has_nonlinearities(false))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling fields::has_nonlinearities(bool parallel) with a false argument means that the or_to_all statement in fields.cpp:670 (below) is never invoked. That would mean fields::has_nonlinearities(bool parallel) returns different values for different chunks. This would seem to be a bug.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because of the subsequent min_to_all statement.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I replace !has_nonlinearities(false) with !has_nonlinearities(true), test_adjoint_solver deadlocks when run with two chunks (as reported previously). This seems to be due to the or_to_all call which currently is never invoked.

It's not obvious why or_to_all triggers a deadlock but min_to_all does not.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason why it deadlocks is that has_nonlinearities here only gets executed if (freq_max > 0) && (src_freq_max > 0) is true, because && is a short-circuiting operator.

In the DFT adjoint simulation, where the adjoint sources get placed with add_srcdata, which is only called on chunks where the sources are present, some of the chunks are missing information about the sources and have src_freq_max == 0. In consequence, those chunks are not calling has_nonlinearities, so it deadlocks if parallel == true and or_to_all is called.

In contrast, the min_to_all is always called on every chunk, because it is outside of these conditionals, so it does not deadlock.

Moreover, this means that the new min_to_all potentially fixes bugs even in linear simulations, since it was previously possible that some linear simulations over-estimated the decimation_factor on chunks for which no sources were present.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would have made the code more readable if, inside the block if (decimation_factor == 0) { ... } we had simply inserted:

if has_nonlinearities()
   decimation_factor = 1
else {
  // set decimation_factor automatically...
}

The bool argument to fields::has_nonlinearities() is not necessary.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I added the bool argument is that we need to call min_to_all anyway, even for linear simulations, to make sure we don't over-estimate the decimation_factor in cases where a chunk is missing the source term present only in another chunk. In that case, doing or_to_all is redundant, and is not cheap because it is a second MPI reduction.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Third Harmonic Generation Tutorial gives different results in current release?
2 participants