-
Notifications
You must be signed in to change notification settings - Fork 641
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
disable default decimation for nonlinear materials #2413
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM.
After this PR is merged, I will update the docs, #2411, and the related unit tests where we manually disabled DFT decimation since this is no longer necessary.
For some reason, this PR is causing
The simulation simply hangs at this step. An important clue is that the deadlock only occurs when this test is run with 2 or more chunks. The serial run is unaffected. |
@@ -204,10 +204,15 @@ dft_chunk *fields::add_dft(component c, const volume &where, const double *freq, | |||
double freq_max = 0; | |||
for (size_t i = 0; i < Nfreq; ++i) | |||
freq_max = std::max(freq_max, std::abs(freq[i])); | |||
if ((freq_max > 0) && (src_freq_max > 0)) | |||
if ((freq_max > 0) && (src_freq_max > 0) && !has_nonlinearities(false)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling fields::has_nonlinearities(bool parallel)
with a false
argument means that the or_to_all
statement in fields.cpp:670
(below) is never invoked. That would mean fields::has_nonlinearities(bool parallel)
returns different values for different chunks. This would seem to be a bug.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, because of the subsequent min_to_all
statement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I replace !has_nonlinearities(false)
with !has_nonlinearities(true)
, test_adjoint_solver
deadlocks when run with two chunks (as reported previously). This seems to be due to the or_to_all
call which currently is never invoked.
It's not obvious why or_to_all
triggers a deadlock but min_to_all
does not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason why it deadlocks is that has_nonlinearities
here only gets executed if (freq_max > 0) && (src_freq_max > 0)
is true, because &&
is a short-circuiting operator.
In the DFT adjoint simulation, where the adjoint sources get placed with add_srcdata
, which is only called on chunks where the sources are present, some of the chunks are missing information about the sources and have src_freq_max == 0
. In consequence, those chunks are not calling has_nonlinearities
, so it deadlocks if parallel == true
and or_to_all
is called.
In contrast, the min_to_all
is always called on every chunk, because it is outside of these conditionals, so it does not deadlock.
Moreover, this means that the new min_to_all
potentially fixes bugs even in linear simulations, since it was previously possible that some linear simulations over-estimated the decimation_factor
on chunks for which no sources were present.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would have made the code more readable if, inside the block if (decimation_factor == 0) { ... }
we had simply inserted:
if has_nonlinearities()
decimation_factor = 1
else {
// set decimation_factor automatically...
}
The bool
argument to fields::has_nonlinearities()
is not necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I added the bool
argument is that we need to call min_to_all
anyway, even for linear simulations, to make sure we don't over-estimate the decimation_factor
in cases where a chunk is missing the source term present only in another chunk. In that case, doing or_to_all
is redundant, and is not cheap because it is a second MPI reduction.
See #2411. Fixes #2408.