Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

disable default decimation for nonlinear materials #2413

Merged
merged 4 commits into from
Feb 26, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 6 additions & 1 deletion src/dft.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -204,10 +204,15 @@ dft_chunk *fields::add_dft(component c, const volume &where, const double *freq,
double freq_max = 0;
for (size_t i = 0; i < Nfreq; ++i)
freq_max = std::max(freq_max, std::abs(freq[i]));
if ((freq_max > 0) && (src_freq_max > 0))
if ((freq_max > 0) && (src_freq_max > 0) && !has_nonlinearities(false))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Calling fields::has_nonlinearities(bool parallel) with a false argument means that the or_to_all statement in fields.cpp:670 (below) is never invoked. That would mean fields::has_nonlinearities(bool parallel) returns different values for different chunks. This would seem to be a bug.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because of the subsequent min_to_all statement.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I replace !has_nonlinearities(false) with !has_nonlinearities(true), test_adjoint_solver deadlocks when run with two chunks (as reported previously). This seems to be due to the or_to_all call which currently is never invoked.

It's not obvious why or_to_all triggers a deadlock but min_to_all does not.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason why it deadlocks is that has_nonlinearities here only gets executed if (freq_max > 0) && (src_freq_max > 0) is true, because && is a short-circuiting operator.

In the DFT adjoint simulation, where the adjoint sources get placed with add_srcdata, which is only called on chunks where the sources are present, some of the chunks are missing information about the sources and have src_freq_max == 0. In consequence, those chunks are not calling has_nonlinearities, so it deadlocks if parallel == true and or_to_all is called.

In contrast, the min_to_all is always called on every chunk, because it is outside of these conditionals, so it does not deadlock.

Moreover, this means that the new min_to_all potentially fixes bugs even in linear simulations, since it was previously possible that some linear simulations over-estimated the decimation_factor on chunks for which no sources were present.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would have made the code more readable if, inside the block if (decimation_factor == 0) { ... } we had simply inserted:

if has_nonlinearities()
   decimation_factor = 1
else {
  // set decimation_factor automatically...
}

The bool argument to fields::has_nonlinearities() is not necessary.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I added the bool argument is that we need to call min_to_all anyway, even for linear simulations, to make sure we don't over-estimate the decimation_factor in cases where a chunk is missing the source term present only in another chunk. In that case, doing or_to_all is redundant, and is not cheap because it is a second MPI reduction.

decimation_factor = std::max(1, int(std::floor(1 / (dt * (freq_max + src_freq_max)))));
else
decimation_factor = 1;

// with add_srcdata sources, it's possible that not all
// sources are present on all chunks, leading us to over-estimate
// the allowed decimation_factor -- take minimimum to be sure:
decimation_factor = min_to_all(decimation_factor);
}
data.decimation_factor = decimation_factor;

Expand Down
7 changes: 7 additions & 0 deletions src/fields.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -663,6 +663,13 @@ void fields_chunk::use_real_fields() {
}
}

bool fields::has_nonlinearities(bool parallel) const {
bool nonlinear = false;
for (int i = 0; i < num_chunks; i++)
if (chunks[i]->is_mine()) nonlinear = nonlinear || chunks[i]->s->has_nonlinearities();
return parallel ? or_to_all(nonlinear) : nonlinear;
}

int fields::phase_in_material(const structure *snew, double time) {
if (snew->num_chunks != num_chunks)
meep::abort("Can only phase in similar sets of chunks: %d vs %d\n", snew->num_chunks,
Expand Down
9 changes: 8 additions & 1 deletion src/meep.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -105,7 +105,9 @@ class susceptibility {
int get_id() const { return id; }
bool operator==(const susceptibility &s) const { return id == s.id; };

// Returns the 1st order nonlinear susceptibility (generic)
virtual bool has_nonlinearities() const { return next ? next->has_nonlinearities() : false; }

// Returns the 1st order (linear) susceptibility (generic)
virtual std::complex<realnum> chi1(realnum freq, realnum sigma = 1);

// update all of the internal polarization state given the W field
Expand Down Expand Up @@ -345,6 +347,8 @@ class multilevel_susceptibility : public susceptibility {
virtual susceptibility *clone() const { return new multilevel_susceptibility(*this); }
virtual ~multilevel_susceptibility();

virtual bool has_nonlinearities() const { return true; }

virtual void update_P(realnum *W[NUM_FIELD_COMPONENTS][2],
realnum *W_prev[NUM_FIELD_COMPONENTS][2], realnum dt, const grid_volume &gv,
void *P_internal_data) const;
Expand Down Expand Up @@ -616,6 +620,8 @@ class structure_chunk {
pml_profile_func pml_profile, void *pml_profile_data, double pml_profile_integral,
double pml_profile_integral_u);

bool has_nonlinearities() const;

void add_susceptibility(material_function &sigma, field_type ft, const susceptibility &sus);

void mix_with(const structure_chunk *, double);
Expand Down Expand Up @@ -1753,6 +1759,7 @@ class fields {
void reset();
void log(const char *prefix = "");
void change_m(double new_m);
bool has_nonlinearities(bool parallel = true) const;

// time.cpp
std::vector<double> time_spent_on(time_sink sink);
Expand Down
10 changes: 10 additions & 0 deletions src/structure.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -545,6 +545,16 @@ bool structure_chunk::has_chi1inv(component c, direction d) const {
return is_mine() && chi1inv[c][d] && !trivial_chi1inv[c][d];
}

bool structure_chunk::has_nonlinearities() const {
bool nonlinear = false;
if (!is_mine()) return false;
FOR_COMPONENTS(c) { nonlinear = nonlinear || chi2[c] || chi3[c]; }
FOR_FIELD_TYPES(ft) {
if (chiP[ft]) nonlinear = nonlinear || chiP[ft]->has_nonlinearities();
}
return nonlinear;
}

void structure::mix_with(const structure *oth, double f) {
if (num_chunks != oth->num_chunks)
meep::abort("You can't phase materials with different chunk topologies...\n");
Expand Down