From f579dc7ca1b02f7c8ac5656465025814f71553af Mon Sep 17 00:00:00 2001 From: Ardavan Oskooi Date: Fri, 28 Jun 2019 10:02:04 -0700 Subject: [PATCH] FAQ on possible performance degradation on shared-memory systems (#939) --- doc/docs/Build_From_Source.md | 2 +- doc/docs/FAQ.md | 4 ++++ doc/docs/Installation.md | 2 +- doc/docs/Synchronizing_the_Magnetic_and_Electric_Fields.md | 4 ++-- 4 files changed, 8 insertions(+), 4 deletions(-) diff --git a/doc/docs/Build_From_Source.md b/doc/docs/Build_From_Source.md index 42b185eb1..289a3cb25 100644 --- a/doc/docs/Build_From_Source.md +++ b/doc/docs/Build_From_Source.md @@ -4,7 +4,7 @@ The main effort in installing Meep lies in installing the various dependency packages. This requires some understanding of how to install software on Unix systems. -It is also possible to install Meep on Windows systems. For Windows 10, you can install the [Ubuntu 16.04](https://www.microsoft.com/en-us/p/ubuntu-1604-lts/9pjn388hp8c9) or [18.04](https://www.microsoft.com/en-us/p/ubuntu/9nblggh4msv6) terminal as an app (via the [Windows Subsystem for Linux](https://docs.microsoft.com/en-us/windows/wsl/about) framwork) and then follow the instructions for [obtaining the Conda packages](Installation.md#conda-packages) (recommended) or [building from source](Build_From_Source.md#building-from-source). For Windows 8 and older versions, you can use the free Unix-compatibility environment [Cygwin](http://www.cygwin.org/) following these [instructions](http://novelresearch.weebly.com/installing-meep-in-windows-8-via-cygwin.html). +It is also possible to install Meep on Windows systems. For Windows 10, you can install the [Ubuntu 16.04](https://www.microsoft.com/en-us/p/ubuntu-1604-lts/9pjn388hp8c9) or [18.04](https://www.microsoft.com/en-us/p/ubuntu/9nblggh4msv6) terminal as an app (via the [Windows Subsystem for Linux](https://docs.microsoft.com/en-us/windows/wsl/about) framework) and then follow the instructions for [obtaining the Conda packages](Installation.md#conda-packages) (recommended) or [building from source](Build_From_Source.md#building-from-source). For Windows 8 and older versions, you can use the free Unix-compatibility environment [Cygwin](http://www.cygwin.org/) following these [instructions](http://novelresearch.weebly.com/installing-meep-in-windows-8-via-cygwin.html). For those installing Meep on a supercomputer, a note of caution: most supercomputers have multiple compilers installed, and different versions of libraries compiled with different compilers. Meep is written in C++, and it is almost impossible to mix C++ code compiled by different compilers — pick one set of compilers by one vendor and stick with it consistently. diff --git a/doc/docs/FAQ.md b/doc/docs/FAQ.md index e6c274fc3..73f3300e9 100644 --- a/doc/docs/FAQ.md +++ b/doc/docs/FAQ.md @@ -405,6 +405,10 @@ Note: a simple approach to reduce the cost of the DTFT computation is to reduce You can always run the MPI parallel Meep on a shared-memory machine, and some MPI implementations take special advantage of shared memory communications. Meep currently also provides limited support for [multithreading](https://en.wikipedia.org/wiki/Thread_(computing)#Multithreading) via OpenMP on a single, shared-memory, multi-core machine to speed up *multi-frequency* [near-to-far field](Python_User_Interface.md#near-to-far-field-spectra) calculations involving `get_farfields` or `output_farfields`. +### Why does the time-stepping rate fluctuate erratically for jobs running on a shared-memory system? 
+ +Running jobs may experience intermittent slowdown on [shared-memory](https://en.wikipedia.org/wiki/Shared_memory) systems (Issue [#882](https://github.com/NanoComp/meep/issues/882)). This may be due to [cache contention](https://en.wikipedia.org/wiki/Resource_contention) with other simultaneously running jobs, although the cause has yet to be determined. The slowdown can be observed as increasing values of the time-stepping rate (in units of "s/step"), which is shown as part of the progress output. + Usage: Other ------------ diff --git a/doc/docs/Installation.md b/doc/docs/Installation.md index dc199a0dc..6adad25c4 100644 --- a/doc/docs/Installation.md +++ b/doc/docs/Installation.md @@ -52,7 +52,7 @@ Now, `python -c 'import meep'` should work, and you can try running some of the conda install -c conda-forge openblas=0.3.4 ``` -Warning: The `pymeep` package is built to work with OpenBLAS, which means numpy should also use OpenBLAS. Since the default numpy is built with MKL, installing other packages into the environment may cause conda to switch to an MKL-based numpy. This can cause segmentation faults when calling MPB. To work around this, you can make sure the `no-mkl` conda package is installed, make sure you're getting packages from the `conda-forge` channel (they use OpenBLAS for everything), or as a last resort, run `import meep` before importing any other library that is linked to MKL. When installing additional packages into the `meep` environment, you should always try to install using the `-c conda-forge` flag. `conda` can ocassionally be too eager in updating packages to new versions which can leave the environment unstable. If running `conda install -c conda-forge ` attempts to replace `conda-forge` packages with equivalent versions from the `defaults` channel, you can force it to only use channels you specify (i.e., arguemnts to the `-c` flag) with the `--override-channels` flag. +Warning: The `pymeep` package is built to work with OpenBLAS, which means numpy should also use OpenBLAS. Since the default numpy is built with MKL, installing other packages into the environment may cause conda to switch to an MKL-based numpy. This can cause segmentation faults when calling MPB. To work around this, you can make sure the `no-mkl` conda package is installed, make sure you're getting packages from the `conda-forge` channel (they use OpenBLAS for everything), or as a last resort, run `import meep` before importing any other library that is linked to MKL. When installing additional packages into the `meep` environment, you should always try to install using the `-c conda-forge` flag. `conda` can occasionally be too eager in updating packages to new versions which can leave the environment unstable. If running `conda install -c conda-forge ` attempts to replace `conda-forge` packages with equivalent versions from the `defaults` channel, you can force it to only use channels you specify (i.e., arguments to the `-c` flag) with the `--override-channels` flag.
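As an aside, here is a minimal Python sketch of the last-resort workaround mentioned in the warning above, namely importing `meep` before any MKL-linked library; the use of `numpy` as the follow-on import is only an illustration of a package that may be MKL-based in a given environment.

```python
# Sketch of the import-order workaround described in the warning above.
# Assumption: numpy stands in for any library in the environment that may
# be linked against MKL.
import meep as mp   # import Meep (built against OpenBLAS) first
import numpy as np  # then import libraries that may be MKL-linked
```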
Installing parallel PyMeep follows the same pattern, but the package "build string" must be specified to bring in the MPI variant: diff --git a/doc/docs/Synchronizing_the_Magnetic_and_Electric_Fields.md b/doc/docs/Synchronizing_the_Magnetic_and_Electric_Fields.md index 994951426..14e686093 100644 --- a/doc/docs/Synchronizing_the_Magnetic_and_Electric_Fields.md +++ b/doc/docs/Synchronizing_the_Magnetic_and_Electric_Fields.md @@ -4,7 +4,7 @@ In the finite-difference time-domain method, the electric and magnetic fields are stored at *different times* (and different positions in space), in a [leapfrog](https://en.wikipedia.org/wiki/Leapfrog_integration) fashion. At any given time-step $t$ during the simulation, the **E** and **D** fields are stored at time $t$, but the **H** and **B** fields are stored at time $t-\Delta t/2$ (where $\Delta t$ is the time-step size). -This means that when you output the electric and magnetic fields from a given time step, for example, the fields actually correspond to times $\Delta t/2$ apart. For most purposes, this slight difference in time doesn't actually matter much, but it makes a difference when you compute quantities like the Poynting flux $\mathrm{Re}\{\mathbf{E}^*\times\mathbf{H}\}$ that combine electric and magnetic fields together, e.g. for the `output_poynting` (Python) or `output-poynting` (Scheme) function. If what you really want is the Poynting flux $\mathbf{S}(t)$ at time *t*, then computing $\mathrm{Re}\{\mathbf{E}(t)^*\times\mathbf{H}(t-\Delta t/2)\}$ is slightly off from this — the error is of order $O(\Delta t)$, or first-order accuracy. This is unfortunate, because the underlying FDTD method ideally can have second-order accuracy. +This means that when you output the electric and magnetic fields from a given time step, for example, the fields actually correspond to times $\Delta t/2$ apart. For most purposes, this slight difference in time doesn't actually matter much, but it makes a difference when you compute quantities like the Poynting flux $\mathrm{Re}\{\mathbf{E}^*\times\mathbf{H}\}$ that combine electric and magnetic fields together, e.g. for the `output_poynting` (Python) or `output-poynting` (Scheme) function. If what you really want is the Poynting flux $\mathbf{S}(t)$ at time *t*, then computing $\mathrm{Re}\{\mathbf{E}(t)^*\times\mathbf{H}(t-\Delta t/2)\}$ is slightly off from this — the error is of order $O(\Delta t)$, or [first-order accuracy](https://en.wikipedia.org/wiki/Finite_difference_method#Accuracy_and_order). This is unfortunate, because the underlying FDTD method ideally can have second-order accuracy. To improve the accuracy for computations involving both electric and magnetic fields, Meep provides a facility to synchronize the **H** and **B** fields with the **E** and **D** fields in time. Technically, what it does is to compute the magnetic fields at time $t+\Delta t/2$ by performing part of a timestep, and then averaging those fields with the fields at time $t-\Delta t/2$. This produces the magnetic fields at time *t* to second-order accuracy $O(\Delta t^2)$, which is the best we can do in second-order FDTD. Meep also saves a copy of the magnetic fields at $t-\Delta t/2$, so that it can restore those fields for subsequent timestepping. @@ -41,4 +41,4 @@ it will output the same quantities, but more accurately because the fields will Alternatively, if you want to synchronize the magnetic and electric fields in some context other than that of a step function, e.g. 
you are doing some computation like `integrate_field_function` (Python) or `integrate-field-function` (Scheme) outside of the timestepping, you can instead call two lower-level functions. Before doing your computations, you should call `meep.Simulation.fields.synchronize_magnetic_fields()` (Python) or `(meep-fields-synchronize-magnetic-fields fields)` (Scheme) to synchronize the magnetic fields with the electric fields, and after your computation you should call `meep.Simulation.fields.restore_magnetic_fields()` (Python) or `(meep-fields-restore-magnetic-fields fields)` (Scheme) to restore the fields to their unsynchronized state for timestepping. In the C++ interface, these correspond to `fields::synchronize_magnetic_fields` and `fields::restore_magnetic_fields`. If you *don't* call `meep.Simulation.fields.restore_magnetic_fields` or `meep-fields-restore-magnetic-fields` before timestepping, then the fields will be re-synchronized after *every* timestep, which will greatly increase the cost of timestepping. -In future versions, we may decide to synchronize the fields automatically whenever you output something like the Poynting vector or do another field computation that involves both magnetic and electric fields, but currently you must do this manually. In any case, Meep does no additional work when you nest synchronization calls, so it is harmless to insert redundant field synchronizations. The `flux_in_box` (Python) or `flux-in-box` (Scheme) and `field_energy_in_box` (Python) or `field-energy-in-box` (Scheme) routines are already automatically synchronized, however. \ No newline at end of file +In future versions, the fields may be synchronized automatically whenever you output something like the Poynting vector or do another field computation that involves both magnetic and electric fields, but currently you must do this manually (Issue [#719](https://github.com/NanoComp/meep/issues/719)). In any case, Meep does no additional work when you nest synchronization calls, so it is harmless to insert redundant field synchronizations. The `flux_in_box` (Python) or `flux-in-box` (Scheme) and `field_energy_in_box` (Python) or `field-energy-in-box` (Scheme) routines are already automatically synchronized, however. \ No newline at end of file
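For concreteness, here is a minimal Python sketch of the two lower-level calls described above wrapped around an `integrate_field_function` computation performed outside of timestepping. Only the `fields.synchronize_magnetic_fields()`, `fields.restore_magnetic_fields()`, and `integrate_field_function` calls come from the text; the cell, source, and integrand are hypothetical placeholders.

```python
import meep as mp

# Hypothetical 2d setup: the cell size, resolution, and source below are
# illustrative only and not taken from the documentation above.
sim = mp.Simulation(cell_size=mp.Vector3(4, 4),
                    resolution=20,
                    sources=[mp.Source(mp.ContinuousSource(frequency=0.5),
                                       component=mp.Ez,
                                       center=mp.Vector3())])
sim.run(until=50)

# Synchronize H/B with E/D before a computation that combines electric and
# magnetic fields outside of the timestepping loop...
sim.fields.synchronize_magnetic_fields()
ez_hy = sim.integrate_field_function([mp.Ez, mp.Hy],
                                     lambda r, ez, hy: ez * hy)
# ...and restore the unsynchronized fields before any further timestepping.
sim.fields.restore_magnetic_fields()

print("integral of Ez*Hy over the cell (synchronized):", ez_hy)
```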