Improve minimization and fix bug in vqc #219

Closed
jderber-NOAA opened this issue Sep 30, 2021 · 5 comments · Fixed by #242

@jderber-NOAA
Contributor

jderber-NOAA commented Sep 30, 2021

Code is in my branch minimfixes.

The minimization has not worked well in certain cases (see issue #154). Various small fixes to the analysis code have been made to improve convergence, and they are collected in this issue for inclusion in the master. Included in these changes is a bug fix for the vqc that has caused some resets in the minimization process. Also included is a slight reformulation of the cg minimization (results should be identical). This reformulation allows a Non-Linear (NL) diagnostic to be printed in each iteration. As long as NL << 1, the magnitude of the nonlinearity in the minimization process should not be an issue. More details follow below. Also included are some no-impact optimization changes (removal of unnecessary code).

Changes:

  1. Removal of the reset_predictors_var. Routine removed from berror.f90 and call removed from setuprhsall.f90. Resetting the variances of the predictors in the outer iteration should not be done. It will result in an incompatibility of the x and y solutions after the first outer iteration. Also, there is no reason the background variances should change after an outer iteration. The background variances should be set from the previous analysis.

  2. Changes to set_predictors_var in berror.f90. Making the background error variances dependent on the number of observations in the current analysis does not make sense, as in point 1. Removing this dependency results in a significant simplification of the code. Note, the output background error variances remain dependent on the number of observations.

  3. Removal of redundant calls to genqsat within crtm_interface. qsat from ges_qsat used instead.

  4. Remove checks for temperature being less than 1.e-8 in genqsat. If you have temperatures that low, you have other problems.

  5. In observer.F90 comment out code that does nothing.

  6. Clean up and modify pcgsoi.f90
    a. Remove unused input parameter lanlerr from multb.
    b. Change order of calculations in b calculation to allow calculation of NL parameter.
    For the numerator of the b calculation, the dot product (vprecond(i)*(gradx%values(i)-xdiff%values(i)), grady) is calculated. For a quadratic J, the second term (vprecond(i)*xdiff%values(i), grady) should be equal to zero. So in the new code this quantity divided by the first term (vprecond(i)*gradx%values(i), grady) becomes a measure of the non-linearity (NL). When NL is close to 1, the nonlinearity is large (or there is a coding error) and a reset is likely. Note that before the bug in the vqc was fixed (see below), NL values were often close to 1 and resets occurred. With the removal of the errors, the NL values are small and resets are rare. NL is written each iteration to fort.220.
    c. The maximum value for b is increased from 7. to 10.

  7. prt_guess.f90 - clean up formatting a bit and change mpi_allgather to mpi_gather since result only needed on processor 0.

  8. setupcldtot.f90 - remove external :: genqsat since not used.

  9. setupq.f90

When there are many hurricane recon observations in clouds, the reading code can produce supersaturated observations (note the original observations are RH and are limited to 100%) since the observed temperature (and thus qsat) is larger than the model value. This creates issues because the minimization is pushing the q values towards the observations, while the supersaturation constraint is pushing back. To get around this, the observed q value is limited to the saturated value. Note that the q value can then change with the outer iteration, since qsat can change. This is put on a flag. If you want this on, set limitqobs = .true.

  1. stpcalc.f90. Set up flag for when to collect penalties to reduce calculations (it makes a difference!) and rearrange the code to make it cleaner.

  2. stplimq.f90. Fix improper use of integer zero.

  3. stpw.f90. Bug fix for vqc. In the original code, the vqc part calculates the vqc weight based on u^2+v^2. In intw, u and v are treated separately. Changed to treat the u and v parts separately to be consistent with intw. Removal of this error resulted in slightly different results and the removal of some (all?) resets.

  4. setupq, intlimq, stplimq and a few other places where q may be limited by qsat. Currently the code limits the values of q to the saturated values. Because t (and thus qsat) changes by iteration and because of the changes to setupq above, a change was included which allows q to be larger than the saturated value. A namelist parameter "superfact" was added so that the limit becomes superfact*qsat. The default for superfact is 1.0, which gives the same results as before. A larger value reduces the number of observations whose observed q is limited by the saturated value in setupq and reduces the impact of the supersaturation limit in the constraint. However, changes are small since a 1% change in the supersaturated value is small.
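A minimal Python sketch of how the limitqobs flag and the superfact factor could interact. The function names, the argument layout, and the assumption that limitqobs is off by default are illustrative only; this is not the GSI Fortran code.

```python
def limit_q_obs(q_obs, qsat, limitqobs=False, superfact=1.0):
    """Cap an observed q at superfact*qsat when limitqobs is on.

    Hypothetical helper: the names limitqobs and superfact come from the
    issue text, everything else is an assumption made for illustration.
    """
    if not limitqobs:
        return q_obs
    return min(q_obs, superfact * qsat)


def supersaturation_excess(q, qsat, superfact=1.0):
    """Amount by which q exceeds the (scaled) saturation limit, else 0."""
    return max(q - superfact * qsat, 0.0)


# Example values (made up): an observation about 2.5% above saturation.
q_obs, qsat = 0.0205, 0.0200

unlimited = limit_q_obs(q_obs, qsat)                                 # flag off
capped = limit_q_obs(q_obs, qsat, limitqobs=True)                    # cap at qsat
relaxed = limit_q_obs(q_obs, qsat, limitqobs=True, superfact=1.01)   # cap at 1.01*qsat

print(unlimited, capped, relaxed)
print(supersaturation_excess(q_obs, qsat), supersaturation_excess(q_obs, qsat, 1.01))
```

With superfact = 1.0 the cap is exactly qsat, matching the statement that the default reproduces the previous behavior; a value such as 1.01 leaves a 1% margin before either the cap or the constraint becomes active.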

@jderber-NOAA
Contributor Author

jderber-NOAA commented Oct 13, 2021

Changes which can help minimization: 1, 8c, 11, 14
Changes which can change results more than round off: 1, 2, 11, 14, 15
Changes for improved diagnostics: 8b
Changes to clean up code: 2, 3, 4, 5, 6, 7, 8a, 9, 10, 12, 13
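The NL diagnostic from the pcgsoi.f90 change can be sketched in a few lines of Python. The variable names vprecond, gradx, xdiff and grady follow the issue text. Treating xdiff as the previous iteration's gradient, using an identity preconditioner, and setting grady equal to gradx are assumptions made only for this toy quadratic, not a description of the GSI code.

```python
def dot(a, b):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def nl_diagnostic(vprecond, gradx, xdiff, grady):
    """Return (numerator of b, NL) from the two dot-product terms."""
    first = dot([p * g for p, g in zip(vprecond, gradx)], grady)
    second = dot([p * d for p, d in zip(vprecond, xdiff)], grady)
    return first - second, second / first


# Toy quadratic J(x) = 0.5 x^T A x - rhs^T x (hypothetical test problem).
A = [[3.0, 1.0], [1.0, 2.0]]
rhs = [1.0, 1.0]


def gradient(x):
    return [dot(row, x) - r for row, r in zip(A, rhs)]


x0 = [0.0, 0.0]
g0 = gradient(x0)
d = [-g for g in g0]                      # first search direction
Ad = [dot(row, d) for row in A]
alpha = dot(g0, g0) / dot(d, Ad)          # exact line search for a quadratic
x1 = [x + alpha * di for x, di in zip(x0, d)]
g1 = gradient(x1)

vprecond = [1.0, 1.0]                     # assumption: no extra preconditioning
b_num, nl = nl_diagnostic(vprecond, g1, g0, g1)
print("numerator of b:", b_num, "NL:", nl)
```

For this quadratic the new gradient is orthogonal to the previous one, so the second term vanishes up to round-off and NL is essentially zero. A nonquadratic penalty would leave a nonzero second term and a larger NL.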

@jderber-NOAA
Contributor Author

jderber-NOAA commented Oct 13, 2021

Below are two plots produced for one case with the master and minimfixes, both with and without thinning of the ascat winds. Note that an additional 100 iterations were added to the second outer iteration because the biggest problems started in the later iterations for this case. In other cases, the problems can be seen earlier.

The master run with no thinning did not reset, but stopped after iteration 138 of the second outer iteration.

The master run with thinning reset 9 times at iterations 98,140,142,143,144,145,146,147 of the second outer iteration and the minimization stopped after iteration 147 of the second outer iteration.

Both minimfixes runs completed with no Resets and no premature terminations. I have not seen a Reset with any minimfixes run (when the new vqc is used). When the old vqc is used Resets still occur.

image
image

@jderber-NOAA
Contributor Author

jderber-NOAA commented Oct 14, 2021

Regression tests passed. Most regression tests were different because of the small changes to the background error of the satellite radiances (1 and 2 above) and round-off. Note that for tmpreg_hwrf_nmm_d2 and d3 the results were a bit different. It turns out this is because they are using the old vqc. When the old vqc was turned on after 20 iterations, the results diverged somewhat. This is because the old vqc is very nonlinear and the minimization algorithm does not work well with it. The differences in q allow the minimization to go down somewhat different paths. When the variational qc was turned off, the results were very similar (round-off differences).

I recommend that all gsi applications stop using the old vqc and use Jim's new vqc instead. The new vqc has nice characteristics in the minimization and does not create any real issues (once the bug above is fixed). Note that most regression tests use the old vqc.

Resets occur when using the old vqc, but I have not seen one with Jim's new vqc (with the bug fix above).
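The stpw.f90 inconsistency described above (weight from u^2+v^2 in the step-size code, separate u and v treatment in intw) can be illustrated with a small Python sketch. The weight and penalty functions below are a generic gross-error form chosen only for illustration, and the parameter G is hypothetical; neither is claimed to match the formulas in GSI. The point is that a line search only behaves well when the penalty it evaluates has the gradient that the gradient routine actually returns.

```python
import math

G = 0.1  # hypothetical gross-error parameter


def weight(r2, g=G):
    """Illustrative qc weight as a function of a squared normalized residual."""
    e = math.exp(-0.5 * r2)
    return e / (e + g)


def penalty(r2, g=G):
    """Penalty whose derivative with respect to the residual is weight * residual."""
    return -math.log((math.exp(-0.5 * r2) + g) / (1.0 + g))


def grad_separate(u, v):
    """Gradient with u and v treated separately (the intw-like treatment)."""
    return weight(u * u) * u, weight(v * v) * v


def penalty_separate(u, v):
    """Step-size penalty consistent with grad_separate."""
    return penalty(u * u) + penalty(v * v)


def penalty_combined(u, v):
    """Step-size penalty based on u^2+v^2 (the inconsistent treatment)."""
    return penalty(u * u + v * v)


def slope(pen, u, v, du, dv, eps=1e-6):
    """Centered finite-difference slope of pen along direction (du, dv)."""
    return (pen(u + eps * du, v + eps * dv) - pen(u - eps * du, v - eps * dv)) / (2 * eps)


u, v = 2.0, 0.5                      # made-up normalized residuals
gu, gv = grad_separate(u, v)
du, dv = -gu, -gv                    # descent direction from the gradient
expected = gu * du + gv * dv         # slope the gradient routine implies

slope_sep = slope(penalty_separate, u, v, du, dv)
slope_comb = slope(penalty_combined, u, v, du, dv)
print(expected, slope_sep, slope_comb)
```

In this sketch the separate penalty reproduces the slope implied by the gradient, while the combined penalty does not, which is the kind of mismatch that can make a minimization look strongly nonlinear.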

@jderber-NOAA
Contributor Author

The plots above have limitqobs = .true. and superfact = 1.01. The following plots use the default values.

image

@jderber-NOAA
Contributor Author

Another case

image

Again the master terminated early after 169 iterations of the second outer iteration.
