
VP solver: division by zero in FGMRES if starting with no ice #42

Closed
phil-blain opened this issue Jul 29, 2022 · 4 comments

@phil-blain
Owner

Reproducer:

./cice.setup -m ppp6 -e intel -c ~/cice-dirs/cases/vp-icnone -s icnone,dynpicard,debug     

It is the first normalization by norm_residual that causes the abort.
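To make the failure concrete, here is a minimal Python sketch of the pattern (illustrative only; the real code is the Fortran FGMRES in ice_dyn_vp.F90, and `A`, `b`, `x0` below are stand-ins for the operator, the right-hand side and the initial guess):

```python
import numpy as np

def first_arnoldi_vector(A, b, x0):
    # arnoldi_basis_[xy] is the initial residual b - A x0
    r0 = b - A @ x0
    norm_residual = np.linalg.norm(r0)
    # With no ice, b and x0 are both zero, so norm_residual == 0 and this
    # divides by zero (NaNs here; an FPE abort in a trapping 'debug' build).
    return r0 / norm_residual

A = np.eye(2)       # placeholder operator
b = np.zeros(2)     # all-zero right-hand side (no ice)
x0 = np.zeros(2)    # all-zero initial velocity
print(first_arnoldi_vector(A, b, x0))
```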

@phil-blain phil-blain self-assigned this Aug 1, 2022
@phil-blain
Owner Author

phil-blain commented Aug 9, 2022

  • norm_residual is the norm of arnoldi_basis_[xy], which is b[xy] - workspace_[xy].

    • b[xy] is all zero
    • workspace_[xy] is matvec(sol[xy]) and sol[xy] is the initial iterate which is zero because of icnone (no ice so velocity is zero everywhere...).
  • bx (and similarly for y) is bx(i,j) = bxfix(i,j) + taux + strintx (calc_bvec)

    • strintx is dPr/dx which is zero if velocity is zero
    • taux = vrel(i,j)*waterxU(i,j) is zero because vrel is zero
      • vrel is zero because ocean velocity is zero (calc_vrel_Cb)
    • bxfix is umassdti(i,j)*uvel_init(i,j) + forcexU(i,j)
      • uvel_init is zero
      • forcexU is forcex(i,j) = strairx(i,j) + strtltx(i,j) (dyn_prep2)
        • strtltx is zero if uocn is zero (dyn_prep2)
        • strairx is also all zero.
  • strairx from dyn_prep2 is strairxU in implicit_solver

  • strairxU is computed via call grid_average_X2Y('F',strairxT,'T',strairxU,'U') if calc_strair = .true. (default)

    • strairxT is initialized to 0 in init_coupler_flux and init_flux_atm.
    • strairxT is "merged" across categories from strairxn (merge_fluxes) IF aicen_init(n) > puny (icepack_step_therm1) (which is false since aice_init = 0
    • strairxn is computed in icepack_atm_boundary if calc_strair AND aicen_init(n) > puny, which is false if starting with no ice

So all of these fields being zero seems correct and makes sense.

We have A x = b with b = 0, so we must have x = 0. It is thus the FGMRES algorithm that has to be tweaked to account for b = 0. This is e.g. what SciPy does:
https://github.com/scipy/scipy/blob/651a9b717deb68adde9416072c1e1d5aa14a58a1/scipy/sparse/linalg/_isolve/iterative.py#L620-L628
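For illustration, a minimal sketch of that guard in Python, mirroring the SciPy approach linked above (hypothetical function, not the actual ice_dyn_vp implementation):

```python
import numpy as np

def fgmres_with_guard(A, b, x0, tol=1e-2, maxits=50):
    # If the right-hand side is zero, the exact solution is x = 0:
    # return it immediately instead of iterating.
    norm_rhs = np.linalg.norm(b)
    if norm_rhs == 0.0:
        return np.zeros_like(b)
    # ... otherwise run the usual Arnoldi/FGMRES iterations, using
    # tol * norm_rhs as the convergence threshold.
    raise NotImplementedError("iteration body omitted from this sketch")
```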

phil-blain added a commit that referenced this issue Aug 9, 2022
phil-blain added a commit that referenced this issue Aug 9, 2022
@phil-blain
Owner Author

  • 6a030d6 is the fix for the current code
  • ef5858e is the fix for the new code

@phil-blain
Owner Author

phil-blain commented Aug 10, 2022

52fd683 is a missing piece for a complete fix.

@phil-blain phil-blain added this to the Picard solver milestone Aug 10, 2022
phil-blain added a commit that referenced this issue Oct 5, 2022
If starting a run with "ice_ic='none'" (no ice), the linearized
problem for the ice velocity A x = b will have b = 0, since all terms in
the right hand side vector will be zero:

- strint[xy] is zero because the velocity is zero
- tau[xy] is zero because the ocean velocity is also zero
- [uv]vel_init is zero
- strair[xy] is zero because the concentration is zero
- strtlt[xy] is zero because the ocean velocity is zero

We thus have a linear system A x = b with b=0, so we
must have x=0.

In the FGMRES linear solver, this special case is not taken into
account, and so we end up with an all-zero initial residual since
workspace_[xy] is also zero because of the all-zero initial guess
'sol[xy]', which corresponds to the initial ice velocity. This then
leads to a division by zero when normalizing the first Arnoldi vector.

Fix this special case by computing the norm of the right-hand-side
vector before starting the iterations, and exiting early if it is zero.
This is in line with the GMRES implementation in SciPy [1].

[1] https://github.com/scipy/scipy/blob/651a9b717deb68adde9416072c1e1d5aa14a58a1/scipy/sparse/linalg/_isolve/iterative.py#L620-L628

Close: #42
phil-blain added a commit that referenced this issue Oct 17, 2022
If starting a run with "ice_ic='none'" (no ice), the linearized
problem for the ice velocity A x = b will have b = 0, since all terms in
the right hand side vector will be zero:

- strint[xy] is zero because the velocity is zero
- tau[xy] is zero because the ocean velocity is also zero
- [uv]vel_init is zero
- strair[xy] is zero because the concentration is zero
- strtlt[xy] is zero because the ocean velocity is zero

We thus have a linear system A x = b with b=0, so we
must have x=0.

In the FGMRES linear solver, this special case is not taken into
account, and so we end up with an all-zero initial residual since
workspace_[xy] is also zero because of the all-zero initial guess
'sol[xy]', which corresponds to the initial ice velocity. This then
leads to a division by zero when normalizing the first Arnoldi vector.

Fix this special case by computing the norm of the right-hand-side
vector before starting the iterations, and exiting early if it is zero.
This is in line with the GMRES implementation in SciPy [1].

[1] https://github.com/scipy/scipy/blob/651a9b717deb68adde9416072c1e1d5aa14a58a1/scipy/sparse/linalg/_isolve/iterative.py#L620-L628

Close: #42
apcraig pushed a commit to CICE-Consortium/CICE that referenced this issue Oct 20, 2022
* doc: fix typo in index (bfbflag)

* doc: correct default value of 'maxits_nonlin'

The "Table of namelist options" in the user guide lists 'maxits_nonlin'
as having a default value of 1000, whereas its actual default is 4, both
in the namelist and in 'ice_init.F90'. This has been the case since the
original implementation of the implicit solver in f7fd063 (dynamics: add
implicit VP solver (#491), 2020-09-22).

Fix the documentation.

* doc: VP solver is validated with OpenMP

When the implicit VP solver was added in f7fd063 (dynamics: add implicit
VP solver (#491), 2020-09-22), it had not yet been tested with OpenMP
enabled.

The OpenMP implementation was carefully reviewed and then fixed in
d1e972a (Update OMP (#680), 2022-02-18), which led to all runs of the
'decomp' suite completing and all restart tests passing. The 'bfbcomp'
tests are still failing, but this is due to the code not using the CICE
global sum implementation correctly, which will be fixed in the next
commits.

Update the documentation accordingly.

* ice_dyn_vp: activate OpenMP in 'dyn_prep2' loop

When the OpenMP implementation was reviewed and fixed in d1e972a (Update
OMP (#680), 2022-02-18), the 'PRIVATE' clause of the OpenMP directive
for the loop where 'dyn_prep2' is called in 'implicit_solver' was
corrected in line with what was done in 'ice_dyn_evp', but OpenMP was
left unactivated for this loop (the 'TCXOMP' was not changed to a real
'OMP' directive).

Activate OpenMP for this loop. All runs and restart tests of the
'decomp_suite' still pass with this change.

* machines: eccc : add ICE_MACHINE_MAXRUNLENGTH to ppp[56]

* machines: eccc: use PBS-enabled OpenMPI for 'ppp6_gnu'

The system installation of OpenMPI at /usr/mpi/gcc/openmpi-4.1.2a1/ is
not compiled with support for PBS. This leads to failures as the MPI
runtime does not have the same view of the number of available processors
as the job scheduler.

Use our own build of OpenMPI, compiled with PBS support, for the
'ppp6_gnu'  environment, which uses OpenMPI.

* machines: eccc: set I_MPI_FABRICS=ofi

Intel MPI 2021.5.1, which comes with oneAPI 2022.1.2, seems to have an
intermittent bug where a call to 'MPI_Waitall' fails with:

    Abort(17) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Waitall: See the MPI_ERROR field in MPI_Status for the error code

and no core dump is produced. This affects at least these cases of the
'decomp' suite:

- *_*_restart_gx3_16x2x1x1x800_droundrobin
- *_*_restart_gx3_16x2x2x2x200_droundrobin

This was reported to Intel and they suggested setting the variable
'I_MPI_FABRICS' to 'ofi' (the default being 'shm:ofi' [1]). This
disables shared memory transport and indeed fixes the failures.

Set this variable for all ECCC machine files using Intel MPI.

[1] https://www.intel.com/content/www/us/en/develop/documentation/mpi-developer-reference-linux/top/environment-variable-reference/environment-variables-for-fabrics-control/communication-fabrics-control.html

* machines: eccc: set I_MPI_CBWR for BASEGEN/BASECOM runs

Intel MPI, in contrast to OpenMPI (as far as I was able to test, and see
[1], [2]), does not (by default) guarantee that repeated runs of the same
code on the same machine with the same number of MPI ranks yield the
same results when collective operations (e.g. 'MPI_ALLREDUCE') are used.

Since the VP solver uses MPI_ALLREDUCE in its algorithm, this leads to
repeated runs of the code giving different answers, and baseline
comparing runs with code built from the same commit failing.

When generating a baseline or comparing against an existing baseline,
set the environment variable 'I_MPI_CBWR' to 1 for ECCC machine files
using Intel MPI [3], so that (processor) topology-aware collective
algorithms are not used and results are reproducible.

Note that we do not need to set this variable on robert or underhill, on
which jobs have exclusive node access and thus job placement (on
processors) is guaranteed to be reproducible.

[1] https://stackoverflow.com/a/45916859/
[2] https://scicomp.stackexchange.com/a/2386/
[3] https://www.intel.com/content/www/us/en/develop/documentation/mpi-developer-reference-linux/top/environment-variable-reference/i-mpi-adjust-family-environment-variables.html#i-mpi-adjust-family-environment-variables_GUID-A5119508-5588-4CF5-9979-8D60831D1411

* ice_dyn_vp: fgmres: exit early if right-hand-side vector is zero

If starting a run with "ice_ic='none'" (no ice), the linearized
problem for the ice velocity A x = b will have b = 0, since all terms in
the right hand side vector will be zero:

- strint[xy] is zero because the velocity is zero
- tau[xy] is zero because the ocean velocity is also zero
- [uv]vel_init is zero
- strair[xy] is zero because the concentration is zero
- strtlt[xy] is zero because the ocean velocity is zero

We thus have a linear system A x = b with b=0, so we
must have x=0.

In the FGMRES linear solver, this special case is not taken into
account, and so we end up with an all-zero initial residual since
workspace_[xy] is also zero because of the all-zero initial guess
'sol[xy]', which corresponds to the initial ice velocity. This then
leads to a division by zero when normalizing the first Arnoldi vector.

Fix this special case by computing the norm of the right-hand-side
vector before starting the iterations, and exiting early if it is zero.
This is in line with the GMRES implementation in SciPy [1].

[1] https://github.com/scipy/scipy/blob/651a9b717deb68adde9416072c1e1d5aa14a58a1/scipy/sparse/linalg/_isolve/iterative.py#L620-L628

Close: phil-blain#42

* ice_dyn_vp: add global_norm, global_dot_product functions

The VP solver uses a linear solver, FGMRES, as part of the non-linear
iteration. The FGMRES algorithm involves computing the norm of a
distributed vector field, thus performing global sums.

These norms are computed by first summing the squared X and Y components
of a vector field in subroutine 'calc_L2norm_squared', summing these
over the local blocks, and then doing a global (MPI) sum using
'global_sum'.

This approach does not lead to reproducible results when the MPI
distribution, or the number of local blocks, is changed, for reasons
explained in the "Reproducible sums" section of the Developer Guide
(mostly, floating point addition is not associative). This was partly
pointed out in [1] but I failed to realize it at the time.

Introduce a new function, 'global_dot_product', to encapsulate the
computation of the dot product of two grid vectors, each split into two
arrays (for the X and Y components).

Compute the reduction locally as is done in 'calc_L2norm_squared', but
throw away the result and use the existing 'global_sum' function when
'bfbflag' is active, passing it the temporary array used to compute the
element-by-element product.

This approach avoids a performance regression from the added work done
in 'global_sum', such that non-bfbflag runs are as fast as before.

Note that since 'global_sum' loops over the whole array (and not just over
ice points, as 'global_dot_product' does), make sure to zero-initialize the
'prod' local array.

Also add a 'global_norm' function implemented using
'global_dot_product'. Both functions will be used in subsequent commits
to ensure bit-for-bit reproducibility.
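(Schematically, and only as an illustration, the pattern is as below; the actual code is Fortran in ice_dyn_vp.F90 and ice_global_reductions.F90, and the names and the reproducible-summation placeholder here are hypothetical.)

```python
import numpy as np

def global_dot_product(x1, y1, x2, y2, blocks, bfbflag=False):
    # 'blocks' stands in for the local blocks of one MPI rank; each entry
    # is an index array selecting that block's (ice) points.
    prod = np.zeros_like(x1)           # zero-initialized: the bfbflag path
    for blk in blocks:                 # sums the whole array, not just ice points
        prod[blk] = x1[blk] * x2[blk] + y1[blk] * y2[blk]
    if bfbflag:
        # hand the element-by-element products to a single reproducible
        # summation (stands in for the existing 'global_sum' machinery)
        return reproducible_sum(prod)
    # fast path: per-block partial sums, then one reduction; the result can
    # change with the decomposition because FP addition is not associative
    return sum(float(np.sum(prod[blk])) for blk in blocks)

def reproducible_sum(a):
    # placeholder for a decomposition-independent (fixed-order) summation
    return float(np.sum(np.sort(a.ravel())))

def global_norm(x1, y1, blocks, bfbflag=False):
    return np.sqrt(global_dot_product(x1, y1, x1, y1, blocks, bfbflag))
```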

* ice_dyn_vp: use global_{norm,dot_product} for bit-for-bit output reproducibility

Make the results of the VP solver reproducible if desired by refactoring
the code to use the subroutines 'global_norm' and 'global_dot_product'
added in the previous commit.

The same pattern appears in the FGMRES solver (subroutine 'fgmres'), the
preconditioner 'pgmres' which uses the same algorithm, and the
Classical and Modified Gram-Schmidt algorithms in 'orthogonalize'.

These modifications do not change the number of global sums in the
fgmres, pgmres and the MGS algorithm. For the CGS algorithm, there is
(in theory) a slight performance impact as 'global_dot_product' is
called inside the loop, whereas previously we called
'global_allreduce_sum' after the loop to compute all 'initer' sums at
the same time.

To keep that optimization, we would have to implement a new interface
'global_allreduce_sum' which would take an array of shape
(nx_block,ny_block,max_blocks,k) and sum over their first three
dimensions before performing the global reduction over the k dimension.

We choose not to go that route for now, mostly because the CGS algorithm
is (by default) only used for the PGMRES preconditioner, so the cost
should be relatively low since 'initer' corresponds to 'dim_pgmres' in
the namelist, which should be kept low for efficiency (default 5).

These changes lead to bit-for-bit reproducibility (the decomp_suite
passes) when using 'precond=ident' and 'precond=diag' along with
'bfbflag=reprosum'. 'precond=pgmres' is still not bit-for-bit because
some halo updates are skipped for efficiency. This will be addressed in
a following commit.

[1] #491 (comment)
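(For reference, the batched interface described above could look roughly like the sketch below; only the local part is shown, and the single MPI reduction over the length-k result is indicated by a comment. This is a hypothetical illustration, not code from the PR.)

```python
import numpy as np

def local_part_of_batched_allreduce(prod):
    # 'prod' has shape (nx_block, ny_block, max_blocks, k): one slab of
    # element-by-element products per CGS dot product.
    local = prod.sum(axis=(0, 1, 2))   # sum the first three dimensions -> shape (k,)
    # A single MPI allreduce over this length-k vector would then replace
    # k separate global sums inside the orthogonalization loop.
    return local
```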

* ice_dyn_vp: do not skip halo updates in 'pgmres' under 'bfbflag'

The 'pgmres' subroutine implements a separate GMRES solver and is used
as a preconditioner for the FGMRES linear solver. Since it is only a
preconditioner, it was decided to skip the halo updates after computing
the matrix-vector product (in 'matvec'), for efficiency.

This leads to non-reproducibility since the content of the non-updated
halos depend on the block / MPI distribution.

Add the required halo updates, but only perform them when we are
explicitly asking for bit-for-bit global sums, i.e. when 'bfbflag' is
set to something other than 'not'.

Adjust the interfaces of 'pgmres' and 'precondition' (from which
'pgmres' is called) to accept 'halo_info_mask', since it is needed for
masked updates.

Closes #518

* ice_dyn_vp: use global_{norm,dot_product} for bit-for-bit log reproducibility

In the previous commits we ensured bit-for-bit reproducibility of the
outputs when using the VP solver.

Some global norms computed during the nonlinear iteration still use the
same non-reproducible pattern of summing over blocks locally before
performing the reduction. However, these norms are used only to monitor
the convergence in the log file, as well as to exit the iteration when
the required convergence level is reached ('nlres_norm'). Only
'nlres_norm' could (in theory) influence the output, but it is unlikely
that a difference due to floating point errors would influence the 'if
(nlres_norm < tol_nl)' condition used to exit the nonlinear iteration.

Change these remaining cases to also use 'global_norm', leading to
bit-for-bit log reproducibility.

* ice_dyn_vp: remove unused subroutine and cleanup interfaces

The previous commit removed the last caller of 'calc_L2norm_squared'.
Remove the subroutine.

Also, do not compute 'sum_squared' in 'residual_vec', since the variable
'L2norm' which receives this value is also unused in 'anderson_solver'
since the previous commit. Remove that variable, and adjust the
interface of 'residual_vec' accordingly.

* ice_global_reductions: remove 'global_allreduce_sum'

In a previous commit, we removed the sole caller of
'global_allreduce_sum' (in ice_dyn_vp::orthogonalize). We do not
anticipate that this function will be used elsewhere in the code, so remove it
from ice_global_reductions. Update the 'sumchk' unit test accordingly.

* doc: mention VP solver is only reproducible using 'bfbflag'

The previous commits made sure that the model outputs as well as the log
file output are bit-for-bit reproducible when using the VP solver by
refactoring the code to use the existing 'global_sum' subroutine.

Add a note in the documentation mentioning that 'bfbflag' is required to
get bit-for-bit reproducible results under different decompositions /
MPI counts when using the VP solver.

Also, adjust the doc about 'bfbflag=lsum8' being the same as
'bfbflag=off' since this is not the case for the VP solver: in the first
case we use the scalar version of 'global_sum', in the second case we
use the array version.

* ice_dyn_vp: improve default parameters for VP solver

During QC testing of the previous commit, the 5-year QC test with the
updated VP solver failed twice with "bad departure points" after a few
years of simulation. Simply bumping the number of nonlinear iterations
(maxits_nonlin) from 4 to 5 makes these failures disappear and allows the
simulations to run to completion, suggesting the solution is not
converged enough with 4 iterations.

We also noticed that in these failing cases, the relative tolerance for
the linear solver (reltol_fgmres = 1E-2) is too small to be reached in
less than 50 iterations (maxits_fgmres), and that's the case at each
nonlinear iteration. Other papers mention a relative tolerance of 1E-1
for the linear solver, and using this value also allows both cases to
run to completion (even without changing maxits_nonlin).

Let's set the default tolerance for the linear solver to 1E-1, and let's
be conservative and bump the number of nonlinear iterations to 10. This
should give us a more converged solution and add robustness to the
default settings.
@phil-blain
Owner Author

Closed in CICE-Consortium#774

phil-blain added a commit that referenced this issue Mar 7, 2023
* merge latest master (#4)

* Isotopes for CICE (CICE-Consortium#423)

Co-authored-by: apcraig <anthony.p.craig@gmail.com>
Co-authored-by: David Bailey <dbailey@ucar.edu>
Co-authored-by: Elizabeth Hunke <eclare@lanl.gov>

* updated orbital calculations needed for cesm

* fixed problems in updated orbital calculations needed for cesm

* update CICE6 to support coupling with UFS

* put in changes so that both ufsatm and cesm requirements for potential temperature and density are satisfied

* Convergence on ustar for CICE. (CICE-Consortium#452) (#5)

* Add atmiter_conv to CICE

* Add documentation

* trigger build the docs

Co-authored-by: David A. Bailey <dbailey@ucar.edu>

* update icepack submodule

* Revert "update icepack submodule"

This reverts commit e70d1ab.

* update comp_ice.backend with temporary ice_timers fix

* Fix threading problem in init_bgc

* Fix additional OMP problems

* changes for coldstart running

* Move the forapps directory

* remove cesmcoupled ifdefs

* Fix logging issues for NUOPC

* removal of many cpp-ifdefs

* fix compile errors

* fixes to get cesm working

* fixed white space issue

* Add restart_coszen namelist option

* update icepack submodule

* change Orion to orion in backend

remove duplicate print lines from ice_transport_driver

* add -link_mpi=dbg to debug flags (#8)

* cice6 compile (#6)

* enable debug build. fix to remove errors

* fix an error in comp_ice.backend.libcice

* change Orion to orion for machine identification

* changes for consistency w/ current emc-cice5 (#13)

Update to emc/develop fork to current CICE consortium 

Co-authored-by: David A. Bailey <dbailey@ucar.edu>
Co-authored-by: Tony Craig <apcraig@users.noreply.github.com>
Co-authored-by: Elizabeth Hunke <eclare@lanl.gov>
Co-authored-by: Mariana Vertenstein <mvertens@ucar.edu>
Co-authored-by: apcraig <anthony.p.craig@gmail.com>
Co-authored-by: Philippe Blain <levraiphilippeblain@gmail.com>

* Fixcommit (#14)

Align commit history between emc/develop and cice-consortium/master

* Update CICE6 for integration to S2S


* add wcoss_dell_p3 compiler macro

* update to icepack w/ debug fix

* replace SITE with MACHINE_ID

* update compile scripts

* Support TACC stampede (#19)

* update icepack

* add ice_dyn_vp module to CICE_InitMod

* update gitmodules, update icepack

* Update CICE to consortium master (#23)

updates include:

* deprecate upwind advection (CICE-Consortium#508)
* add implicit VP solver (CICE-Consortium#491)

* update icepack

* switch icepack branches

* update to icepack master but set abort flag in ITD routine
to false

* update icepack

* Update CICE to latest Consortium master (#26)


update CICE and Icepack

* changes the criteria for aborting ice for thermo-conservation errors
* updates the time manager
* fixes two bugs in ice_therm_mushy
* updates Icepack to Consortium master w/ flip of abort flag for troublesome IC cases

* add cice changes for zlvs (#29)

* update icepack and pointer

* update icepack and revert gitmodules

* Fix history features

- Fix bug in history time axis when sec_init is not zero.
- Fix issue with time_beg and time_end uninitialized values.
- Add support for averaging with histfreq='1' by allowing histfreq_n to be any value
  in that case.  Extend and clean up construct_filename for history files.  More could
  be done, but wanted to preserve backwards compatibility.
- Add new calendar_sec2hms to convert daily seconds to hh:mm:ss.  Update the
  calchk calendar unit tester to check this method
- Remove abort test in bcstchk, this was just causing problems in regression testing
- Remove known problems documentation about problems writing when istep=1.  This issue
  does not exist anymore with the updated time manager.
- Add new tests with hist_avg = false.  Add set_nml.histinst.

* revert set_nml.histall

* fix implementation error

* update model log output in ice_init

* Fix QC issues

- Add netcdf status checks and aborts in ice_read_write.F90
- Check for end of file when reading records in ice_read_write.F90 for
  ice_read_nc methods
- Update set_nml.qc to better specify the test, turn off leap years since we're cycling
  2005 data
- Add check in cice.t-test.py to make sure there are at least 1825 files, 5 years of data
- Add QC run to base_suite.ts to verify qc runs to completion and possibility to use
  those results directly for QC validation
- Clean up error messages and some indentation in ice_read_write.F90

* Update testing

- Add prod suite including 10 year gx1prod and qc test
- Update unit test compare scripts

* update documentation

* reset calchk to 100000 years

* update evp1d test

* update icepack

* update icepack

* add memory profiling (#36)


* add profile_memory calls to CICE cap

* update icepack

* fix rhoa when lowest_temp is 0.0

* provide default value for rhoa when imported temp_height_lowest
(Tair) is 0.0
* resolves seg fault when frac_grid=false and do_ca=true

* update icepack submodule

* Update CICE for latest Consortium master (#38)


    * Implement advanced snow physics in icepack and CICE
    * Fix time-stamping of CICE history files
    * Fix CICE history file precision

* Use CICE-Consortium/Icepack master (#40)

* switch to icepack master at consortium

* recreate cap update branch (#42)


* add debug_model feature
* add required variables and calls for tr_snow

* remove 2 extraneous lines

* remove two log print lines that were removed prior to
merge of driver updates to consortium

* duplicate gitmodule style for icepack

* Update CICE to latest Consortium/main (#45)

* Update CICE to Consortium/main (CICE-Consortium#48)


Update OpenMP directives as needed including validation via new omp_suite. Fixed OpenMP in dynamics.
Refactored eap puny/pi lookups to improve scalar performance
Update Tsfc implementation to make sure land blocks don't set Tsfc to freezing temp
Update for sea bed stress calculations

* fix comment, fix env for orion and hera

* replace save_init with step_prep in CICE_RunMod

* fixes for cgrid repro

* remove added haloupdates

* baselines pass with these extra halo updates removed

* change F->S for ocean velocities and tilts

* fix debug failure when grid_ice=C

* compiling in debug mode using -init=snan,arrays requires
initialization of variables

* respond to review comments

* remove inserted whitespace for uvelE,N and vvelE,N

* Add wave-cice coupling; update to Consortium main (CICE-Consortium#51)


* add wave-ice fields
* initialize aicen_init, which turns up as NaN in calc of floediam
export
* add call to icepack_init_wave to initialize wavefreq and dwavefreq
* update to latest consortium main (PR 752)

* add initializations in ice_state

* initialize vsnon/vsnon_init and vicen/vicen_init

Co-authored-by: apcraig <anthony.p.craig@gmail.com>
Co-authored-by: David Bailey <dbailey@ucar.edu>
Co-authored-by: Elizabeth Hunke <eclare@lanl.gov>
Co-authored-by: Mariana Vertenstein <mvertens@ucar.edu>
Co-authored-by: Minsuk Ji <57227195+MinsukJi-NOAA@users.noreply.github.com>
Co-authored-by: Tony Craig <apcraig@users.noreply.github.com>
Co-authored-by: Philippe Blain <levraiphilippeblain@gmail.com>
phil-blain added a commit that referenced this issue Sep 29, 2023
…ICE-Consortium#856)

* merge latest master (#4)

* Isotopes for CICE (CICE-Consortium#423)

Co-authored-by: apcraig <anthony.p.craig@gmail.com>
Co-authored-by: David Bailey <dbailey@ucar.edu>
Co-authored-by: Elizabeth Hunke <eclare@lanl.gov>

* updated orbital calculations needed for cesm

* fixed problems in updated orbital calculations needed for cesm

* update CICE6 to support coupling with UFS

* put in changes so that both ufsatm and cesm requirements for potential temperature and density are satisfied

* Convergence on ustar for CICE. (CICE-Consortium#452) (#5)

* Add atmiter_conv to CICE

* Add documentation

* trigger build the docs

Co-authored-by: David A. Bailey <dbailey@ucar.edu>

* update icepack submodule

* Revert "update icepack submodule"

This reverts commit e70d1ab.

* update comp_ice.backend with temporary ice_timers fix

* Fix threading problem in init_bgc

* Fix additional OMP problems

* changes for coldstart running

* Move the forapps directory

* remove cesmcoupled ifdefs

* Fix logging issues for NUOPC

* removal of many cpp-ifdefs

* fix compile errors

* fixes to get cesm working

* fixed white space issue

* Add restart_coszen namelist option

* update icepack submodule

* change Orion to orion in backend

remove duplicate print lines from ice_transport_driver

* add -link_mpi=dbg to debug flags (#8)

* cice6 compile (#6)

* enable debug build. fix to remove errors

* fix an error in comp_ice.backend.libcice

* change Orion to orion for machine identification

* changes for consistency w/ current emc-cice5 (#13)

Update to emc/develop fork to current CICE consortium 

Co-authored-by: David A. Bailey <dbailey@ucar.edu>
Co-authored-by: Tony Craig <apcraig@users.noreply.github.com>
Co-authored-by: Elizabeth Hunke <eclare@lanl.gov>
Co-authored-by: Mariana Vertenstein <mvertens@ucar.edu>
Co-authored-by: apcraig <anthony.p.craig@gmail.com>
Co-authored-by: Philippe Blain <levraiphilippeblain@gmail.com>

* Fixcommit (#14)

Align commit history between emc/develop and cice-consortium/master

* Update CICE6 for integration to S2S


* add wcoss_dell_p3 compiler macro

* update to icepack w/ debug fix

* replace SITE with MACHINE_ID

* update compile scripts

* Support TACC stampede (#19)

* update icepack

* add ice_dyn_vp module to CICE_InitMod

* update gitmodules, update icepack

* Update CICE to consortium master (#23)

updates include:

* deprecate upwind advection (CICE-Consortium#508)
* add implicit VP solver (CICE-Consortium#491)

* update icepack

* switch icepack branches

* update to icepack master but set abort flag in ITD routine
to false

* update icepack

* Update CICE to latest Consortium master (#26)


update CICE and Icepack

* changes the criteria for aborting ice for thermo-conservation errors
* updates the time manager
* fixes two bugs in ice_therm_mushy
* updates Icepack to Consortium master w/ flip of abort flag for troublesome IC cases

* add cice changes for zlvs (#29)

* update icepack and pointer

* update icepack and revert gitmodules

* Fix history features

- Fix bug in history time axis when sec_init is not zero.
- Fix issue with time_beg and time_end uninitialized values.
- Add support for averaging with histfreq='1' by allowing histfreq_n to be any value
  in that case.  Extend and clean up construct_filename for history files.  More could
  be done, but wanted to preserve backwards compatibility.
- Add new calendar_sec2hms to convert daily seconds to hh:mm:ss.  Update the
  calchk calendar unit tester to check this method
- Remove abort test in bcstchk, this was just causing problems in regression testing
- Remove known problems documentation about problems writing when istep=1.  This issue
  does not exist anymore with the updated time manager.
- Add new tests with hist_avg = false.  Add set_nml.histinst.

* revert set_nml.histall

* fix implementation error

* update model log output in ice_init

* Fix QC issues

- Add netcdf status checks and aborts in ice_read_write.F90
- Check for end of file when reading records in ice_read_write.F90 for
  ice_read_nc methods
- Update set_nml.qc to better specify the test, turn off leap years since we're cycling
  2005 data
- Add check in cice.t-test.py to make sure there are at least 1825 files, 5 years of data
- Add QC run to base_suite.ts to verify qc runs to completion and possibility to use
  those results directly for QC validation
- Clean up error messages and some indentation in ice_read_write.F90

* Update testing

- Add prod suite including 10 year gx1prod and qc test
- Update unit test compare scripts

* update documentation

* reset calchk to 100000 years

* update evp1d test

* update icepack

* update icepack

* add memory profiling (#36)


* add profile_memory calls to CICE cap

* update icepack

* fix rhoa when lowest_temp is 0.0

* provide default value for rhoa when imported temp_height_lowest
(Tair) is 0.0
* resolves seg fault when frac_grid=false and do_ca=true

* update icepack submodule

* Update CICE for latest Consortium master (#38)


    * Implement advanced snow physics in icepack and CICE
    * Fix time-stamping of CICE history files
    * Fix CICE history file precision

* Use CICE-Consortium/Icepack master (#40)

* switch to icepack master at consortium

* recreate cap update branch (#42)


* add debug_model feature
* add required variables and calls for tr_snow

* remove 2 extraneous lines

* remove two log print lines that were removed prior to
merge of driver updates to consortium

* duplicate gitmodule style for icepack

* Update CICE to latest Consortium/main (#45)

* Update CICE to Consortium/main (CICE-Consortium#48)


Update OpenMP directives as needed including validation via new omp_suite. Fixed OpenMP in dynamics.
Refactored eap puny/pi lookups to improve scalar performance
Update Tsfc implementation to make sure land blocks don't set Tsfc to freezing temp
Update for sea bed stress calculations

* fix comment, fix env for orion and hera

* replace save_init with step_prep in CICE_RunMod

* fixes for cgrid repro

* remove added haloupdates

* baselines pass with these extra halo updates removed

* change F->S for ocean velocities and tilts

* fix debug failure when grid_ice=C

* compiling in debug mode using -init=snan,arrays requires
initialization of variables

* respond to review comments

* remove inserted whitespace for uvelE,N and vvelE,N

* Add wave-cice coupling; update to Consortium main (CICE-Consortium#51)


* add wave-ice fields
* initialize aicen_init, which turns up as NaN in calc of floediam
export
* add call to icepack_init_wave to initialize wavefreq and dwavefreq
* update to latest consortium main (PR 752)

* add initializations in ice_state

* initialize vsnon/vsnon_init and vicen/vicen_init

* Update CICE (CICE-Consortium#54)


* update to include recent PRs to Consortium/main

* fix for nudiag_set

allow nudiag_set to be available outside of cesm; may prefer
to fix in coupling interface

* Update CICE for latest Consortium/main (CICE-Consortium#56)

* add run time info

* change real(8) to real(dbl_kind)

* fix syntax

* fix write unit

* use cice_wrapper for ufs timer functionality

* add elapsed model time for logtime

* tidy up the wrapper

* fix case for 'time since' at the first advance

* add timer and forecast log

* write timer values to timer log, not nu_diag
* write log.ice.fXXX

* only one time is needed

* modify message written for log.ice.fXXX

* change info in fXXX log file

* Update CICE from Consortium/main (CICE-Consortium#62)


* Fix CESMCOUPLED compile issue in icepack. (CICE-Consortium#823)
* Update global reduction implementation to improve performance, fix VP bug (CICE-Consortium#824)
* Update VP global sum to exclude local implementation with tripole grids
* Add functionality to change hist_avg for each stream (CICE-Consortium#827)
* Update Icepack to #6703bc533c968 May 22, 2023 (CICE-Consortium#829)
* Fix for mesh check in CESM driver (CICE-Consortium#830)
* Namelist option for time axis position. (CICE-Consortium#839)

* reset timer after Advance to retrieve "wait time"

* add logical control for enabling runtime info

* remove zsal items from cap

* fix typo

---------

Co-authored-by: apcraig <anthony.p.craig@gmail.com>
Co-authored-by: David Bailey <dbailey@ucar.edu>
Co-authored-by: Elizabeth Hunke <eclare@lanl.gov>
Co-authored-by: Mariana Vertenstein <mvertens@ucar.edu>
Co-authored-by: Minsuk Ji <57227195+MinsukJi-NOAA@users.noreply.github.com>
Co-authored-by: Tony Craig <apcraig@users.noreply.github.com>
Co-authored-by: Philippe Blain <levraiphilippeblain@gmail.com>
Co-authored-by: Jun.Wang <Jun.Wang@noaa.gov>