
Mosaic forcing clean #500

Draft
wants to merge 39 commits into
base: main

Conversation


@davidclemenssewall (Contributor) commented Sep 24, 2024

For detailed information about submitting Pull Requests (PRs) to the CICE-Consortium,
please refer to: https://github.com/CICE-Consortium/About-Us/wiki/Resource-Index#information-for-developers

PR checklist

  • Short (1 sentence) summary of your PR:
    Enable the ingestion of forcing data from the MOSAiC Expedition and provide more flexibility for ice and ocean initialization.
  • Developer(s):
    David Clemens-Sewall
  • Suggest PR reviewers from list in the column to the right.
  • Please copy the PR test results link or provide a summary of testing completed below.
    Icepack base_suite with baseline comparison to main branch:
    215 measured results of 215 total results
    215 of 215 tests PASSED
  • How much do the PR code changes differ from the unmodified code?
    • bit for bit
    • different at roundoff level
    • more substantial
  • Does this PR create or have dependencies on CICE or any other models?
    • Yes
    • No
  • Does this PR add any new test cases?
    • Yes
    • No
      • But we probably should add a standard case for the MOSAiC Expedition, to do so we'll need to add the forcing data to the Icepack standard
  • Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/.)
    • Yes
    • No, does the documentation need to be updated at a later time?
      • Yes, I think when we update the Icepack forcing data to include MOSAiC and add a MOSAiC test case the documentation should be updated.
      • No
  • Please document the changes in detail, including why the changes are made. This will become part of the PR commit log.
    The overall goal of the options added to the Icepack driver is to make it easier to use Icepack as a standalone tool forced with observations. These changes have been implemented specifically for the MOSAiC Expedition, but my hope is that they will be readily extendable to other field data (or reanalysis, etc.), so that new datasets can simply be used in Icepack rather than requiring code changes for every new forcing dataset.

Observational data presents the following complications:

  1. Missing values. Real observational datasets almost always have missing values (power loss, radiometer dome iced over, etc...). Furthermore, future versions of the observational datasets may fill some of these missing values (e.g., by reprocessing data that previously failed qc).
  2. Variable frequencies. Different data streams may be measured at different sampling frequencies. Data might be measured more often than the model's timestep (thus requiring aggregation) or less often (thus requiring interpolation).
  3. Different start and end time points. Different data streams may have different start or end times, and updated versions of datasets may extend the time domain.
  4. Merged observational datasets are a work-in-progress. The 3 challenges above all touch on the fact that for sea ice field campaigns, it can take years from the end of the campaign until a truly finalized observational dataset is available to force Icepack with (e.g., MOSAiC is 4 years and counting, SHEBA still does not have such a dataset). We still want to be able to use campaign data to support Icepack development during this time, without having to change the Icepack driver every time the forcing dataset is updated.

Additionally, for comparing with field data it is very useful to be able to easily adjust:

  1. The initial state of the sea ice and ocean mixed layer.
  2. Fixed values for the oceanic heat flux convergence (qdp) and sea surface salinity (sss).

To address these challenges the PR provides the following features:

  1. Precalculate the forcing values at each timestep on initialization (in init_forcing). The current paradigm for forcing data in Icepack is that when init_forcing is called, a subroutine for the specific dataset (e.g., atm_CFS) stores the forcing data in the *_data arrays in essentially the same format as the raw data, without timestamp information. It also hardcodes the maximum size of a forcing dataset to 8760. Then, at each timestep, the get_forcing subroutine has a code block for each forcing dataset that contains the timestamp information and interpolates the forcing data to the given timestep. Under this paradigm, I could not find a graceful way to handle missing values, forcing data with a higher sampling frequency than the Icepack timestep, or changing start times of the forcing dataset. This PR adds a new option, precalc_forc, for use with the MOSAiC forcing. When precalc_forc = .true., the forcing data are aggregated/interpolated to the Icepack timestep in the call to init_forcing and stored in the *_data arrays. Then, the get_forcing subroutine merely indexes into the *_data arrays.
  2. Read forcing data from MDF-formatted netCDF files (Uttal et al. 2024; https://doi.org/10.5194/gmd-17-5225-2024, https://gitlab.com/mdf-makers/mdf-toolkit). I chose this format because its framework seemed well-aligned, but it is still in development so we can add variables to it that we need for the ice model (e.g., mixed layer depth).
  3. To handle variable frequencies and missing data, forcing data are first temporally averaged for each timestep and then interpolated. For a given data variable (var_data), the MOSAiC_average subroutine averages all forcing datapoints within +- 0.5 dt of each timestep, excluding missing values, and stores the results in var_data. If there is no valid data within 0.5 dt of a given timestep (e.g., most timesteps if the model timestep is much smaller than the sampling period), a missing value is placed in var_data(timestep). Then, the MOSAiC_interpolate subroutine linearly interpolates missing values in var_data.
  4. The option to explicitly set in the namelist the initial ice and snow thicknesses for the slab and ITD grid cells and the SST for the 3 ice grid cells.
  5. The option to explicitly set in the namelist fixed ocean mixed layer conditions qdp_fixed, sss_fixed, and hmix_fixed. These are overwritten by forcing data.
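The averaging-then-interpolation scheme in item 3 can be sketched in a few lines of Python (the driver itself is Fortran; the function names mirror MOSAiC_average and MOSAiC_interpolate, but the signatures, the MISSING sentinel value, and the persistence fill at the ends of the record are assumptions made for this illustration):

```python
MISSING = -9999.0  # assumed sentinel for missing values

def mosaic_average(obs_times, obs_vals, model_times, dt):
    """Average all valid observations within +/- 0.5*dt of each model
    timestep; where no valid data falls in a window, store MISSING."""
    var_data = []
    for t in model_times:
        window = [v for tt, v in zip(obs_times, obs_vals)
                  if abs(tt - t) <= 0.5 * dt and v != MISSING]
        var_data.append(sum(window) / len(window) if window else MISSING)
    return var_data

def mosaic_interpolate(var_data):
    """Fill MISSING entries by linear interpolation between the nearest
    valid neighbors (persistence at the record ends is an assumption)."""
    out = list(var_data)
    valid = [i for i, v in enumerate(out) if v != MISSING]
    for i, v in enumerate(out):
        if v != MISSING:
            continue
        left = max((j for j in valid if j < i), default=None)
        right = min((j for j in valid if j > i), default=None)
        if left is not None and right is not None:
            w = (i - left) / (right - left)
            out[i] = (1 - w) * out[left] + w * out[right]
        elif left is not None:
            out[i] = out[left]
        elif right is not None:
            out[i] = out[right]
    return out

# High-frequency observations averaged onto a coarser model timestep,
# then a remaining gap filled by linear interpolation:
avg = mosaic_average([0, 4, 6, 10], [1.0, 3.0, MISSING, 5.0],
                     model_times=[0, 5, 10], dt=10)    # [2.0, 3.0, 5.0]
filled = mosaic_interpolate([1.0, MISSING, 3.0])       # [1.0, 2.0, 3.0]
```

Note the two-stage design: averaging handles oversampled data (and discards missing points), while interpolation handles undersampled data and any windows left empty by the first stage.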

Outstanding issues:

  • Automatically detect cadence of the inputs and adjust accordingly (currently the code requires that the atmospheric forcing is at 1 minute cadence and that the oceanic forcing is at daily cadence)
  • It is not obvious to a user that setting the atmospheric forcing (atm_data_type = 'clim') can overwrite the oceanic heat flux convergence.
  • The atm_MOSAiC and ocn_MOSAiC subroutines are nearly identical except for the data variables. It would probably be cleaner to combine them into a single subroutine.
  • Perhaps input checking should be added to prevent users from setting precalc_forc = .true. with non MOSAiC forcings.
  • Update documentation to include:
    • How driver handles MDF formatted forcing
    • MOSAiC forcing
    • Initial ice and SST conditions in namelist
    • Fixed ocean mixed layer conditions in namelist
  • Add a warning to the log if the MOSAiC time domain includes the leg 4-5 transition (observatory relocation)
  • Add MOSAiC test case
  • Add link to MOSAiC forcing data to wiki

davidclemenssewall and others added 28 commits May 8, 2023 13:55
Just updating for_igs to consortium main. Only conflict was in icedrv_init use section
@davidclemenssewall (Contributor, Author)

@apcraig When I try running the base_suite on Windows Subsystem for Linux (-m conda -e linux), I fail the run stage for the conda_linux_smoke_col_1x1_alt03_debug_run1year and conda_linux_smoke_col_1x1_alt04_debug_run1year tests due to a floating-point exception. I've copied the relevant sections from the runlog below. Strangely, this error does not arise in the GitHub macOS testing. It also does not occur when I run these tests with the main branch locally on WSL. So it seems to be some combination of the platform and the code changes, although I cannot see how these driver changes cause this error in the atmospheric stability calculation. I was hoping you might have a suggestion on how I can fix this. Thanks!

Tail from the icepack runlog:

  Initial forcing data year =         2015
  Final   forcing data year =         2015
 Reading /root/icepack-dirs/input/Icepack_data/forcing/SHEBA/open_clos_lindsay.dat                                      


Program received signal SIGFPE: Floating-point exception - erroneous arithmetic operation.

Backtrace for this error:
#0  0x7f05dd5e951f in ???
#1  0x557a0eb656c2 in compute_stability_parameter
        at /root/code/CICE/Icepack/columnphysics/icepack_atmo.F90:992
#2  0x557a0eb67de2 in atmo_boundary_layer
        at /root/code/CICE/Icepack/columnphysics/icepack_atmo.F90:276
#3  0x557a0eb65d98 in __icepack_atmo_MOD_icepack_atm_boundary
        at /root/code/CICE/Icepack/columnphysics/icepack_atmo.F90:954
#4  0x557a0eaf4c59 in __icedrv_step_MOD_ocean_mixed_layer
        at /root/code/CICE/Icepack/configuration/driver/icedrv_step.F90:1251
#5  0x557a0ea69c25 in coupling_prep
        at /root/code/CICE/Icepack/configuration/driver/icedrv_RunMod.F90:303
#6  0x557a0ea6d57b in __icedrv_runmod_MOD_ice_step
        at /root/code/CICE/Icepack/configuration/driver/icedrv_RunMod.F90:221
#7  0x557a0ea6d6dc in __icedrv_runmod_MOD_icedrv_run
        at /root/code/CICE/Icepack/configuration/driver/icedrv_RunMod.F90:57
#8  0x557a0ea6944d in icedrv
        at /root/code/CICE/Icepack/configuration/driver/icedrv_MAIN.F90:51
#9  0x557a0ea69753 in main
        at /root/code/CICE/Icepack/configuration/driver/icedrv_MAIN.F90:28

@apcraig (Contributor) commented Sep 24, 2024

I don't know anything about Windows systems, but I just tested your branch on derecho for the "alt04_debug_run1year". It passed with intel, fails with gnu with exactly the same traceback as above. So this is a problem in the code modifications. If you need me to help you debug, let me know.

@apcraig (Contributor) commented Sep 24, 2024

Doing just some quick debugging, the value of hol at line 992 of icepack_atmo.F90 is a NaN. Going up one line, tstar and qstar are both NaN in the computation of hol. There are several forcing options in Icepack; is it possible that your changes work for MOSAiC and a subset of the forcing, but not all forcing settings?

@davidclemenssewall (Contributor, Author)

Thanks for the help! With that info I was able to track down the source of the 'run' error: the initialization of the sstdat array in icedrv_flux.F90 from the sst array. One of the changes that I'm proposing with this PR is to allow the user to explicitly set the initial SST, which I think can be useful when initializing before freeze-up or during the melt season, when the mixed layer is relatively fresh. But I had failed to propagate that change to the sstdat array.

The latest commit now passes all of the 'run' tests on WSL with the GNU compiler. But it is failing the baseline comparison with the main branch for all of the tests that use the alt0[1-4] namelist options (see failing tests below). It appears to me that the shared factor among the failing tests is that they set ktherm = 1 and conduct = 'bubbly', whereas the other tests either have ktherm = 2 or, if they use Bitz and Lipscomb thermodynamics, use MU71 conductivity.

Any suggestions would be welcome, I'll keep working on this.

Failing test results:

(base) root@dsewall-wl-psl:~/code/CICE/Icepack/testsuite.mosaic_forcing_clean# ./results.csh | grep FAIL
FAIL conda_linux_smoke_col_1x1_alt01_debug_run1year compare icepack.main different-data
FAIL conda_linux_smoke_col_1x1_alt02_debug_run1year compare icepack.main different-data
FAIL conda_linux_smoke_col_1x1_alt03_debug_run1year compare icepack.main different-data
FAIL conda_linux_smoke_col_1x1_alt04_debug_run1year compare icepack.main different-data
FAIL conda_linux_restart_col_1x1_alt01 compare icepack.main different-data
FAIL conda_linux_restart_col_1x1_alt02 compare icepack.main different-data
FAIL conda_linux_restart_col_1x1_alt03 compare icepack.main different-data
FAIL conda_linux_restart_col_1x1_alt04 compare icepack.main different-data
 FAIL - test failed
8 of 215 tests FAILED

…d qdp_data = -6.0 back to atm_climatological, note that now the atm_data_type can overwrite the ocean forcing
@davidclemenssewall (Contributor, Author)

The most recent commit fixes this issue and all checks now pass with identical output.

@apcraig (Contributor) commented Sep 27, 2024

Should we add a couple new tests to the test suite for mosaic? I assume this would require the mosaic forcing datasets be shared on zenodo? What's the plan for sharing that forcing dataset as well as any mosaic data "for comparisons" with respect to the Consortium?

@davidclemenssewall, there is a bunch of trailing whitespace in your code changes which we try to avoid. You can clean it up quickly by doing

sed -i 's/[ \t]*$//' $file

where $file is the file to be cleaned. I will also check and fix this before the next release. thanks,
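For example, to apply the same cleanup across every Fortran source file in a tree in one pass (the find pattern is illustrative, and this assumes GNU sed's in-place -i flag):

```shell
# Strip trailing spaces and tabs from all .F90 files under the current
# directory, editing each file in place (GNU sed)
find . -name '*.F90' -exec sed -i 's/[ \t]*$//' {} +
```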

tony.......

@davidclemenssewall (Contributor, Author)

Fixed the trailing whitespace, thanks for the shell command!

I agree that we should have some tests with this. I think at minimum there should be a run with the 'atmmosaic' and 'ocnmosaic' options, initialized in November 2019 with namelist options to set the initial snow and ice thicknesses, and run through July of 2020. I'm not sure whether it makes sense to include tests of different physics options with this forcing as well. I don't think any of the other forcing options provide daily updates to qdp, sss, and hmix, so perhaps testing the new congelation growth option would be nice.

The forcing datasets are currently archived at ADC: https://arcticdata.io/catalog/view/doi%3A10.18739%2FA2GX44W6J. We expect to be updating the MOSAiC forcing and evaluation (or comparison) datasets for the next couple years at least. So I'm not sure if it makes sense to add it to the existing tarball, or add it as another row in the table on this page: https://github.com/CICE-Consortium/Icepack/wiki/Icepack-Input-Data. It is also a little bit different from the existing forcing datasets in that users would be encouraged to use it for scientific results as well as code testing. Perhaps we should have a meeting to discuss?

@apcraig (Contributor) commented Sep 27, 2024

Thanks @davidclemenssewall.

If we are going to have some mosaic tests in the test suite, then we'll need to have a reasonably stable forcing dataset that we can install on all the test machines. We'll want to put it on zenodo and have it be part of the Icepack inputdata. Happy to have a broader discussion about how to best proceed.

@davidclemenssewall, if you could propose one (or a few) test cases that we could add to the test suite, that would be great. You can use multiple "options" including 'atmmosaic' and 'ocnmosaic' as needed and turn on the appropriate physics. You can create some set_nml if needed and add them to the base_suite.ts if you'd like. For our testing, we only really need enough test cases to cover the code productively. We do not need to cover all the various physics combinations that a user might find reasonable. We need to be somewhat clever to cover "everything" while still limiting the number of tests as much as possible.
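To make that concrete, such a test option could look something like the following hypothetical set_nml.mosaic fragment. Every value below is an assumption for illustration, drawing on the namelist additions described in this PR (precalc_forc and the fixed ocean mixed layer settings; qdp_fixed = -6.0 echoes the qdp value mentioned in the commit log); actual names and values would need to match the merged code:

```
  atm_data_type = 'MOSAiC'
  ocn_data_type = 'MOSAiC'
  precalc_forc  = .true.
  qdp_fixed     = -6.0    ! deep ocean heat flux (W/m^2), overwritten by forcing
  sss_fixed     = 32.0    ! sea surface salinity (PSU), overwritten by forcing
  hmix_fixed    = 20.0    ! mixed layer depth (m), overwritten by forcing
```

Under the existing conventions, this file would live alongside the other set_nml options and be referenced from base_suite.ts as an option string on a smoke or restart test.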

@eclare108213 (Contributor)

Outstanding issues:

  • Automatically detect cadence of the inputs and adjust accordingly (currently the code requires that the atmospheric forcing is at 1 minute cadence and that the oceanic forcing is at daily cadence)

If this is easy to implement, let's do it. Otherwise I'd rather merge the PR earlier and improve it later.

  • It is not obvious to a user that setting the atmospheric forcing (atm_data_type = 'clim') can overwrite the oceanic heat flux convergence.

This could be noted in the documentation and also as a warning in the diagnostics log file when namelist values are written out.

  • The atm_MOSAiC and ocn_MOSAiC subroutines are nearly identical except for the data variables. It would probably be cleaner to combine them into a single subroutine.

Your choice.

  • Perhaps input checking should be added to prevent users from setting precalc_forc = .true. with non MOSAiC forcings.

Definitely.

I also recommend adding the forcing to our datasets and creating a few options for testing, including one that represents a 'production' configuration. I will still recommend that our codes be used in coupled configurations to capture important feedbacks, but we can adjust our stern warnings in the Icepack docs to acknowledge the MOSAiC configuration as a tool to be used thoughtfully.

@dabail10 (Contributor) commented Oct 9, 2024

Hi all. Let's discuss this at the meeting tomorrow. I can understand that it would be nice to have this as part of the Icepack forcing dataset on zenodo. However, the National Science Foundation requires that the datasets be put somewhere like the Arctic Data Center, where they currently reside. This would mean duplicating the data on zenodo. Is there a way to have a "link" from zenodo to the ADC?

@anton-seaice left a comment
Thanks for doing this David! This will be super useful

As Elizabeth noted, it needs some documentation.

And when adding the link to the forcing data to the wiki, I think the DOI link is best:

e.g. https://doi.org/10.18739/A2GX44W6J

character(char_len), public :: &
atm_data_format, & ! 'bin'=binary or 'nc'=netcdf
ocn_data_format, & ! 'bin'=binary or 'nc'=netcdf
bgc_data_format, & ! 'bin'=binary or 'nc'=netcdf
atm_data_type, & ! 'default', 'clim', 'CFS'
ocn_data_type, & ! 'default', 'SHEBA'
atm_data_type, & ! 'default', 'clim', 'CFS', 'MOSAiC'


I think we could use a more general name than MOSAiC for the new data type - it supports any 'MDF' file, right?

@davidclemenssewall (Contributor, Author) replied:

Thanks for the suggestions Anton! @eclare108213 and @apcraig, what would you think about making the atm_data_type and ocn_data_type for this 'MDF' instead of 'MOSAiC'? Then I would also propose that we make the default path for these forcing data: /icepack-dirs/input/Icepack_data/forcing/MDF.

Comment on lines +190 to +195
! fixed ocean mixed layer properties (are overwritten by forcing data)
real (kind=dbl_kind), public :: &
sss_fixed , & ! Sea surface salinity (PSU)
qdp_fixed , & ! Deep ocean heat flux (negative upward, W/m^2)
hmix_fixed ! Mixed layer depth (m)


My preference would be to do this in a different PR, so it's clear what the scope of the change is (i.e. it's not related to MOSAiC forcing data)

@davidclemenssewall (Contributor, Author) replied:

I totally agree that this would be best practice. But I admit that I'm loath to do so because the branch I originally made these modifications on is a bit stale and it doesn't feel like a productive use of time to sort through the merge conflicts and do all of the testing if the ultimate outcome for the code is the same. If others feel strongly about it I will try to get to it sometime next week. @apcraig @eclare108213 @dabail10 What do you think?

Comment on lines 1216 to 1218
call icedrv_system_abort(string=subname//&
' ERROR: only NetCDF input implemented for atm_MOSAiC', &
file=__FILE__,line=__LINE__)


I think it's clearer if you put this at line 1087, if:

if (atm_data_format /= 'nc') then ... abort

@davidclemenssewall (Contributor, Author) replied:

Done. This will be fixed in the next push.

@davidclemenssewall (Contributor, Author) commented Oct 11, 2024

> [quoting @eclare108213's review comment above in full]

Latest commit addresses these with the exception of the warnings, documentation updates, and MOSAiC testing. Working on those currently...
