Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Nml settings #208

Merged
merged 7 commits into from
Oct 19, 2018
Merged

Nml settings #208

merged 7 commits into from
Oct 19, 2018

Conversation

duvivier
Copy link
Contributor

A number of namelist changes to address issue #123

  • Developer(s): Alice DuVivier

  • Please suggest code Pull Request reviewers in the column at right.

  • Are the code changes bit for bit, different at roundoff level, or more substantial?
    They should be bit for bit, but a few tests are failing that I can't figure out why and am hoping @apcraig can help here.

  • Please include the link to test results or paste the summary block from the bottom of the testing output below.
    https://github.com/CICE-Consortium/Test-Results/wiki/8512623310.cheyenne.intel.181015.231518

  • Does this PR create or have dependencies on Icepack or any other models? No

  • Is the documentation being updated with this PR? (Y/N) Y
    If not, does the documentation need to be updated separately at a later time? (Y/N) N/A

  • Other Relevant Details:
    These variable were already modified on the namelist, so don't need to be addressed (update namelist and documentation #123 list was in error):
    emissivity
    tfrz_option: mushy_layer --> mushy

Namelist re-naming/moving/adding:
default_season (Added)
sst_data_type --> ocn_data_type (Renamed)
sss_data_type --> bgc_data_type (Renamed)
restore_sst --> restore_ocn (Renamed)
shortwave: default --> ccsm3 (option renamed)
albedo_type: default --> ccsm3 (option renamed)
bgc_data_dir --> forcing_nml (moved consistent w/ Icepack)
sil_data_type, nit_data_type, and fe_data_type --> forcing_nml (moved consistent with icepack)

I did not change the bgc data types (Sil, nit, fe) because despite @njeffery's comment it was a lot less clear how this should be done cleanly given the way these are implemented in the code. I'm open to further discussion on this since it still feels a bit messy to me.

Copy link
Contributor

@eclare108213 eclare108213 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not obvious to me why the one case is failing. roundrobin seems to be rather touchy.

Why reorder the forcing_nml namelist? Does this match something else better? These were ordered roughly (imperfectly) with general items first, then atmo, then ocean, then ice restoring. I recommend keeping the order the same as in Icepack, to the extent possible.

I think you can simply replace sil_data_type etc with bgc_data_type and then simplify the logic in the code (remove duplicates). This removes the ability to mix and match bgc data sets, but the tests that we have now don't mix and match... The one section of the code that wouldn't work would be where nit_data_type is allowed to be 'sss'. We could add logic so that the other blocks are activated if bgc_data_type is 'clim' or 'ISPOL' or 'sss', or we could abandon that (sss) functionality in the interest of simplicity. @njeffery ? HOWEVER since @njeffery has a couple of PRs waiting, let's wait on this change, to avoid getting even more conflicts in the others.

@duvivier
Copy link
Contributor Author

I also didn't know why just the one test is failing. I'm hoping @apcraig might lend some insight here.

I ordered the ice_in &forcing_nml section to roughly parallel the icepack_in &forcing_nml section (see below, very bottom of comment). They aren't one-to-one, but I could make them as close as possible. That's very minor, so just let me know what you think.

My concern with the {sil,nit,fe}_data_type switch to bgc_data_type was exactly as you describe. THe mixing and matching would go away and I'm not sure how critical or not that is and that @njeffery would likely need to be involved and I wasn't sure she had the time now. I think if we just keep it in the current format and maybe add an "improvement" issue for future modification that is the best step at this time. I think we've converged that this will wait.

&forcing_nml (from icepack_in for icepack)
formdrag = .false.
atmbndy = 'default'
calc_strair = .true.
calc_Tsfc = .true.
highfreq = .false.
natmiter = 5
ustar_min = 0.0005
emissivity = 0.95
fbot_xfer_type = 'constant'
update_ocn_f = .false.
l_mpond_fresh = .false.
tfrz_option = 'mushy'
oceanmixed_ice = .true.
restore_ocn = .false.
trestore = 90
precip_units = 'mks'
default_season = 'spring'
atm_data_type = 'clim'
ocn_data_type = 'SHEBA'
bgc_data_type = 'default'
fyear_init = 2015
ycycle = 1
data_dir = '/Users/ftuser/Desktop/CICE-Consortium/ICEPACK_DATA/'
atm_data_file = 'unknown_atm_data_file'
ocn_data_file = 'unknown_ocn_data_file'
bgc_data_file = 'unknown_bgc_data_file'
ice_data_file = 'open_clos_lindsay.dat'
atm_data_format = 'bin'
ocn_data_format = 'bin'
bgc_data_format = 'bin'

&forcing_nml (from ice_in from cice)
atmbndy = 'default'
calc_strair = .true.
calc_Tsfc = .true.
update_ocn_f = .false.
l_mpond_fresh = .false.
ustar_min = 0.0005
fbot_xfer_type = 'constant'
oceanmixed_ice = .true.
emissivity = 0.95
formdrag = .false.
highfreq = .false.
natmiter = 5
tfrz_option = 'mushy'
default_season = 'winter'
precip_units = 'mm_per_month'
fyear_init = 1997
ycycle = 1
restore_ocn = .false.
trestore = 90
restore_ice = .false.
atm_data_type = 'ncar'
ocn_data_type = 'default'
bgc_data_type = 'default'
sil_data_type = 'default'
nit_data_type = 'default'
fe_data_type = 'default'
atm_data_format = 'bin'
ocn_data_format = 'bin'
oceanmixed_file = 'unknown_oceanmixed_file'
atm_data_dir = '/glade/u/home/tcraig/cice_data/'
ocn_data_dir = '/unknown_ocn_data_dir'
bgc_data_dir = '/uknown_bgc_data_dir'

@duvivier
Copy link
Contributor Author

@apcraig The error I'm getting on the failed test is this:

WARNING: evp_prep calculates surface tilt
WARNING: stress from geostrophic currents,
WARNING: not data from ocean forcing file.
WARNING: Alter ice_dyn_evp.F if desired.
ocean mixed layer forcing data file =
oceanmixed_ice_depth.nc/unknown_oceanmixed_file

(abort_ice)ABORTED:
(abort_ice) error =
(ice_open_nc)ERROR: Cannot open oceanmixed_ice_depth.nc/unknown_oceanmixed_file
MPT ERROR: Rank 0(g:0) is aborting with error code 0.

Log is here:
/glade/u/home/duvivier/code/CICE/testsuite.cice_nml/cheyenne_intel_restart_gx1_40x4_droundrobin_short.cice_nml/logs/

In the log file from the CICE_master test suite roundrobin simulation I don't see anything at all about trying to read the oceanmixed_ice_depth.nc file and this didn't show up in any other tests. It looks to me like somehow the 'ocn_data_dir' is somehow automatically being set to this file. I don't think this is something I messed up in coding it up (if so I can't find it in the code I changed).

My best guess is that the option is set incorrectly here: configuration/scripts/options/set_nml.gx1 where we have 'ocn_data_dir = 'oceanmixed_ice_depth.nc'. I think maybe this should be:
ocn_data_dir = /glade/p/cesm/pcwg_dev/CICE_data/forcing/gx1/
oceanmixed_file = 'oceanmixed_ice_depth.nc'

That being said, there isn't an oceanmixed_ice_depth.nc file in the forcing directory structure listed above. I did do some poking around and found it here: /glade/p/cesm/pcwg_dev/other_files/ but this seems odd. If we need it for the test suite shouldn't I include it in the CICE_data regular structure/tarball and if it is needed, why aren't other tests blowing up.

Does this seem like the right path or is something else going on?

@duvivier
Copy link
Contributor Author

@eclare108213 I just updated so that the namelist order now matches the icepack namelist order.

I re-ran the test suite and it's failing on the same roundrobin test:
https://github.com/CICE-Consortium/Test-Results/wiki/34c5284600.cheyenne.intel.181016.104600

The error message is the same (see path below), so @apcraig hopefully you can help me unravel the issue I highlighted in the previous message to this thread.
/glade/u/home/duvivier/code/CICE/testsuite.cice_nml2/cheyenne_intel_restart_gx1_40x4_droundrobin_short.cice_nml2/logs/cice.runlog.181016-045352

@apcraig
Copy link
Contributor

apcraig commented Oct 18, 2018

Sorry, was on travel but back now. I had a quick look. This is the only gx1 test and it is the only one failing. I do see a possible typo,

bgc_data_dir = /uknown_bgc_data_dir

I don't know if that is creating some problems (missing n). The string "unknown" is a special string in a few places. That should probably be fixed.

Comparing the gx3 and gx1 set_nml files, there is a difference that the ocn_data is set for gx1 but not gx3. It actually does look like the ocean mixed layer forcing data file is not correct. The error message says,

ocean mixed layer forcing data file =
oceanmixed_ice_depth.nc/unknown_oceanmixed_file

So there is a problem with the namelist. The namelist has

oceanmixed_file = 'unknown_oceanmixed_file'
ocn_data_dir    = 'oceanmixed_ice_depth.nc'

I think ocn_data_dir is supposed to be a directory while oceanmixed_file should be the nc file or something like that? I think it comes down to these settings in set_nml.gx1,

ocn_data_format = 'nc'
ocn_data_type = 'ncar'
ocn_data_dir = 'oceanmixed_ice_depth.nc'

I note the set_nml.gx1 has not changed at all. Is that right?

And that something there is not set properly. I also see that ocn_data_dir is set to /unknown... where it used to be just unknown... in the default ice_in file. Is that correct?

My next step will be to run some cases with old and new sandboxes and try to understand what's different better and debug further. Let me know if you want me to do that.

@eclare108213
Copy link
Contributor

Just a quick clarification that we have an ocean forcing data file for gx1 but not for gx3.

@duvivier
Copy link
Contributor Author

@apcraig
Thanks for finding that typo. I'm still running into round robin issues.

I am still running into some testing issues with the roundrobin case.

  1. If I don't set a forcing files in the set_nml.gx1 I get an error and it exits
    (see log files in: /glade/u/home/duvivier/code/CICE/testsuite.cice_nml6/)
    ocean mixed layer forcing data file =
    /unknown_ocn_data_dir/unknown_oceanmixed_file

(abort_ice)ABORTED:
(abort_ice) error =
(ice_open_nc)ERROR: Cannot open /unknown_ocn_data_dir/unknown_oceanmixed_file
MPT ERROR: Rank 0(g:0) is aborting with error code 0.

  1. If I do set the correct forcing file and path in set_nml.gx1 I get a seg fault.
    (see log files in: /glade/u/home/duvivier/code/CICE/testsuite.cice_nml4/)
    ocean mixed layer forcing data file =
    /glade/p/cesm/pcwg_dev/other_files//oceanmixed_ice_depth.nc

forrtl: severe (174): SIGSEGV, segmentation fault occurred

@duvivier
Copy link
Contributor Author

@eclare108213 Yes, there is a forcing file for gx1 but not gx3. However we currently don't include this file as part of our ftp tarballs, so I don't think it should be necessary for tests.

I guess what I'm confused about is why this roundrobin test fails for me but not in the main test suite. I have checked the master test suite against which I'm comparing and it has the typo on the ice_in namelist:
oceanmixed_ice = .true.
ocn_data_dir = 'oceanmixed_ice_depth.nc'
oceanmixed_file = 'unknown_oceanmixed_file'
However, it doesn't crash because it can't open this file and there are no errors or warnings about it I can find in the log.

@apcraig
Copy link
Contributor

apcraig commented Oct 18, 2018

If you diff ice_in from master and your current run, there are several diffs and they probably matter. < is in the master, > is in the new case. These should be reviewed. I think the namelist setup has changed. In particular, the new namelist has "ocn_data_type=ncar" while the old one had "sst_data_type=default"?

<     albedo_type     = 'default'
---
>     albedo_type     = 'ccsm3'
57a58
>     bgc_data_type   = 'default'
70a72
>     default_season  = 'winter'
447a450
>     ocn_data_type   = 'ncar'
500c503
<     restore_sst     = .false.
---
>     restore_ocn     = .false.
518,519d520
<     sss_data_type   = 'default'
<     sst_data_type   = 'default'

@duvivier
Copy link
Contributor Author

@apcraig Yes, there are differences in the namelist since some of these are specific changes @eclare108213 requested in issue #123 .

Specifically,

  • Changing the nml order to reflect the icepack order (this shouldn't matter really)
  • Changing albedo_type of default to be named ccsm3 (looks like the change was done correctly in the nml comparison)
  • Adding default season to the namelist rather than having it l_winter or l_spring hardwired in the code in an ifdef statement (see ice_flux.F90). Since this wasn't a coupled case l_winter = .true. with the master version and in the modified namelist default_season = winter, so I think this looks good.
  • changing restore_sst to restore_ocn (looks like the change was done correctly in the nml comparison)
  • changing sss_data_type to bgc_data_type (looks like the change was done correctly in the nml comparison)
  • changing sst_data_type to sss_data_type

Of the changes you list everything looks okay except sst_data_type being set from 'default' in the master case to 'ncar' in the test I'm running. I'll look into this more.

@eclare108213
Copy link
Contributor

The erroneous data directory is getting set with the gx1 grid option:

set_nml.gx1:ocn_data_dir = 'oceanmixed_ice_depth.nc'

which obviously needs to be fixed to include both ocn_data_dir and ocn_data_file, correctly. This file also sets ocn_data_type = 'ncar', as it should.

I don't know why that's not crashing all of the gx1 runs, unless the logic in the code that handles ocn_data_dir and ocn_data_file is wrong.

So there might be several things that need to be fixed. My initial suspicion is that some parameter is not getting set properly, or there's a memory conflict, or something like that, which changes depending on the parallelization/decomposition. I suggest starting by turning on the debug option, to rule out that kind of thing. At some point it might be necessary to step through the code with a debugger. I can do this in my office (where I'm not, at the moment)...

And then we should take a good look at the options files to make sure that the configurations make sense.

@apcraig
Copy link
Contributor

apcraig commented Oct 18, 2018

I don't think this is a decomposition issue. This is the only gx1 test, it just happens to be roundrobin. But I don't believe the decomp is playing a role. I think the question is why is this broken now and not on the master. If "ocn_data_type=ncar" and "sst_data_type=default" are not the same setting, then that is probably playing a bit role.

I'm not clear whether the gx1 should or should not be using the ocn_data_type data. It could be that we have never run the ocn_data_type setting on any cases before and it would be broken on the master in the same way if we turned it on. If that's the case, it's really just a matter of getting the settings and filenames in the namelist correct.

@eclare108213
Copy link
Contributor

That makes sense. The gx1prod option should use the ocean forcing data file -- but it might not have been run since the bug was introduced into gx1 option file (in May).

@apcraig
Copy link
Contributor

apcraig commented Oct 19, 2018

I have confirmed that if in gx1, we set sst_data_type to 'ncar', the test case fails. We had ocn_data_type='ncar' and that was doing nothing to the namelist. I am also modifying the parse_namelist stuff so we catch errors in the namelist options if they occur.

@duvivier
Copy link
Contributor Author

@apcraig Just to clarify. It sounds like the error was that in the current consortium master we set ocn_data_type in the configuration/scripts/options/set_nml.gx1 and since that wasn't a valid namelist option it was ignored and the test ran fine. But when you modified this to be sst_data_type, which is a valid namelist option, and set it to 'ncar' the consortium master gx1 roundrobin fails. Is this all right?

I did not see sst_data_type in any of the set_nml.X options. ocn_data_type was already set in set_nml.gx1, set_nml.bgcNICE, and set_nml.bgcISPOL. I have not tested all these to make sure all pass and that ocn_data_type works properly.

Just to mention. I did try to go through and look at all the bgc_data_dir and {bgc,sil,nit,fe_data_type} options and this was a big, confusing mess. We don't provide half those files and it's unclear whether we should be able to mix and match datasets easily in CICE. It should be cleaned up, I just didn't want to mess too much with BGC stuff since I'm not sure exactly what other developments are in the works. Just my thoughts on this for the future.

@apcraig I just want to clarify next steps:

  • Are you wanting me to make the changes to parse_namelist, or you're adding this as a separate pull request?
  • Do you want me to change set_nml.gx1 so that ocn_data_type = 'default' for this PR?
  • Or should we just ignore this test for the time being? If so, I think all the relevant tests are passing and the PR should be ready to merge. I will assume one of you will do the merge since I think you should be responsible for the code merges (I can do documentation ones).

@eclare108213
Copy link
Contributor

After chatting with @njeffery and exploring the tracer issues a bit, I have a pretty good idea of what needs to be done for the bgc. There are extra options in the code, which we can leave in there and not support, or comment out, or even delete. Once we have the basic tests working, I think we should merge the changes so far, and then work on cleaning it up further.

@duvivier
Copy link
Contributor Author

Ok. @apcraig when you can weigh in on how to proceed I will try to get everything squared up tomorrow so it's ready to merge. @eclare108213 do you want this done before or after #207 ? Does it matter?

@eclare108213
Copy link
Contributor

It probably doesn't matter. If you have it ready to go, we should merge it and then deal with the consequences to #207.

@apcraig
Copy link
Contributor

apcraig commented Oct 19, 2018

I will create a PR shortly to bring in a few changes including an update to parse_namelist. I think we can merge this PR now knowing the gx1 case is failing. We need to fix the gx1 case, but lets take that separately. Maybe I will wait for this to merge, update my PR, and try to fix the gx1.

Separately, I'll let @eclare108213 take the lead on the bgc stuff. We'll want to merge that when we can.

@apcraig apcraig dismissed eclare108213’s stale review October 19, 2018 15:34

Got verbal approval to move ahead

@apcraig apcraig merged commit 01494c7 into CICE-Consortium:master Oct 19, 2018
@duvivier duvivier deleted the nml_settings branch October 19, 2018 15:42
@duvivier duvivier mentioned this pull request Nov 7, 2018
phil-blain added a commit to phil-blain/CICE that referenced this pull request Mar 21, 2022
When the 'default_season' namelist setting was added in 01494c7 (Nml
settings (CICE-Consortium#208), 2018-10-19) to replace 'l_winter' and 'l_spring', a
call to 'broadcast_scalar' was forgotten, such that the 'default_season'
value from the namelist is only use on the first MPI process; all other
processes get the hardcoded default value 'winter' defined in
'ice_init::input_data', resulting in different initialization across the
grid for several variables if anything other than 'winter' is used in
the namelist.

Fix that by broadcasting 'default_season' to all MPI procs.
phil-blain added a commit to phil-blain/CICE that referenced this pull request Mar 22, 2022
When the 'default_season' namelist setting was added in 01494c7 (Nml
settings (CICE-Consortium#208), 2018-10-19) to replace 'l_winter' and 'l_spring', a
call to 'broadcast_scalar' was forgotten, such that the 'default_season'
value from the namelist is only use on the first MPI process; all other
processes get the hardcoded default value 'winter' defined in
'ice_init::input_data', resulting in different initialization across the
grid for several variables if anything other than 'winter' is used in
the namelist.

Fix that by broadcasting 'default_season' to all MPI procs.
phil-blain added a commit to phil-blain/CICE that referenced this pull request Oct 6, 2022
When the 'default_season' namelist setting was added in 01494c7 (Nml
settings (CICE-Consortium#208), 2018-10-19) to replace 'l_winter' and 'l_spring', a
call to 'broadcast_scalar' was forgotten, such that the 'default_season'
value from the namelist is only used on the first MPI process; all other
processes get the hardcoded default value 'winter' defined in
'ice_init::input_data', resulting in different initialization across the
grid for several variables if anything other than 'winter' is used in
the namelist.

Fix that by broadcasting 'default_season' to all MPI procs.
apcraig pushed a commit that referenced this pull request Oct 11, 2022
When the 'default_season' namelist setting was added in 01494c7 (Nml
settings (#208), 2018-10-19) to replace 'l_winter' and 'l_spring', a
call to 'broadcast_scalar' was forgotten, such that the 'default_season'
value from the namelist is only used on the first MPI process; all other
processes get the hardcoded default value 'winter' defined in
'ice_init::input_data', resulting in different initialization across the
grid for several variables if anything other than 'winter' is used in
the namelist.

Fix that by broadcasting 'default_season' to all MPI procs.
phil-blain added a commit to phil-blain/CICE that referenced this pull request Feb 2, 2024
When the 'default_season' namelist setting was added in 01494c7 (Nml
settings (CICE-Consortium#208), 2018-10-19) to replace 'l_winter' and 'l_spring', a
call to 'broadcast_scalar' was forgotten, such that the 'default_season'
value from the namelist is only use on the first MPI process; all other
processes get the hardcoded default value 'winter' defined in
'ice_init::input_data', resulting in different initialization across the
grid for several variables if anything other than 'winter' is used in
the namelist.

Fix that by broadcasting 'default_season' to all MPI procs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants