Added DMI/nuopc driver #377

Merged 13 commits into CICE-Consortium:master on Nov 22, 2019

Conversation

@TillRasmussen (Contributor) commented Nov 8, 2019

For detailed information about submitting Pull Requests (PRs) to the CICE-Consortium,
please refer to: https://github.com/CICE-Consortium/About-Us/wiki/Resource-Index#information-for-developers

PR checklist

  • Short (1 sentence) summary of your PR:
    Added drivers for DMI CICE/NUOPC. The CICE cap is not included. In addition, a bug in the calculation of ANGLET was discovered and fixed.

  • Developer(s):
    Till Rasmussen, DMI

  • Suggest PR reviewers from list in the column to the right.
    Elizabeth, Tony

  • Please copy the PR test results link or provide a summary of testing completed below.

    ENTER INFORMATION HERE

  • How much do the PR code changes differ from the unmodified code?

    • [ ] bit for bit
    • [ ] different at roundoff level
    • [x] more substantial
      The bug fix solves an issue when ANGLE is averaged to ANGLET. The problem is the discontinuity in averaging angles: in this case they jump from ~-pi to ~pi, so the previous code produced 0 instead of ~pi. The new code converts the angles to cos(angle) and sin(angle), which are continuous. ANGLET is most likely only used in coupling interfaces. (A minimal sketch follows this checklist.)
  • Does this PR create or have dependencies on Icepack or any other models?

    • [ ] Yes
    • [x] No
  • Does this PR add any new test cases?

    • [ ] Yes
    • [x] No
  • Is the documentation being updated? ("Documentation" includes information on the wiki or in the .rst files from doc/source/, which are used to create the online technical docs at https://readthedocs.org/projects/cice-consortium-cice/. A test build of the technical docs will be performed as part of the PR testing.)

    • [ ] Yes
    • [x] No, does the documentation need to be updated at a later time?
      • [ ] Yes
      • [x] No
  • Please provide any additional information or relevant details below:

@TillRasmussen TillRasmussen changed the title Added DMI driver Added DMI/nuopc driver Nov 8, 2019
@eclare108213 (Contributor) left a comment

@TillRasmussen please fill out the template, and explain the ANGLET bug in ice_grid.F90. It looks like this will be answer-changing. Thanks!

@apcraig (Contributor) commented Nov 8, 2019

@TillRasmussen, let me know when you are ready for this to be reviewed. Thanks!

@apcraig (Contributor) left a comment

Till, did you forget to add the NUOPC cap? I do not see that code in the PR.

@apcraig (Contributor) left a comment

Also, I'm not sure what happened with icepack, but it's not pointing to the latest version. I think what you want to do is remove icepack as a local change. I'm not sure how to do that at this point, but the easiest thing is probably just to pull the Consortium master into your icepack directory, then add, commit, and push it again. I assume you may have to keep doing that until this PR is merged. But if that seems too frustrating, email me and we can probably find another way.

@apcraig (Contributor) commented Nov 15, 2019

I ran a full test suite on izumi. The tests ran and passed, but most were not bit-for-bit with the current master. You can see the results for hash #cbc49686812c9 at https://github.com/CICE-Consortium/Test-Results/wiki/cice_by_hash_forks. The box tests seem to be bit-for-bit, but the other grids mostly are not. I think that would be expected in this case.

I'll probably run the full suite on another machine and then I can try to do a qc test as well. Or should I wait for some further updates? I ran the code as defined on @TillRasmussen master as of this afternoon. At a minimum, we know we need an update to the icepack version, but that should not impact CICE runs. I think we can proceed knowing that there will be no answer changes in icepack even if it's revised and assuming any other changes that come in will be in the dmi driver at this point. I'll try to move forward with the testing tomorrow.

@TillRasmussen (Contributor, Author) commented

@apcraig I did not add the CICE cap, as I would like to remove the ESMF dependency from the compilation of CICE. The cap is located at https://github.com/TillRasmussen/DMI_HYCOM_CICE along with the rest of the ESMF code that I use. Do you know if there is a way of having the same file in two repositories? The main reason for doing it like this is that I will be able to compile my CICE library without linking to ESMF. If necessary I can also add the CICE cap.

@apcraig (Contributor) commented Nov 18, 2019

@TillRasmussen that makes a lot of sense. There is no requirement that the coupling cap be stored in the CICE repository for any model. As you point out, there may be good reasons to keep the cap separate.

There are probably a few ways to keep the cap in both repos, but that ultimately involves some combination of manually syncing the files when they change and modifying the build so one or the other may be skipped. I don't think there is a way to point to the same file in two repos. You could create a repo just for that NUOPC cap file and then make that repo a submodule of both your DMI_HYCOM_CICE and CICE repos. If that submodule was dropped into a separate directory, it would be easier to skip it during the build. But again, that's not a very clean solution.

A final compromise might be to copy the file manually into the CICE Consortium area but rename it so it doesn't get compiled. Say you have a file called cice_nuopc_cap.F90 in your DMI_HYCOM_CICE repo; you could copy that to driver/nuopc/dmi/cice_nuopc_cap.F90.for_information_only. Then it would be visible and probably won't build. It would be nice for others to see your cap, but what you're doing (keeping it in a separate repo) is fine too.

@TillRasmussen (Contributor, Author) commented

I think I will copy the file into the CICE repo. I would like to first solve the other issues related to the bug.

@apcraig (Contributor) commented Nov 19, 2019

I ran a QC test on gordon and it passed.

/p/home/apcraig/cice-consortium/cice.tr.nuopc>./cice.t-test.py $WORKDIR/CICE_RUNS/gordon_intel_smoke_gx1_44x1_medium_qc.191115qcb $WORKDIR/CICE_RUNS/gordon_intel_smoke_gx1_44x1_medium_qc.191115qct
INFO:main:Running QC test on the following directories:
INFO:main: /p/work1/apcraig/CICE_RUNS/gordon_intel_smoke_gx1_44x1_medium_qc.191115qcb
INFO:main: /p/work1/apcraig/CICE_RUNS/gordon_intel_smoke_gx1_44x1_medium_qc.191115qct
INFO:main:Number of files: 1825
INFO:main:2 Stage Test Passed
INFO:main:Quadratic Skill Test Passed for Northern Hemisphere
INFO:main:Southern Hemisphere data is bit-for-bit
WARNING:main:Error loading necessary Python modules in plot_data function
WARNING:main:Error loading necessary Python modules in plot_data function
WARNING:main:Error loading necessary Python modules in plot_data function
INFO:main:
INFO:main:Quality Control Test PASSED

Based on full test suites on izumi and gordon as well as the QC test, I think we can accept this change. I will approve this PR, but will wait for @eclare108213's review as well as confirmation from @TillRasmussen that no other changes are desired before merging. It seems @TillRasmussen may still want to make a few more changes; these could be included in this PR or come at a later time. If this PR is updated with non-trivial changes, we may have to retest before merging. I'm happy to help with that if that's the case.

@eclare108213 (Contributor) commented

@apcraig thanks! @TillRasmussen let us know when you've finished changes to this PR, and we'll do a final review. Thx

@TillRasmussen (Contributor, Author) commented

@apcraig and @eclare108213, I don't know which is easier:
a) add cice_cap.F90 as cice_cap.F90_for_info in a new pull request afterwards?
b) wait until after the acceptance of the pull request to update my icepack?

If both are yes, then I don't have more changes.

@apcraig (Contributor) commented Nov 19, 2019

@TillRasmussen, I'm fine waiting on both a) and b). a) has no impact on the Consortium. As noted above, it would be nice to share your cap, but not necessary. On b), I'm not sure what is being done with icepack, but testing with this change passes with the current icepack, so I think that's fine. If there are changes needed in icepack later, we can review them at that point.

@eclare108213, if you are comfortable, feel free to merge the PR. Or I can do it later as well.

@phil-blain (Member) commented

I think Icepack would be reverted to an earlier commit if this is merged now.
@TillRasmussen please try the instructions here #372 (comment) to put icepack in the right state and update this branch.

@apcraig (Contributor) commented Nov 19, 2019

Thanks @phil-blain, I had forgotten about the icepack situation, so wasn't sure what was being asked. We definitely want that fixed before we merge.

@apcraig (Contributor) commented Nov 21, 2019

@TillRasmussen, as @phil-blain points out, the process for updating icepack is not difficult and should just take a couple of minutes. I will provide a similar approach below that is maybe a little simpler to understand:

checkout/cd to your cice sandbox (i.e. git clone https://github.com/tillrasmussen/cice; cd cice; git checkout master; git submodule update --init)
cd icepack
git checkout 5af4649
cd ..
git add icepack
git commit -m "reset icepack version"
git push origin master

I could also create a PR to your repo (or possibly push directly to your repo if I have permission) if that would be easier for you. I guess we could also just merge this and then I could quickly fix it via a separate PR.

@TillRasmussen (Contributor, Author) commented

@phil-blain, @apcraig. Thank you for your suggestions. Icepack should now be updated.

@apcraig (Contributor) commented Nov 21, 2019

Looks good. I think we can merge. Will give @eclare108213 a chance for final review. Just to summarize again for @eclare108213: full test suites on izumi and gordon, with 4 compilers each, pass, but most answers change compared to the current master. A QC test on gordon passed. The latest change just resets the icepack version to the current default. Testing was done with a slightly older version of Icepack, but I don't think there will be any impact: those differences were associated not with source code changes but with scripts. If there are concerns, I could retest, but I don't think it's necessary.

@eclare108213 eclare108213 merged commit fcbea1d into CICE-Consortium:master Nov 22, 2019
@eclare108213 eclare108213 mentioned this pull request Dec 6, 2019
@apcraig apcraig mentioned this pull request Dec 9, 2019
phil-blain added a commit to phil-blain/CICE that referenced this pull request Oct 21, 2020
In fcbea1d (Added DMI/nuopc driver and fixed ANGLET (CICE-Consortium#377), 2019-11-22), the
computation in 'ice_domain::init_domain_blocks' that estimates 'max_blocks' if
the value '-1' is given in the namelist was changed without explanation.

The old computation was computing the number of blocks in the X and Y
directions, taking any necessary padding into account (by subtracting 1 from
n[xy]_global, using integer division and adding 1), multiplying them to compute
the total number of blocks, and integer-dividing by `nprocs` to estimate the number
of blocks per processor.

The new computation is similar, but it's unclear what exactly it is
computing. Since it uses floating point division and only casts the
result to an integer at the end of the computation, it systematically computes
a `max_blocks` value that is smaller than the old computation's.

This leads to a `max_blocks` value that is systematically too small for the
cartesian decomposition when `block_size_x(y)` does not divide `nx(y)_global`
evenly.

Go back to the old computation.

Also, adjust the documentation to make it clearer that it's possible that
the `max_blocks` value computed by the model might not be appropriate.
phil-blain added a commit to phil-blain/CICE that referenced this pull request Nov 6, 2020
(Same commit message as above; cherry picked from commit 2197290)
phil-blain added a commit to phil-blain/CICE that referenced this pull request Dec 17, 2020
(Same commit message as above; cherry picked from commit 2197290)
phil-blain added a commit to phil-blain/CICE that referenced this pull request Mar 29, 2021
(Same commit message as above; cherry picked from commit 2197290)
phil-blain added a commit to phil-blain/CICE that referenced this pull request Mar 29, 2021
ice_domain: revert changes to 'max_blocks' computation from fcbea1d (CICE-Consortium#377)

(Same commit message as above)
apcraig pushed a commit that referenced this pull request Apr 2, 2021
* ice_domain: revert changes to 'max_blocks' computation from fcbea1d (#377)

(Same commit message as above)

* ice_distribution: check 'max_blocks' is enough for all distributions

The subroutines 'create_distrb_cart', 'create_distrb_rake' and
'create_distrb_spacecurve', in contrast to the other distribution-creating
subroutines in module ice_distribution, do not check whether the index they are
about to access in the `blockIndex` array of the distribution they are creating
is smaller than `max_blocks`.

This results in an out-of-bounds access when `max_blocks` is too small, and
potential segmentation faults.

Add checks for these three distributions. Additionally, for the cartesian
distribution, compute the required minimum value for `max_blocks`, which can
be done easily in this case, abort early, and inform the user of the required
minimum value.

* cicecore: add 'debug_blocks' namelist variable to debug block decomposition

As mentioned in the documentation, subroutines 'ice_blocks::create_blocks' and
'ice_distribution::create_local_block_ids' can print block information to
standard out if the local variable `dbug` is modified to ".true.".

For convenience, replace these local variables with a namelist variable,
'debug_blocks', and add a 'cice.setup' option to activate this new
variable in the namelist.

Adjust the documentation accordingly.

* ice_domain: improve 'max_blocks' computation

The value of 'max_blocks' as currently computed can still be too small
if the number of procs does not divide the number of blocks evenly.

Use the same 'minus one, integer divide, plus one' trick as is done for
the number of blocks in the X and Y directions in order to always
compute a 'max_blocks' value that is large enough.

This estimates the same value for 'max_blocks' as the 'cice_decomp.csh'
script computes:

    @ bx = $nxglob / ${blckx}
    if ($bx * ${blckx} != $nxglob) @ bx = $bx + 1
    @ by = $nyglob / ${blcky}
    if ($by * ${blcky} != $nyglob) @ by = $by + 1

    @ m = ($bx * $by) / ${task}
    if ($m * ${task} != $bx * $by) @ m = $m + 1
    set mxblcks = $m