C/CD not bit-for-bit with different decomps #700

Closed
apcraig opened this issue Mar 10, 2022 · 1 comment

apcraig commented Mar 10, 2022

This only seems to be a problem in some cases, which suggests a subtle issue. The bfbcomp results from the full test suite on cheyenne are shown below for gnu (pgi and intel gave the same results). Most tests pass, but the gx1 cases fail:

PASS cheyenne_gnu_smoke_gx3_4x2_cmplogrest_diag1_gridc_reprosum_run10day bfbcomp cheyenne_gnu_smoke_gx3_8x4_diag1_gridc_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_4x1_cmplogrest_diag1_gridc_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_8x4_diag1_gridc_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_8x1_alt01_cmplogrest_gridc_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_6x2_alt01_gridc_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_8x1_alt02_cmplogrest_gridc_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_8x2_alt02_gridc_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_8x1_alt03_cmplogrest_gridc_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_12x2_alt03_droundrobin_gridc_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_8x1_alt04_cmplogrest_gridc_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_4x4_alt04_gridc_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_8x1_alt05_cmplogrest_gridc_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_4x4_alt05_gridc_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_8x1_alt06_cmplogrest_gridc_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_8x2_alt06_gridc_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_8x1_bgcz_cmplogrest_gridc_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_8x2_bgcz_gridc_reprosum_run10day
FAIL cheyenne_gnu_smoke_gx1_18x1_cmplogrest_gridc_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx1_15x2_gridc_reprosum_run10day different-data
FAIL cheyenne_gnu_smoke_gx1_18x1_cmplogrest_gridc_reprosum_run10day_seabedprob_thread bfbcomp cheyenne_gnu_smoke_gx1_15x2_gridc_reprosum_run10day_seabedprob different-data
PASS cheyenne_gnu_smoke_gx3_8x1_cmplogrest_fsd12_gridc_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_14x2_fsd12_gridc_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_8x1_cmplogrest_gridc_isotope_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_11x2_gridc_isotope_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_8x1_cmplogrest_gridc_icdefault_reprosum_run10day_snwgrain_snwitdrdg_thread bfbcomp cheyenne_gnu_smoke_gx3_8x4_gridc_icdefault_reprosum_run10day_snwgrain_snwitdrdg
PASS cheyenne_gnu_smoke_gx3_8x1_cmplogrest_gridc_reprosum_run10day_thread_zsal bfbcomp cheyenne_gnu_smoke_gx3_8x3_gridc_reprosum_run10day_zsal
PASS cheyenne_gnu_smoke_gbox128_8x1_cmplogrest_gridc_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gbox128_8x2_gridc_reprosum_run10day
PASS cheyenne_gnu_smoke_gbox128_8x1_boxnodyn_cmplogrest_gridc_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gbox128_12x2_boxnodyn_gridc_reprosum_run10day
PASS cheyenne_gnu_smoke_gbox128_8x1_boxadv_cmplogrest_gridc_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gbox128_9x2_boxadv_gridc_reprosum_run10day
PASS cheyenne_gnu_smoke_gbox128_8x1_boxrestore_cmplogrest_gridc_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gbox128_14x2_boxrestore_gridc_reprosum_run10day
PASS cheyenne_gnu_smoke_gbox80_8x1_box2001_cmplogrest_gridc_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gbox80_4x5_box2001_gridc_reprosum_run10day
PASS cheyenne_gnu_smoke_gbox80_8x1_boxslotcyl_cmplogrest_gridc_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gbox80_11x3_boxslotcyl_gridc_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_4x2_cmplogrest_diag1_gridcd_reprosum_run10day bfbcomp cheyenne_gnu_smoke_gx3_8x4_diag1_gridcd_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_4x1_cmplogrest_diag1_gridcd_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_8x4_diag1_gridcd_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_8x1_alt01_cmplogrest_gridcd_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_6x2_alt01_gridcd_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_8x1_alt02_cmplogrest_gridcd_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_8x2_alt02_gridcd_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_8x1_alt03_cmplogrest_gridcd_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_12x2_alt03_droundrobin_gridcd_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_8x1_alt04_cmplogrest_gridcd_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_4x4_alt04_gridcd_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_8x1_alt05_cmplogrest_gridcd_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_4x4_alt05_gridcd_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_8x1_alt06_cmplogrest_gridcd_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_8x2_alt06_gridcd_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_8x1_bgcz_cmplogrest_gridcd_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_8x2_bgcz_gridcd_reprosum_run10day
FAIL cheyenne_gnu_smoke_gx1_18x1_cmplogrest_gridcd_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx1_15x2_gridcd_reprosum_run10day different-data
FAIL cheyenne_gnu_smoke_gx1_18x1_cmplogrest_gridcd_reprosum_run10day_seabedprob_thread bfbcomp cheyenne_gnu_smoke_gx1_15x2_gridcd_reprosum_run10day_seabedprob different-data
PASS cheyenne_gnu_smoke_gx3_8x1_cmplogrest_fsd12_gridcd_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_14x2_fsd12_gridcd_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_8x1_cmplogrest_gridcd_isotope_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gx3_11x2_gridcd_isotope_reprosum_run10day
PASS cheyenne_gnu_smoke_gx3_8x1_cmplogrest_gridcd_icdefault_reprosum_run10day_snwgrain_snwitdrdg_thread bfbcomp cheyenne_gnu_smoke_gx3_8x4_gridcd_icdefault_reprosum_run10day_snwgrain_snwitdrdg
PASS cheyenne_gnu_smoke_gx3_8x1_cmplogrest_gridcd_reprosum_run10day_thread_zsal bfbcomp cheyenne_gnu_smoke_gx3_8x3_gridcd_reprosum_run10day_zsal
PASS cheyenne_gnu_smoke_gbox128_8x1_cmplogrest_gridcd_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gbox128_8x2_gridcd_reprosum_run10day
PASS cheyenne_gnu_smoke_gbox128_8x1_boxnodyn_cmplogrest_gridcd_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gbox128_12x2_boxnodyn_gridcd_reprosum_run10day
PASS cheyenne_gnu_smoke_gbox128_8x1_boxadv_cmplogrest_gridcd_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gbox128_9x2_boxadv_gridcd_reprosum_run10day
PASS cheyenne_gnu_smoke_gbox128_8x1_boxrestore_cmplogrest_gridcd_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gbox128_14x2_boxrestore_gridcd_reprosum_run10day
PASS cheyenne_gnu_smoke_gbox80_8x1_box2001_cmplogrest_gridcd_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gbox80_4x5_box2001_gridcd_reprosum_run10day
PASS cheyenne_gnu_smoke_gbox80_8x1_boxslotcyl_cmplogrest_gridcd_reprosum_run10day_thread bfbcomp cheyenne_gnu_smoke_gbox80_11x3_boxslotcyl_gridcd_reprosum_run10day

The failing gx1 tests come from the following test suite entries:

smoke          gx1    15x2        reprosum,run10day
smoke          gx1    15x2        seabedprob,reprosum,run10day
smoke          gx1    18x1        reprosum,run10day,cmplogrest,thread             smoke_gx1_15x2_reprosum_run10day
smoke          gx1    18x1        seabedprob,reprosum,run10day,cmplogrest,thread  smoke_gx1_15x2_reprosum_run10day_seabedprob
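
To confirm at a glance that only the gx1 cases fail, the PASS/FAIL lines above can be grouped by grid. The following is a minimal sketch, not part of the CICE test scripts. It assumes each result line has the form `STATUS test_name bfbcomp baseline_name [reason]` and that test names follow the layout `machine_compiler_test_grid_pes_options`, as the names in this issue appear to. The function name `summarize_bfbcomp` is hypothetical.

```python
from collections import Counter


def summarize_bfbcomp(lines):
    """Count bfbcomp results per (grid, status) pair."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        # Expected shape: STATUS test_name bfbcomp baseline_name [reason]
        if len(fields) < 4 or fields[2] != "bfbcomp":
            continue
        status, test_name = fields[0], fields[1]
        parts = test_name.split("_")
        if len(parts) < 4:
            continue
        # Assumed name layout: machine_compiler_test_grid_pes_options...
        grid = parts[3]
        counts[(grid, status)] += 1
    return counts


# Three lines copied from the results in this issue.
results = [
    "PASS cheyenne_gnu_smoke_gx3_4x2_cmplogrest_diag1_gridc_reprosum_run10day "
    "bfbcomp cheyenne_gnu_smoke_gx3_8x4_diag1_gridc_reprosum_run10day",
    "FAIL cheyenne_gnu_smoke_gx1_18x1_cmplogrest_gridc_reprosum_run10day_thread "
    "bfbcomp cheyenne_gnu_smoke_gx1_15x2_gridc_reprosum_run10day different-data",
    "PASS cheyenne_gnu_smoke_gbox128_8x1_cmplogrest_gridc_reprosum_run10day_thread "
    "bfbcomp cheyenne_gnu_smoke_gbox128_8x2_gridc_reprosum_run10day",
]

summary = summarize_bfbcomp(results)
for (grid, status), n in sorted(summary.items()):
    print(grid, status, n)
```

Feeding in the full list of result lines should show FAIL entries only under gx1, matching the observation above.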

apcraig commented Mar 14, 2022

Two issues were found and will be PR'ed to the cgridDEV branch soon.

  1. The logic for keeping gridcells and land blocks with C/CD is not inclusive enough. In init_domain_distribution, gridcells on the B grid are kept if KMTG > puny at that gridcell. A short-term fix for C/CD is to keep any gridcell where KMTG at that gridcell or at any of its neighbors is greater than zero. This seems to fix the problem, but we need to look further into why C/CD needs a larger stencil and what the right implementation is. A good test case for this was
FAIL cheyenne_gnu_smoke_gx3_1x1x5x4x580_gridc_reprosum_run2day bfbcomp cheyenne_gnu_smoke_gx3_1x1x100x116x1_gridc_reprosum_run2day different-data
  2. maskhalo_dyn = .true. produces non-bit-for-bit results in a limited number of cases with C/CD. Temporarily, maskhalo_dyn is disabled in subroutine evp if grid_ice is not "B". Again, we need to investigate this further, understand what the right haloed mask is for C/CD, and then implement it. A couple of test cases for this were
FAIL cheyenne_gnu_smoke_gx1_18x1_cmplogrest_gridc_reprosum_run10day bfbcomp cheyenne_gnu_smoke_gx1_32x1_gridc_reprosum_run10day different-data
FAIL cheyenne_gnu_smoke_gx1_32x1x16x12x40_cmplogrest_dwblockall_gridc_histdbg_reprosum_run1day bfbcomp cheyenne_gnu_smoke_gx1_32x1x16x16x32_cmplogrest_dwblockall_gridc_histdbg_reprosum_run1day different-data
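
The two temporary fixes described above can be illustrated with a small sketch. This is Python for illustration only, not the Fortran in CICE. It assumes an 8-point neighborhood for "any neighbors", ignores points outside the array (no periodic or tripole wraparound), and uses `PUNY = 1.0e-11` as a stand-in for the model's puny. The function names `keep_b`, `keep_ccd`, and `use_maskhalo_dyn` are hypothetical.

```python
PUNY = 1.0e-11  # assumed stand-in for the model's small threshold "puny"


def keep_b(kmt, i, j):
    """B-grid rule: keep the gridcell only if it is itself ocean."""
    return kmt[j][i] > PUNY


def keep_ccd(kmt, i, j):
    """Short-term C/CD rule: keep the gridcell if it or any neighbor is ocean.

    Assumes an 8-point neighborhood and skips out-of-range neighbors.
    """
    ny, nx = len(kmt), len(kmt[0])
    for dj in (-1, 0, 1):
        for di in (-1, 0, 1):
            jj, ii = j + dj, i + di
            if 0 <= jj < ny and 0 <= ii < nx and kmt[jj][ii] > 0:
                return True
    return False


def use_maskhalo_dyn(maskhalo_dyn, grid_ice):
    """Temporary rule: honor maskhalo_dyn only on the B grid."""
    return maskhalo_dyn and grid_ice == "B"


# One ocean point surrounded by land.
kmt = [
    [0, 0, 0, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
cells = [(i, j) for j in range(len(kmt)) for i in range(len(kmt[0]))]
b_kept = sum(keep_b(kmt, i, j) for i, j in cells)
ccd_kept = sum(keep_ccd(kmt, i, j) for i, j in cells)
print(b_kept, ccd_kept)
```

With a single ocean point, the B-grid rule keeps only that point, while the wider rule also keeps the ring of land points around it. This is the sense in which the C/CD logic is more inclusive.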

apcraig added a commit to apcraig/CICE that referenced this issue Mar 14, 2022
  - modify gridcell + land block elimination for C/CD, using a larger stencil, needs to be investigated further
  - turn off maskhalo_dyn for C/CD, needs to be investigated further
- Update omp_suite and gridsys_suite to extend testing
- Add new histdbg option to turn on some output at each timestep to help debugging
apcraig closed this as completed Apr 19, 2022