mod_def files do not reproduce in unstructured cases #700

Closed
JessicaMeixner-NOAA opened this issue May 4, 2022 · 6 comments · Fixed by #983
Labels: bug (Something isn't working)

Comments

@JessicaMeixner-NOAA
Collaborator

Describe the bug
When the develop branch (as of May 3, 2022) is run twice, the mod_def files for the following tests do not reproduce:

ww3_tp2.17/./work_c      (1 files differ)
ww3_tp2.17/./work_b      (1 files differ)
ww3_tp2.17/./work_a      (1 files differ)
ww3_tp2.6/./work_ST0     (1 files differ)
ww3_tp2.6/./work_ST4     (1 files differ)
ww3_tp2.6/./work_pdlib   (1 files differ)

To Reproduce
Run the matrix_cmake_ncep from the develop branch twice and use matrix.comp to compare.

Expected behavior
The mod_def files should be the same in two consecutive runs.
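
To narrow down where two mod_def files diverge, a byte-level comparison can help. The following is a minimal Python sketch, not part of the WW3 tooling and not a replacement for matrix.comp; the function name `first_difference` is hypothetical and the files are treated as opaque binary data.

```python
from typing import Optional


def first_difference(path_a: str, path_b: str, chunk_size: int = 1 << 16) -> Optional[int]:
    """Return the byte offset of the first difference, or None if the files are identical."""
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                # Locate the first mismatching byte inside this chunk.
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # No mismatch in the common part: one file is a prefix of the other.
                return offset + min(len(a), len(b))
            if not a:
                # Both files are exhausted and no difference was found.
                return None
            offset += len(a)
```

Calling it on the same mod_def file from two consecutive runs returns `None` when they reproduce, or the offset of the first differing byte, which can then be matched against the order in which records are written.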

JessicaMeixner-NOAA added the bug label on May 4, 2022
@JessicaMeixner-NOAA
Collaborator Author

FYI @aliabdolali @thesser1 @maryanderson64

In the past, similar issues were caused by an uninitialized variable.

@JessicaMeixner-NOAA
Collaborator Author

This is still an issue as of
commit 84c79ce
Date: Wed May 18 11:12:32 2022 -0400

Has there been any progress on replicating this, identifying which commit introduced it, or fixing it, @aliabdolali?

@JessicaMeixner-NOAA
Collaborator Author

Any updates on this, @aliabdolali @thesser1?

@aronroland
Collaborator

@JessicaMeixner-NOAA, for explicit cases we are looking into this bug, and the only remaining problem is the ICE part.

For implicit cases, the Block Gauss-Seidel solver cannot, by definition, provide CPU coherency, so we will move to the Jacobi solver for the implicit part.

There is neither a timeline nor a project for the implicit part.
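
To illustrate the solver point, here is a toy Python sketch, reading "CPU coherency" as results that do not depend on how the unknowns are split across processes. It is not WW3's solver. It assumes a simplified model in which each rank runs Gauss-Seidel on its own block and sees only the previous sweep's values from other ranks; the matrix, the function names, and the partitions are all hypothetical.

```python
def jacobi_sweep(A, b, x, blocks):
    """One Jacobi sweep: every row is updated from the old vector x only."""
    n = len(b)
    new = list(x)
    for block in blocks:
        for i in block:
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            new[i] = (b[i] - s) / A[i][i]
    return new


def decomposed_gauss_seidel_sweep(A, b, x, blocks):
    """One sweep of a toy domain-decomposed Gauss-Seidel.

    Inside a block, rows use values already updated in that block;
    values owned by other blocks come from the old vector x.
    """
    n = len(b)
    new = list(x)
    for block in blocks:
        local = list(x)  # each rank starts from the old global state
        for i in block:
            s = sum(A[i][j] * local[j] for j in range(n) if j != i)
            local[i] = (b[i] - s) / A[i][i]
            new[i] = local[i]
    return new


# Small diagonally dominant test system (hypothetical).
A = [[4, -1, 0, 0],
     [-1, 4, -1, 0],
     [0, -1, 4, -1],
     [0, 0, -1, 4]]
b = [1, 2, 3, 4]
x0 = [0.0, 0.0, 0.0, 0.0]

one_rank = [[0, 1, 2, 3]]
two_ranks = [[0, 1], [2, 3]]

jacobi_same = jacobi_sweep(A, b, x0, one_rank) == jacobi_sweep(A, b, x0, two_ranks)
gs_same = (decomposed_gauss_seidel_sweep(A, b, x0, one_rank)
           == decomposed_gauss_seidel_sweep(A, b, x0, two_ranks))
print("Jacobi independent of partition:", jacobi_same)
print("Decomposed Gauss-Seidel independent of partition:", gs_same)
```

In this toy model the Jacobi iterate is identical for both partitions, while the Gauss-Seidel iterate changes as soon as the block boundaries move.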

@JessicaMeixner-NOAA
Collaborator Author

@aronroland I understand the issues with output from implicit versus explicit model runs, but this is about the mod_def output of ww3_grid. This should be reproducible regardless of whether the case is implicit or explicit, correct?

@JessicaMeixner-NOAA
Collaborator Author

I've been looking at this issue to make sure it didn't feed into later MPI reproducibility issues, and I have found that we write the variable IOBDP in w3iogrmd (https://github.com/NOAA-EMC/WW3/blob/develop/model/src/w3iogrmd.F90#L756), but this variable is never defined. I'm trying setting it to 0 to get reproducible mod_defs. Early tests seem successful, but more testing is needed. We could stop writing and reading this variable in the mod_defs, but that would require updating all mod_defs, so setting it to zero is a less impactful update for now. Given that all answers are the same even now, I'm assuming that simply setting it to zero has no impact on the model output.
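
The effect described above can be sketched in Python, as an illustration only. The assumptions are loud ones: IOBDP is modeled as an array of 32-bit integers, undefined memory is imitated with random bytes from `os.urandom`, and the helper names are hypothetical. None of this reflects the actual Fortran I/O in w3iogrmd.

```python
import hashlib
import os
import struct


def write_record(values):
    """Serialize a list of integers as little-endian 32-bit ints (assumed layout)."""
    return struct.pack(f"<{len(values)}i", *values)


def checksum(payload: bytes) -> str:
    """SHA-256 digest used to compare two serialized records."""
    return hashlib.sha256(payload).hexdigest()


def undefined_array(n):
    # Stand-in for an undefined Fortran array: whatever bytes happen to be in memory.
    return list(struct.unpack(f"<{n}i", os.urandom(4 * n)))


def zeroed_array(n):
    # The workaround discussed in the issue: define the array as all zeros before writing.
    return [0] * n


n = 16
run1_undefined = checksum(write_record(undefined_array(n)))
run2_undefined = checksum(write_record(undefined_array(n)))
run1_zeroed = checksum(write_record(zeroed_array(n)))
run2_zeroed = checksum(write_record(zeroed_array(n)))

print("undefined array reproduces:", run1_undefined == run2_undefined)
print("zeroed array reproduces:   ", run1_zeroed == run2_zeroed)
```

If the array is never read back for anything that affects the computation, the model answers stay the same while the written file still differs from run to run, which is consistent with what the issue reports.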


2 participants