create a shr_log_error method #63

jedwards4b · 2025-01-31T16:57:52Z

We decided to just print the error to both the PET log and the component log and then return up the stack with errors.


billsacks

Thank you for these changes, @jedwards4b ! I'm generally happy with this. Other than my one comment asking for a better comment, the main thing I'd like to ask for is some more clarity on the usage of rc in these routines. It may be enough to just add some documentation of this – particularly, that shr_log_error sets rc to ESMF_FAILURE – but I'll share some thoughts that made this confusing to me and might suggest some rework of this. I'd be happy to talk more about this if it would help.

Thought 1: It looks like the original purpose of the rc argument to shr_abort_abort was to provide an error code to pass to mpi_abort. Now, though, it seems to serve a mix of purposes: some callers may still rely on that original behavior, but its use in shr_log_error suggests that it should be interpreted as an ESMF return code. Maybe it's okay to have those two different uses, particularly since ESMF_SUCCESS is 0, but I think it would at least help to document the intended meaning of rc in both shr_abort_abort and shr_log_error.

Thought 2: If part of the idea here is to replace previous code that wrote error messages to ESMF PET files, and if we have an actual ESMF error code in these cases, then it could be better to use ESMF_LogSetError or ESMF_LogFoundError instead of ESMF_LogWrite, since these will translate the integer error code into a meaningful error message. However, this will only work if the rc being operated on is an ESMF error code – not, for example, if it's the return code from a netcdf call (though ESMF_LogFoundNetCDFError could be used for that purpose).

So I wonder if it would make sense to change this interface to have multiple separate rc arguments:

Optional input, esmf_rc: if present, this gives the ESMF return code that indicates the error; it will be printed via ESMF_LogSetError or ESMF_LogFoundError (I'm not clear on the difference between those two at this point)
Optional input, netcdf_rc: if present, this gives the NetCDF return code that indicates the error; it will be printed via ESMF_LogFoundNetCDFError
Optional output, rc: This will be set to ESMF_FAILURE; this is just a convenience so that you don't need to remember to have a separate line setting rc = ESMF_FAILURE before returning. (Maybe unnecessary.)
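
For discussion, here is a rough sketch of what the three-argument idea above might look like. It is not the code in this PR: the module and routine names (shr_log_error_sketch_mod, shr_log_error_sketch) are hypothetical, the write to the component log unit is left out, and it needs to be built against ESMF. As far as I understand the ESMF API, ESMF_LogSetError is a subroutine that logs the message for a given return code, while ESMF_LogFoundError is a function that tests the code and returns a logical; the sketch uses the former because the caller already knows there was an error.

```fortran
module shr_log_error_sketch_mod
  ! Hypothetical sketch of the proposed interface; not the merged implementation.
  use ESMF, only : ESMF_LogWrite, ESMF_LogSetError, ESMF_LogFoundNetCDFError, &
                   ESMF_LOGMSG_ERROR, ESMF_FAILURE
  implicit none
  private
  public :: shr_log_error_sketch

contains

  subroutine shr_log_error_sketch(string, esmf_rc, netcdf_rc, rc, line, file)
    character(len=*), intent(in)            :: string     ! caller's error message
    integer,          intent(in),  optional :: esmf_rc    ! ESMF return code, if any
    integer,          intent(in),  optional :: netcdf_rc  ! NetCDF return code, if any
    integer,          intent(out), optional :: rc         ! always set to ESMF_FAILURE
    integer,          intent(in),  optional :: line
    character(len=*), intent(in),  optional :: file

    logical :: found

    if (present(esmf_rc)) then
      ! Let ESMF translate its own code into a readable message in the PET log.
      call ESMF_LogSetError(rcToCheck=esmf_rc, msg=string, line=line, file=file)
    else if (present(netcdf_rc)) then
      ! Same idea for NetCDF status codes (useful only if ESMF was built with NetCDF).
      found = ESMF_LogFoundNetCDFError(ncerrToCheck=netcdf_rc, msg=string, &
                                       line=line, file=file)
    else
      ! No code to translate: just log the caller's message as an error.
      call ESMF_LogWrite(string, ESMF_LOGMSG_ERROR, line=line, file=file)
    end if

    ! Convenience so callers do not need a separate "rc = ESMF_FAILURE" line.
    if (present(rc)) rc = ESMF_FAILURE
  end subroutine shr_log_error_sketch

end module shr_log_error_sketch_mod
```

A caller with a failed NetCDF call would then pass netcdf_rc=status and rc=rc, and return immediately afterwards.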

I don't feel very certain about these thoughts, so definitely open to discussion (or to you just moving ahead with whatever makes sense to you).

billsacks · 2025-02-01T00:27:08Z

src/shr_log_mod.F90

@@ -117,4 +118,47 @@ subroutine shr_log_getLogUnit(unit)

  end subroutine shr_log_getLogUnit

+  subroutine shr_log_error(string, rc, line, file)
+    use esmf, only : ESMF_LOGWRITE, ESMF_LOGMSG_ERROR, ESMF_FINALIZE, ESMF_END_ABORT, ESMF_FAILURE, ESMF_SUCCESS
+    ! Consistent stopping mechanism


I think this comment should say something like: "Log the given message to all places a user may want to look for an error message: the ESMF log file, the log unit given by shr_log_unit, and standard error". Actually, there was a good comment at the top of the deleted print_error_to_logs function that could be restored as a starting point for this comment.

I have updated the comment as requested.

jedwards4b · 2025-02-06T22:49:20Z

The user is expected to have set a useful error message in string. It may be something generated from ESMF or NetCDF, or it may come directly from the calling function. Both the error message and the code are written to the logs by this routine. Setting rc to the generic ESMF_FAILURE on exit from this routine ensures that the code will abort at the top of the stack.
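
To illustrate the "log, set rc, return up the stack" pattern described here, a minimal standalone sketch follows. It is only an illustration under stated assumptions: log_error_sketch is a hypothetical stand-in for shr_log_error that writes to standard error only, SKETCH_SUCCESS and SKETCH_FAILURE stand in for ESMF_SUCCESS and ESMF_FAILURE so that it builds without ESMF, and the routine names, file name, and line number are made up.

```fortran
program return_up_stack_sketch
  use, intrinsic :: iso_fortran_env, only : error_unit
  implicit none

  ! Stand-ins so the sketch builds without ESMF; real code would use
  ! ESMF_SUCCESS and ESMF_FAILURE from the esmf module.
  integer, parameter :: SKETCH_SUCCESS = 0
  integer, parameter :: SKETCH_FAILURE = -1

  integer :: rc

  call run_component(rc)
  if (rc /= SKETCH_SUCCESS) then
    ! Only the top of the stack decides to stop the run.
    error stop 'run_component returned an error'
  end if

contains

  ! Hypothetical stand-in for shr_log_error: report the message and the
  ! incoming code, then overwrite rc with the generic failure value.
  subroutine log_error_sketch(string, rc, line, file)
    character(len=*), intent(in)    :: string
    integer,          intent(inout) :: rc
    integer,          intent(in)    :: line
    character(len=*), intent(in)    :: file

    write(error_unit, '(a,a,a,i0,a,a,a,i0)') &
         'ERROR: ', string, ' (rc = ', rc, ') at ', file, ':', line
    rc = SKETCH_FAILURE
  end subroutine log_error_sketch

  ! A low-level routine that detects a problem, logs it, and returns.
  subroutine read_forcing(rc)
    integer, intent(out) :: rc
    integer :: ios

    rc = SKETCH_SUCCESS
    ios = 2   ! pretend a lower-level call reported failure code 2
    if (ios /= 0) then
      rc = ios
      call log_error_sketch('could not read forcing data', rc, 42, 'read_forcing.F90')
      return
    end if
  end subroutine read_forcing

  ! An intermediate routine that just passes the failure upward.
  subroutine run_component(rc)
    integer, intent(out) :: rc

    call read_forcing(rc)
    if (rc /= SKETCH_SUCCESS) return
    ! ... further work would go here ...
  end subroutine run_component

end program return_up_stack_sketch
```

The point of the design is that no routine below the top level aborts on its own; each one checks rc after a call and returns, so the message has already been logged by the time the run stops.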

billsacks

Thanks for the updated comment. I thought more about the rc meaning/handling after reviewing the CDEPS PR. I had originally thought that this would replace the implementation of chkerr, but now I'm seeing that's maybe not the intent, and maybe the intent is to still generally use chkerr after ESMF calls (which will call ESMF_LogFoundError to translate rc to a string message). In that case, I'm okay with the implementation / handling of rc here, so my one remaining request is that you add documentation to shr_log_error that it sets rc to ESMF_FAILURE.

create a shr_log_error method with messages to ESMF_LOG (PET file) as well as component log

f4554f5

jedwards4b requested a review from billsacks January 31, 2025 16:57

jedwards4b self-assigned this Jan 31, 2025

billsacks requested changes Feb 6, 2025


update comment

400ace0

jedwards4b requested a review from billsacks February 6, 2025 22:49

billsacks mentioned this pull request Feb 8, 2025

shr_log_error and return up stack ESCOMP/CDEPS#321

Open

billsacks requested changes Feb 8, 2025

