Should NetCDF files created by ESMValTool follow the CF Conventions? #2643
-
Yes, they should, definitely. Do you have some examples that don't, and in which way do they deviate?
-
I agree with @katjaweigel. It has never been our policy that output from diagnostics has to be CF-compliant. The idea was simply to save the "numbers" used to create a plot.
-
Actually, thinking about it, this might need a bit of differentiation. I don't see the need for CF compliance in files that are only used further internally (like files saved by one diagnostic and read by another via the ancestors mechanism) or in auxiliary files that only document the values shown in plots. However, for diagnostics that output files specifically meant to be used later with external tools, I could imagine CF compliance being useful. So what is the current goal of the netCDF files produced? They might be wildly different.
-
Well, the documentation for the Core says "The CF Conventions as well as additional standards imposed by CMIP should be followed whenever possible." For diagnostic contributions to the tool there is no such recommendation, let alone a rule. I agree with @bettina-gier: I also do not see the point of such a requirement for diagnostics. We could think about encouraging people, but there is no need to make this a requirement.
-
iris accepts and outputs only CF-compliant data, so I'd be curious too where the issue is (as in, what actual data is not CF-compliant yet still goes through our pipeline and gets written out in a non-CF-compliant format).
-
Touchy, if you ask me: that would mean the data is not usable with iris (but maybe with cf-python, and definitely with xarray), so one needs to think a bit beforehand.
-
Hmm... this sounds like the whole world is using exclusively Python and Iris... ;-) The main purpose of the netCDF output from the diagnostics was documentation to help with reproducibility, so one could, for example, check the exact numbers shown in a plot when trying to reproduce a figure. In most cases, it was never intended that someone else could easily read the output with a specific software package (e.g. Python/Iris). Unless this should change, I do not see a good reason to put an additional constraint on diagnostic developers.
-
Hi all, sorry to barge in uninvited. FYI, the IPCC task group on data support and climate change assessments (TG-Data) recommends that IPCC authors use community standards for data formats and metadata attributes when creating IPCC content. If the diagnostics discussed in this issue are likely to be displayed in an IPCC report graphic, this would strengthen the case for CF compliance. Thanks again for your work on ESMValTool,
-
I investigated this in a bit more detail and wrote a small script to analyze the situation. For v2.5 this led to the following results:
Here is the code of the script, in case anyone wants to dive into this in more detail.

import sys
from pathlib import Path

import iris
from cfchecker.cfchecks import CFChecker, CFVersion


def find_files(dirname):
    return Path(dirname, 'work').glob('**/*.nc')


def _check_with_iris(filename):
    try:
        iris.load(filename)
    except Exception:
        return False
    return True


def check_with_iris(dirname):
    return all(_check_with_iris(f) for f in find_files(dirname))


def check_with_cfcheck(dirname):
    checker = CFChecker(version=CFVersion(), silent=True)
    for filename in find_files(dirname):
        try:
            checker.checker(str(filename))
        except Exception:
            print("Unable to check file.")
            return False
    totals = checker.get_total_counts()
    err_levels = (
        'FATAL',
        'ERROR',
        'WARN',
    )
    errs = sum(totals[k] for k in err_levels)
    if errs:
        # Only print info for files that actually have a problem
        for filename, file_result in checker.all_results.items():
            result = file_result['global']
            msg = []
            for level in err_levels:
                msg.extend(f' - {level}: {problem}'
                           for problem in result[level])
            if msg:
                msg.insert(0, ' global netCDF attributes:')
            for var, result in file_result['variables'].items():
                var_msg = []
                for level in err_levels:
                    var_msg.extend(f' - {level}: {problem}'
                                   for problem in result[level])
                if var_msg:
                    var_msg.insert(0, f" netCDF variable '{var}':")
                    msg.extend(var_msg)
            if msg:
                msg.insert(0, f' {filename}')
                print('\n'.join(msg), flush=True)
    return not errs


def check_results(dirnames, check):
    success = 0
    fail = 0
    for dirname in dirnames:
        print(f"Checking {dirname}:", flush=True)
        if check(dirname):
            print(f"{dirname}: Success", flush=True)
            success += 1
        else:
            print(f"{dirname}: Fail", flush=True)
            fail += 1
    print(f"Success: {success}, Fail: {fail}")


if __name__ == '__main__':
    if sys.argv[1] == 'iris':
        check_func = check_with_iris
        recipe_dirs = sys.argv[2:]
    else:
        check_func = check_with_cfcheck
        recipe_dirs = sys.argv[1:]
    check_results(recipe_dirs, check_func)
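As a possible follow-up to the script above, the per-file results could be condensed into one count per severity level, which makes it easier to compare recipes at a glance. The sketch below is stdlib-only and assumes the results mapping has the shape that the script reads from `checker.all_results` (a `'global'` entry and a `'variables'` entry, each keyed by level). The helper `count_problems` and the sample data are hypothetical and not part of cf-checker.

```python
from collections import Counter

ERR_LEVELS = ('FATAL', 'ERROR', 'WARN')


def count_problems(all_results, levels=ERR_LEVELS):
    """Count problems per severity level across all checked files.

    `all_results` is assumed to map filename -> {'global': {...},
    'variables': {name: {...}}}, where each inner dict maps a level
    to a list of problem messages.
    """
    counts = Counter({level: 0 for level in levels})
    for file_result in all_results.values():
        # Treat the global section and every variable section alike
        sections = [file_result['global'], *file_result['variables'].values()]
        for section in sections:
            for level in levels:
                counts[level] += len(section.get(level, []))
    return dict(counts)


# Hypothetical sample data, not real checker output
sample = {
    'a.nc': {
        'global': {'FATAL': [], 'ERROR': [], 'WARN': ['w1']},
        'variables': {
            'tas': {'FATAL': [], 'ERROR': ['e1', 'e2'], 'WARN': []},
        },
    },
}
print(count_problems(sample))
```

In the script above, such a helper could be called with `checker.all_results` after all files of a recipe have been checked, assuming that attribute keeps the structure used there.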
-
If we want to have a discussion about cf-checker codes, it's probably best to ping @RosalynHatcher, who is the developer of the cf-checker that is used a lot in various places. It wouldn't hurt to have that checker run as part of our workflow too, should the user desire 👍
-
Great to see this discussion. With my techy-scientist hat on, I think that file conventions are important and in general the effort of setting sensible metadata outweighs the pain of trying to guess what is in a file. So personally I'd like to see adherence to CF conventions in output & intermediate files encouraged in the development workflow, provided that there are tools & guidance to make it easy to get this right (e.g. interfaces that make it easier to write compliant files for all of the primary languages we support). With my "backward incompatibility discussion" hat on, this would make a good case study as it raised lots of interesting questions. E.g. if we did decide that this was desirable, then:
#everyoneloveshats 🎩 👒 🎩
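One lightweight form that such tools and guidance could take is a check on the metadata a diagnostic is about to write. The sketch below is a deliberately tiny, stdlib-only illustration and not a CF validator: it only looks for a `Conventions` global attribute and, per variable, a `units` attribute plus either `standard_name` or `long_name`. The function name `missing_metadata` and the plain-dict representation of attributes are assumptions made for this example; a real check would rely on a dedicated tool such as cf-checker.

```python
def missing_metadata(global_attrs, variables):
    """Return human-readable notes about missing basic metadata.

    `global_attrs` is a dict of global attributes; `variables` maps a
    variable name to its attribute dict. Both are hypothetical stand-ins
    for whatever a diagnostic would pass to its netCDF writer.
    """
    notes = []
    if 'Conventions' not in global_attrs:
        notes.append("global: no 'Conventions' attribute")
    for name, attrs in variables.items():
        # Simplification: CF only requires units for dimensional quantities
        if 'units' not in attrs:
            notes.append(f"{name}: no 'units' attribute")
        if not ({'standard_name', 'long_name'} & attrs.keys()):
            notes.append(f"{name}: neither 'standard_name' nor 'long_name'")
    return notes


# Hypothetical example metadata
notes = missing_metadata(
    {'Conventions': 'CF-1.8'},
    {'tas': {'standard_name': 'air_temperature', 'units': 'K'},
     'bias': {}},
)
print('\n'.join(notes))
```

Because the function returns notes instead of raising, a diagnostic could log them as warnings, which fits the idea of encouraging compliance without making it a requirement.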
-
While developing a tool for comparing the results of running an ESMValTool recipe across different versions of the tool (#2613), I noticed that many diagnostics produce NetCDF files that do not comply with the CF Conventions. Does anyone have an opinion on this? Should the NetCDF files produced by diagnostics follow the CF Conventions, or is it not important? @ESMValGroup/esmvaltool-developmentteam