Update develop-ref after MET#2975 #2977

Merged 135 commits on Sep 19, 2024

github-actions[bot]
Contributor

Adds one new output file: climatology_1.0deg/series_analysis_GFS_CLIMO_1.0DEG_CONST_CLIMO.nc
Created by @JohnHalleyGotway

Howard Soh and others added 30 commits February 2, 2024 16:58
* Per #2395, add new columns to VL1L2, VAL1L2, and VCNT line types for wind direction statistics. Work still in progress.

* Per #2395, write the new VCNT columns to the output and document the additions to the VL1L2, VAL1L2, and VCNT columns.

* Per #2395, add the definition of new statistics to Appendix G.

* Per #2395, update file version history.

* Per #2395, tweak warning message about zero wind vectors and update grid-stat and point-stat to log calls to the do_vl1l2() function.

* Per #2395, refine the weights for wind direction stats, ignoring the undefined directions.

* Update src/tools/core/stat_analysis/aggr_stat_line.cc

* Update src/tools/core/stat_analysis/parse_stat_line.cc

* Update src/tools/core/stat_analysis/aggr_stat_line.cc
… broken the logic of the update_truth.yml GHA workflow. Instead of submitting a PR to merge develop into develop-ref directly, use an intermediate update_truth_for_develop branch.
* Per #2280, update to support probability threshold strings like ==8, where 8 is the number of ensemble members, to create probability bins centered on the n/8 for n = 0 ... 8.

* Per #2280, update docs about probability threshold settings.

* Per #2280, use a loose tolerance when checking for consistent bin widths.

* Per #2280, add a new unit test for grid_stat to demonstrate processing the output from gen_ens_prod.

* Per #2280, when verifying NMEP probability forecasts, smooth the obs data first.

* Per #2280, only request STAT output for the PCT line type to match unit_grid_stat.xml and minimize the new output files.

* Per #2280, update config option docs.

* Per #2280, update config option docs.
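
The `==n` probability threshold behavior described in the #2280 notes above can be sketched roughly as follows. This is an illustrative Python sketch, not MET's `define_prob_bins()` code: the function names and the tolerance value are hypothetical, and it assumes each bin extends half a bin width on either side of its center. Whether MET clips the outermost edges is not stated in these notes.

```python
from fractions import Fraction


def ensemble_prob_bins(n_members):
    """Return (centers, edges) for bins centered on k/n_members, k = 0..n_members.

    Hypothetical helper: assumes every bin is 1/n_members wide and
    straddles its center symmetrically.
    """
    half = Fraction(1, 2 * n_members)
    centers = [Fraction(k, n_members) for k in range(n_members + 1)]
    # One lower edge per bin, plus the upper edge of the last bin.
    edges = [c - half for c in centers] + [centers[-1] + half]
    return centers, edges


def has_consistent_widths(edges, tol=1e-4):
    """Loose check that all bins have (nearly) the same width.

    The tolerance is an assumed value chosen for illustration.
    """
    widths = [b - a for a, b in zip(edges, edges[1:])]
    return all(abs(w - widths[0]) <= tol for w in widths)


centers, edges = ensemble_prob_bins(8)
print(len(centers), float(edges[0]), float(edges[-1]))
```

Exact fractions are used to build the edges so that the number of bins does not depend on floating-point rounding; the loose tolerance only matters once the edges have been converted to floating-point values.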
…ullptr

Feature 2673 sonarqube beta4 nullptr
…eturn

Feature 2673 sonarqube beta4 return
JohnHalleyGotway and others added 26 commits August 9, 2024 09:29
* Per #2924, track SL1L2 and SAL1L2 MAE scores with separate variables since they are no longer the same value. I renamed the existing 'mae' as 'smae' and added a new 'samae' variable. Renaming the existing variable lets the compiler help find all references to it throughout the code.

* Per #2924, update the User's Guide climatology details and equations.

* Per #2924, some changes to aggr_stat_line.cc and series_analysis.cc to satisfy some SonarQube code smells.
…to clarify that data specified in the fcst dictionary is read from the -single input files.
* Per #2924, Update the MPR and ORANK output line types to just write duplicate existing climo values, update the header tables and MPR/ORANK documentation tables.

* Per #2924, update get_n_orank_columns() logic

* Per #2924, update the Stat-Analysis parsing logic to parse the new MPR and ORANK climatology columns.

* Per #2924, making some changes to the vx_statistics library to store climo data... but more work to come. Committing this first set of changes that are incomplete but do compile.

* Per #2924, this big set of changes does compile but make test produces a segfault for ensemble-stat

* Per #2924, fix return value for is_keeper_obs()

* Per #2924, move fcst_info/obs_info into the VxPairBase pointer.

* Per #2924, update Ensemble-Stat to set the VxPairBase::fcst_info pointer

* Per #2924, update handling of fcst_info and obs_info pointers in Ensemble-Stat

* Per #2924, update the GSI tools to handle the new fcst climo columns.

* Per #2924, add backward compatibility logic so that when old climo column names are requested, the new ones are used.

* Per #2924, print a DEBUG(2) log message if old column names are used.

* Per #2924, switch the unit tests to reference the updated MPR column names rather than the old ones.

* Per #2924, work in progress. Not fully compiling yet

* Per #2924, another round of changes. Removing MPR:FCST_CLIMO_CDF output column. This compiles but not sure if it actually runs yet

* Per #2924, work in progress

* Per #2924, work in progress. Almost compiling again.

* Per #2924, get it compiling

* Per #2924, add back in support for SCP and CDP which are interpreted as SOCP and OCDP, resp

* Per #2924, update docs about SCP and CDP threshold types

* Per #2924, minor whitespace changes

* Per #2924, fix an uninitialized pointer bug by defining/calling SeepsClimoGrid::init_from_scratch() member function. The constructor had been calling clear() to delete pointers that weren't properly initialized to nullptr. Also, simplify some map processing logic.

* Per #2924, rename SeepsAggScore from seeps to seeps_agg for clarity and to avoid conflicts in member function implementations.

* Per #2924, fix seeps compilation error in Point-Stat

* Per #2924, fix bug in the boolean logic for handling the do_climo_cdp NetCDF output option.

* Per #2924, add missing exit statement.

* Per #2924, tweak threshold.h

* Per #2924, define one perc_thresh_info entry for each enumerated PercThreshType value

* Per #2924, simplify the logic for handling percentile threshold types and print a log message once when the old versions are still used.

* Per #2924, update the string comparison return value logic

* Per #2924, fix the perc thresh string parsing logic by calling ConcatString::startswith()

* Per #2924, switch all instances of CDP to OCDP. Gen-Ens-Prod was writing NetCDF files with OCDP in the output variable names, but Grid-Stat was requesting that the wrong variable name be read. So the unit tests failed.

* Per #2924, add more doc details

* Per #2924, update default config file to indicate when climo_mean and climo_stdev can be set separately in the fcst and obs dictionaries.

* Per #2924, update the MET tools to parse climo_mean and climo_stdev separately from the fcst and obs dictionaries.

* Per #2924, backing out new/modified columns to minimize reg test diffs

* Per #2924, one more section to be commented out later.

* Per #2924, replace several calls to strncmp() with ConcatString::startswith() to simplify the code

* Per #2924, strip out some more references to OBS_CLIMO_... in the unit tests.

* Per #2924, delete accidental file

* Per #2924 fix broken XML comments

* Per #2924, fix comments

* Per #2924, address SonarQube findings

* Per #2924, tweak a Point-Stat and Grid-Stat unit test config file to make the output more comparable to develop.

* Per #2924, fix bug in the logic of PairDataPoint and PairDataEnsemble, when looping over the 3-dim array do not return when checking the climo and fcst values. Instead we need to continue to the next loop iteration.

* Per #2924, address more SonarQube code smells to reduce the overall number in MET for this PR.

* Per #2924, correct the logic for parsing climo data from MPR lines.

* Per #2924, update MPR and ORANK line types to update/add FCST/OBS_CLIMO_MEAN/STDEV/CDF columns.

* Per #2924, cleanup grid_stat.cc source code by making calls to DataPlane::is_empty() and Grid::nxy().

* Per #2924, remove unneeded ==0

* Per #2924, working on PR2.

* Per #2924, update User's Guide with notional example of specifying climo_mean and climo_stdev separately in the fcst and obs dicts.

* Per #2924, adding a new unit test. It does NOT yet run as expected. Will debug on seneca

* Per #2924, pass the description string to the read_climo_data_plane*() function to provide better log messages

* Per #2924, more work on consistent log messages

* Per #2924, tweak the configuration to define both field, climo_mean, and climo_stdev in both the fcst and obs dictionaries

* Per #2924, tweak the unit_climatology_mixed.xml test

* Per #2924, only whitespace changes.

* Per #2924, missed swapping MET #2924 changes in 3 test files

* Per #2924, delete accidentally committed file

* Per #2924, delete accidentally committed files

* Per #2924, add support for GRIB1 time range indicator value of 123 used for the corresponding METplus Use Case. Note that there are 22 other TRI values not currently supported.
#2947)

* Adds caveat regarding longitudes appearing in DEBUG statements with a different sign to the FAQ.

* Update appendixA.rst

Missing paren
* Per #2938, define CRC_Array::add_uniq(...) member function which is now used in PB2NC

* Per #2938, replace n_elements() with n() to make the code more concise. Refine log/warning message when multiple message center times are encountered.
* Per #1371, add -input command line argument and add support for ALL for the CTC, MCTC, SL1L2, and PCT line types.

* Per #1371, rename the -input command line option as -aggregate instead

* Per #1371, work in progress

* Per #1371, just comments

* Per #1371, working on aggregating CTC counts

* Per #1371, work in progress

* Per #1371, update timing info using time stamps in the aggr file

* Per #1371, close the aggregate data file

* Per #1371, define set_event() and set_nonevent() member functions

* Per #1371, add logic to aggregate MCTC and PCT counts

* Merging changes from develop

* Per #1371, work in progress aggregating all the line statistics types. Still have several issues to address

* Per #1371, switch to using get_stat() functions

* Per #1371, work in progress. More consolidation

* Per #1371, correct expected output file name

* Per #1371, consistent regridding log messages and fix the Series-Analysis PairDataPoint object handling logic.

* Per #1371, check the return status when opening the aggregate file.

* Per #1371, fix prc/pjc typo

* Per #1371, fix the series_analysis PCT aggregation logic and add a test to unit_series_analysis.xml to demonstrate.

* Per #1371, resolve a few SonarQube findings

* Per #1371, make use of range-based for loop, as recommended by SonarQube

* Per #1371, update series-analysis to apply the valid data threshold properly using the old aggregate data and the new pair data.

* Per #1371, update series_analysis to buffer data and write it all at once instead of storing data value by value for each point.

* Per #1371, add useful error message when required aggregation variables are not present in the input -aggr file.

* Per #1371, print a Debug(2) message listing the aggregation fields being read.

* Per #1371, correct operator+= logic in met_stats.cc for SL1L2Info, VL1L2Info, and NBRCNTInfo. The metadata settings, like fthresh and othresh, were not being passed to the output.

* Per #1371, the DataPlane for the computed statistics should be initialized to a field of bad data values rather than the default value of 0. Otherwise, 0's are reported for stats at grid points with no data when they should really be reported as bad data!

* Per #1371, update logic of the compute_cntinfo() function so that CNT statistics can be derived from a single SL1L2Info object containing both scalar and scalar anomaly partial sums. These changes enable CNT:ANOM_CORR to be aggregated in the Series-Analysis tool.

* Per #1371, fix logic of climo log message.

* Per #1371, this is actually related to MET #2924. In compute_pctinfo(), use obs climo data first, if provided. If not, use fcst climo data.

* Per #1371, fix indexing bug (+i instead of +1) when checking the valid data count. Also update the logic of read_aggr_total() to return a count of 0 for bad data.

* Per #1371, add logic to aggregate the PSTD BRIERCL and BSS statistics in the do_climo_brier() function. Tested manually to confirm that it works.

* Per #1371, switch to using string literals to satisfy SonarQube

* Per #1371, update series_analysis tests in unit_climatology_1.0deg.xml to demonstrate aggregating climo-based stats.

* Per #1371, remove extra comment

* Per #1371, skip writing the PCT THRESH_i columns to the Series-Analysis output since they are not used

* Per #1371, fix the R string literals to remove \t and \n escape sequences.

* Per #1371, update the read_aggr_data_plane() suggestion strings.

* Per #1371, ignore unneeded PCT 'THRESH_' variables both when reading and writing ALL PCT columns.

* Per #1371, update the test named series_analysis_AGGR_CMD_LINE to include data for the F42 lead time that had previously been included for the same run in the develop branch. Note however that the timestamps in the output file for the develop branch (2012040900_to_2012041100) were wrong and have been corrected here (2012040900_to_2012041018) to match the actual data.

* Per #1371, update the -aggr note to warn users about slow runtimes
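
The idea behind aggregating partial sums from an earlier run with newly computed pairs, as described in the #1371 notes above, can be sketched as follows. This is a simplified Python illustration rather than the Series-Analysis code: the `PartialSums` class and helper names are hypothetical, and only a subset of the SL1L2 terms is included. It assumes partial sums are stored as means and combined as count-weighted averages, with -9999 used as the bad-data value when no pairs are available.

```python
import math
from dataclasses import dataclass

BAD_DATA = -9999.0  # bad-data value reported when there is nothing to aggregate


@dataclass
class PartialSums:
    """Hypothetical container for scalar partial sums stored as means."""
    total: int = 0
    fbar: float = BAD_DATA
    obar: float = BAD_DATA
    fobar: float = BAD_DATA
    ffbar: float = BAD_DATA
    oobar: float = BAD_DATA


def from_pairs(fcst, obs):
    """Compute partial sums from matched forecast/observation values."""
    n = len(fcst)
    if n == 0:
        return PartialSums()  # stays at bad data, not 0
    return PartialSums(
        total=n,
        fbar=sum(fcst) / n,
        obar=sum(obs) / n,
        fobar=sum(f * o for f, o in zip(fcst, obs)) / n,
        ffbar=sum(f * f for f in fcst) / n,
        oobar=sum(o * o for o in obs) / n,
    )


def aggregate(a, b):
    """Combine two sets of partial sums using count-weighted means."""
    if a.total == 0:
        return b
    if b.total == 0:
        return a
    n = a.total + b.total

    def wmean(x, y):
        return (a.total * x + b.total * y) / n

    return PartialSums(n, wmean(a.fbar, b.fbar), wmean(a.obar, b.obar),
                       wmean(a.fobar, b.fobar), wmean(a.ffbar, b.ffbar),
                       wmean(a.oobar, b.oobar))


def rmse(s):
    """Derive RMSE from the partial sums, or bad data if there are no pairs."""
    if s.total == 0:
        return BAD_DATA
    return math.sqrt(max(s.ffbar - 2.0 * s.fobar + s.oobar, 0.0))


old = from_pairs([1.0, 2.0], [1.5, 1.5])
new = from_pairs([3.0, 4.0, 5.0], [2.0, 4.0, 7.0])
print(rmse(aggregate(old, new)))
```

Because the stored terms are means weighted by their counts, aggregating the old and new partial sums gives the same result as recomputing from all pairs at once, which is what allows derived statistics to be updated without rereading the earlier input data.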
* Per #2948, updating versions of ecbuild, eckit, and atlas

* Per #2948, Adding MET_CXX_STANDARD

* Per #2948, updated wording for MET_CXX_STANDARD description

* Per #2948, updating script to work with two versions of ecbuild, eckit, and atlas

* Per #2948, without this change, there are compilation problems if the user wants to compile without python

* Per #2948, fixing logic for MET_CXX_STANDARD

* Per #2928, adding missing end bracket

* Per #2948, fixed the logic for compiling versions of ecbuild, eckit, and atlas

* Per 948, fixed syntax for setting CXXFLAGS

* Per #2948, adding new Makefile.in files and configure and changing METbaseimage 3.2 to 3.3.

* Per #2948, updating version of met base tag from 3.2 to 3.3

* Per #2948, adding --enable-all MET_CXX_STANDARD=11 job

* Update compilation_options.yml

* Per #2948, added a job10 for MET_CXX_STANDARD=14

* Per #2948, added brief documentation for the MET_CXX_STANDARD option

---------

Co-authored-by: Julie Prestopnik <jpresto@seneca.rap.ucar.edu>
Co-authored-by: John Halley Gotway <johnhg@ucar.edu>
* #1729 Allow changing to a different grid size if the raw size is 0

* Added build_grid_by_grid_string and build_grid_by_grid_string

* #1729 Calls build_grid_by_grid_string

* #1729 Added set_attr_grid at the -field option

* #1729 Set obs_type to TYPE_NCCF if the file_type is given at the config file

* #1729 Support set_attr_grid and changed Error messages to Warning

* #1729 Fixed SonarQube findings

* #1729 Initial release for unit test

* #1729 Added update_missing_values

* #1729 Deleted a shadowed local variable

* #2673 Added more is_eq

* #2673 Added get_exe_duration

* 2673 Reduced nested statements

* 2673 Fixed SonarQube findings

* 2673 Fixed SonarQube findings

* 2673 Fixed SonarQube findings

* #1729 Added a unittest plot_data_plane_set_attr_grid

* #1729 Added a unittest point2grid_cice_set_attr_grid

* #1729 Changed back the verbose level

* #1729 Corrected typo

---------

Co-authored-by: Howard Soh <hsoh@seneca.rap.ucar.edu>
* #2936 Support 1D lat/lon values

* #2936 Initial release

* #2936 Cast the data type to avoid a compile warning

* #2936 Added a unittest point2grid_gfs_1D_lat_lon

---------

Co-authored-by: Howard Soh <hsoh@seneca.rap.ucar.edu>
* #2968 Corrected set_attr_grid for point2grid_cice_set_attr_grid

* #2968 Compare the DataPlane size and the variable data size

* #2968 nx and ny are not ignored with set_attr_grid

* #2968 Compare the DataPlane size and the variable data size

---------

Co-authored-by: Howard Soh <hsoh@seneca.rap.ucar.edu>
* added single quotes around env var/val pairs in export statements in cmd only mode

* updated logic in unit() to check exec return value against expected return value; created TEST xml file to test this feature

* deleted TEST_ xml, added test with retval 1 to unit_ascii2nc
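
Checking a command's return value against an expected value, as the unit test update above describes, could be sketched like this. The sketch is in Python purely for illustration and is not the MET unit test driver: `run_and_check` and its parameters are hypothetical names.

```python
import subprocess
import sys


def run_and_check(cmd, expected_retval=0):
    """Run a command and report whether it exited with the expected status.

    A test that is supposed to fail (for example, with return value 1)
    passes only when the command actually returns that value.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode == expected_retval


# A command that deliberately exits with status 1 passes when 1 is expected.
failing_cmd = [sys.executable, "-c", "import sys; sys.exit(1)"]
print(run_and_check(failing_cmd, expected_retval=1))
```

Comparing against an expected value, rather than assuming success means zero, makes it possible to test error handling paths directly.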

---------

Co-authored-by: Natalie Babij <nbabij@seneca.rap.ucar.edu>
* Per #2887, update NumArray::vals() to return a reference to the vector rather than a pointer to doubles.

* Per #2887, switch over the whole ContingencyTable class hierarchy from storing integer counts to storing double-precision weights.

* Add ContingencyTable::is_integer() member function to check whether the table contains all integers

* Per #2887, update parse_stat_line.cc to get it to compile after changing PCT to store thresholds in a std::vector.

* Per #2887, update PCTInfo::clear() logic.

* Per #2887, update ctc_by_row() logic to create reproducible results with the develop branch.

* Per #2887, update logic of define_prob_bins() to add a final >=1.0 threshold if needed. While ==0.1 works fine, I found that ==0.05 did not because the last >=1.0 threshold was missing, likely due to floating-point precision issues. This change should fix that problem.

* Per #2887, update roc_auc() function to match the develop branch

* Per #2887, fix bug in computation of far()

* Per #2887, replaced all ==0 integer equality checks with calls to is_eq() instead and fix a couple of equations to snuff out diffs in some CTS statistics.

* Per #2887, address some of the 34 SonarQube code smells flagged for this PR. Note that the compute_ci.h/.cc changes are necessary and good since we should be computing CI's using doubles instead of integer counts.

* Per #2887, update run_sonarqube.sh to specify the target CXX standard as 11. The hope is that that will limit the findings to only those features available in the C++11 standard.

* Per #2887, update to SonarQube version 6.1.0.4477 released on 6/27/2024.

* Per #2887, updating build_met_sonarqube.sh to specify --std=c++11 since c++17 is used by default
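
The move from integer counts to double-precision weights in the #2887 notes above affects how tables are tested for zero and for integer content. The following Python sketch illustrates the idea under stated assumptions: the `Table2x2` class, the `is_eq()` stand-in, and the tolerance value are hypothetical and are not MET's `ContingencyTable` code.

```python
import math

BAD_DATA = -9999.0  # value returned when a statistic is undefined


def is_eq(a, b, tol=1e-5):
    """Tolerance-based equality; the tolerance here is an assumed value."""
    return abs(a - b) < tol


class Table2x2:
    """Hypothetical 2x2 contingency table storing floating-point weights."""

    def __init__(self):
        self.fy_oy = self.fy_on = self.fn_oy = self.fn_on = 0.0

    def add(self, fcst_event, obs_event, weight=1.0):
        """Add one forecast/observation pair with the given weight."""
        if fcst_event and obs_event:
            self.fy_oy += weight
        elif fcst_event:
            self.fy_on += weight
        elif obs_event:
            self.fn_oy += weight
        else:
            self.fn_on += weight

    def is_integer(self):
        """True when every cell holds a whole number, within tolerance."""
        cells = (self.fy_oy, self.fy_on, self.fn_oy, self.fn_on)
        return all(is_eq(c, round(c)) for c in cells)

    def far(self):
        """False alarm ratio: fy_on / (fy_oy + fy_on)."""
        denom = self.fy_oy + self.fy_on
        if is_eq(denom, 0.0):  # avoid an exact ==0 test on floating-point sums
            return BAD_DATA
        return self.fy_on / denom


table = Table2x2()
table.add(True, True, weight=0.5)
table.add(True, False, weight=1.5)
print(table.is_integer(), table.far())
```

With weighted cells, an exact comparison against zero can miss sums that are zero only up to rounding error, which is why a tolerance-based check is used before dividing.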
…efine_prob_bins() utility function so that ==n probability thresholds result in the correct number of probability thresholds. We were adding an unnecessary 10-th bin (from 1.07143 to 1.0) for the ==7 probability threshold type.
The docs directory was moved up to the top-level of the repository but this workflow was not updated. Changing the ignore setting so that doc-only updates do not trigger the full METplus testing workflow.
* testing AREA and AUTO changes

* Keywords B thru L

* thru R

* adding quotes back in for lower case items

* S thru the end of the document

* Removing double quotes around 3 key words

* Per #2023, adding a label name for the Attributes section

* Per #2023, adding an internal link for the MODE tool Attributes section.

* Adding quotes around Valid basins entries

* more double quote updates

* more complex updates with Julie P's help

* removing double quotes

* fixing typos

* removing double quotes

* unbolding SURFACE and putting it in double quotes

* fixing grammar

* grammar

* fixing typo

* fixing typo

---------

Co-authored-by: Julie Prestopnik <jpresto@ucar.edu>
* Per #2924, remove GenEnsProd config file comment about parsing desc separately from each obs.field entry because the obs dictionary does not exist in the GenEnsProd config file.

* Per #2924, update list of needed config entry names

* Per #2924, remove const from the parent() member function so that we can perform lookups for the parent.

* Per #2924, update the signature for and logic of the utility functions that retrieve the climatology data. Rather than requiring all the climo_mean and climo_stdev dictionary entries to be defined at the same config file context level, parse each one individually. This enables the METplus wrappers to only partially override this dictionary and still rely on the default values provided in MET's default configuration files.

* Per #2924, update all calls to the climatology utility functions based on the new function signature. Also update the tools to check the number of climo fields separately for the forecast and observation climos.

* Per #2924, update the parsing logic for the climatology regrid dictionary. Use config.fcst.climo_mean.regrid first, config.fcst.regrid second, and config.climo_mean.regrid third. Notably, DO NOT use config.regrid. This is definitely the problem with having regrid specified at multiple config file context levels. It makes the logic for which to use when very messy.

* Per #2924, forgot to add an else to print an error

* Per #2924, remove extraneous semicolon

* Per #2924, move 'fcst.regrid' into 'fcst.climo_mean.regrid'. Defining the climatology regridding logic inside fcst is problematic because it applies to the forecast data as well and you end up with the verification grid being undefined. So the climo regridding logic must be defined in 'climo_mean.regrid' either within the 'fcst' and 'obs' dictionaries or at the top-level config context.

* Per #2924, based on PR feedback from @georgemccabe, add the Upper_Left, Upper_Right, Lower_Right, and Lower_Left interpolation methods to the list of valid options for regridding, as already indicated in the MET User's Guide.

* Per #2924, update the logic of parse_conf_regrid() to (hopefully) make it work the way @georgemccabe expects it to. It now uses pointers to both the primary and default dictionaries and parses each entry individually.

* Per #2924, need to check for non-null pointer before using it

* Per #2924, revise the climo_name dictionary lookup logic when parsing the regrid dictionary.

* Per #2924, update logic for handling RegridInfo

* Per #2924, remove the default regridding information from the 'Searching' log message to avoid confusion.
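
Parsing each dictionary entry individually from a primary dictionary with a fallback to a default dictionary, as the `parse_conf_regrid()` notes above describe, can be sketched as follows. This Python sketch is illustrative only: plain dicts stand in for MET's configuration dictionaries, the function names are hypothetical, and the entry names and values are examples.

```python
def lookup_entry(name, primary, default):
    """Look up one entry, preferring the primary dictionary.

    Either dictionary may be None, so check before using it.
    """
    if primary is not None and name in primary:
        return primary[name]
    if default is not None and name in default:
        return default[name]
    raise KeyError(f"'{name}' not found in primary or default dictionary")


def parse_regrid(primary, default,
                 names=("to_grid", "method", "width", "vld_thresh")):
    """Resolve every entry separately so a partial override still works."""
    return {name: lookup_entry(name, primary, default) for name in names}


# Example values only: the primary dictionary overrides two entries and
# the remaining entries fall back to the defaults.
defaults = {"to_grid": "NONE", "method": "NEAREST", "width": 1, "vld_thresh": 0.5}
override = {"method": "BILIN", "width": 2}
print(parse_regrid(override, defaults))
```

Resolving entries one at a time is what lets a caller override only part of the dictionary and still rely on the defaults for everything else.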

---------

Co-authored-by: MET Tools Test Account <met_test@seneca.rap.ucar.edu>
* Per #2924, remove GenEnsProd config file comment about parsing desc separately from each obs.field entry because the obs dictionary does not exist in the GenEnsProd config file.

* Per #2924, update list of needed config entry names

* Per #2924, remove const from the parent() member function so that we can perform lookups for the parent.

* Per #2924, update the signature for and logic of the utility functions that retrieve the climatology data. Rather than requiring all the climo_mean and climo_stdev dictionary entries to be defined at the same config file context level, parse each one individually. This enables the METplus wrappers to only partially override this dictionary and still rely on the default values provided in MET's default configuration files.

* Per #2924, update all calls to the climatology utility functions based on the new function signature. Also update the tools to check the number of climo fields separately for the forecast and observation climos.

* Per #2924, update the parsing logic for the climatology regrid dictionary. Use config.fcst.climo_mean.regrid first, config.fcst.regrid second, and config.climo_mean.regrid third. Notably, DO NOT use config.regrid. This is definitely the problem with having regrid specified at multiple config file context levels. It makes the logic for which to use when very messy.

* Per #2924, forgot to add an else to print an error

* Per #2924, remove extraneous semicolon

* Per #2924, move 'fcst.regrid' into 'fcst.climo_mean.regrid'. Defining the climatology regridding logic inside fcst is problematic because it applies to the forecast data as well and you end up with the verification grid being undefined. So the climo regridding logic must be defined in 'climo_mean.regrid' either within the 'fcst' and 'obs' dictionaries or at the top-level config context.

* Per #2924, based on PR feedback from @georgemccabe, add the Upper_Left, Upper_Right, Lower_Right, and Lower_Left interpolation methods to the list of valid options for regridding, as already indicated in the MET User's Guide.

* Per #2924, update the logic of parse_conf_regrid() to (hopefully) make it work the way @georgemccabe expects it to. It now uses pointers to both the primary and default dictionaries and parses each entry individually.

* Per #2924, need to check for non-null pointer before using it

* Per #2924, revise the climo_name dictionary lookup logic when parsing the regrid dictionary.

* Per #2924, update logic for handling RegridInfo

* Per #2924, remove the default regridding information from the 'Searching' log message to avoid confusion.

* Per #2924, escape sequences, like \n, cannot be used inside R-string literals.

* Per #2924, update the logic of check_climo_n_vx()

* Per #2924, revise logic in read_climo_data_plane_array(). Check the number of climo fields provided. If there's 0, just return since no data has been requested. If there's 1, use it regardless of the number of input fields. If there's more than 1, just use the requested i_vx index value.

* Per #2924, update Series-Analysis to set both i_fcst and i_obs when looping over the series entries.

* Per #2924, no real change. Just whitespace.

* Unrelated to #2924, superficial changes to formatting of method_name strings for consistency.

* Per #2924, add a new series_analysis test that ERRORS OUT prior to this PR but works after the changes in this PR.
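
The selection rule described above for `read_climo_data_plane_array()` (no climo fields, exactly one, or more than one) can be written out as a small sketch. This is a Python illustration rather than the MET implementation: `select_climo_index` is a hypothetical name, and the range check in the final branch is an added assumption.

```python
def select_climo_index(n_climo, i_vx):
    """Pick which climatology field to read for verification task i_vx.

    Returns None when no climatology data was requested.
    """
    if n_climo == 0:
        return None  # nothing requested, nothing to read
    if n_climo == 1:
        return 0     # a single climo field applies to every input field
    # More than one climo field: use the requested index.
    # The bounds check is an assumption added for this sketch.
    if not 0 <= i_vx < n_climo:
        raise IndexError(f"climo index {i_vx} out of range for {n_climo} fields")
    return i_vx


print(select_climo_index(1, 3))
```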

---------

Co-authored-by: MET Tools Test Account <met_test@seneca.rap.ucar.edu>
@JohnHalleyGotway JohnHalleyGotway added this to the MET-12.0.0 milestone Sep 19, 2024
@JohnHalleyGotway JohnHalleyGotway merged commit 33e81f0 into develop-ref Sep 19, 2024
1 check passed
@JohnHalleyGotway JohnHalleyGotway deleted the update_develop_53a2b5a7 branch September 19, 2024 20:27
Projects
Status: 🏁 Done
Development

Successfully merging this pull request may close these issues.

Enhance MET to support separate climatology datasets for both the forecast and observation inputs
9 participants