Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature #2609 tc_diag_missing_data #2680

Merged
merged 42 commits into from
Sep 13, 2023

Conversation

JohnHalleyGotway
Copy link
Collaborator

@JohnHalleyGotway JohnHalleyGotway commented Sep 11, 2023

Expected Differences

Please note that these changes constitute only one part of the changes needed in TC-Diag for MET version 12.0.0. This PR includes the following changes:

  • Update default TC-Diag config file to explicitly list expected lead times of 0, 6, 12, ..., 126 and add new one_time_per_file_flag config option.

  • Update vx_log library to easily turn on/off warning messages, and use that in TC-Diag.

  • Add Met2dDataFile::data_planes(...) member function to retrieve a vector of requested fields from a single input file, rather than reading one field at a time.

  • Minor edits in derived 2D data file type libraries to clean up whitespace.

  • Add get_series_entries(...) utility function to retrieve a list of fields rather than one at a time.

  • Update control flow logic in TC-Diag tool.

  • Do these changes introduce new tools, command line arguments, or configuration file options? [Yes]

    If yes, please describe:

    Adds one_time_per_file_flag configuration option for TC-Diag.

  • Do these changes modify the structure of existing or add new output data types (e.g. statistic line types or NetCDF variables)? [No]

    If yes, please describe:

Pull Request Testing

  • Describe testing already performed for these changes:

    Added one_time_per_file_flag = TRUE; to the existing unit test. The existing unit test should run fine but process additional time steps and run faster.

For the develop branch (without new one_time_per_file_flag option and with only 5 valid lead times of data):

musial6:test_unit johnhg$ perl/unit.pl xml/unit_tc_diag.xml 
TEST: tc_diag_IAN  - pass -  80.224 sec

For the feature_2609_tc_diag_missing_data branch (with one_time_per_file_flag = TRUE; and with all expected 0 - 126 hours of lead time):

musial6:test_unit johnhg$ perl/unit.pl xml/unit_tc_diag.xml 
TEST: tc_diag_IAN  - pass -  56.030 sec

The updated output file contains 22 times instead of only 5:

time = 22 ;

There's only valid data for the first 5 time steps, but the rest are present and contain missing data. Previously TC-Diag would error out due to missing data. But it no longer does.

  • Recommend testing for the reviewer(s) to perform, including the location of input datasets, and any additional instructions:

@jvigh: Please review the config file changes and documentation updates.
@georgemccabe: Please review the code changes and be aware of the new one_time_per_file_flag config option. Please see dtcenter/METplus#2340.

  • Do these changes include sufficient documentation updates, ensuring that no errors or warnings exist in the build of the documentation? [Yes]

  • Do these changes include sufficient testing updates? [Yes]
    Existing unit test still works fine.

  • Will this PR result in changes to the test suite? [Yes]

    If yes, describe the new output and/or changes to the existing output:

    Modifies the output of the unit_tc_diag.xml as described above.

  • Please complete this pull request review by [Tues 9/12/23].

Pull Request Checklist

See the METplus Workflow for details.

  • Review the source issue metadata (required labels, projects, and milestone).
  • Complete the PR definition above.
  • Ensure the PR title matches the feature or bugfix branch name.
  • Define the PR metadata, as permissions allow.
    Select: Reviewer(s) and Development issue
    Select: Milestone as the version that will include these changes
    Select: Coordinated METplus-X.Y Support project for bugfix releases or MET-X.Y.Z Development project for official releases
  • After submitting the PR, select the ⚙️ icon in the Development section of the right hand sidebar. Search for the issue that this PR will close and select it, if it is not already selected.
  • After the PR is approved, merge your changes. If permissions do not allow this, request that the reviewer do the merge.
  • Close the linked issue and delete your feature or bugfix branch from GitHub.

jprestop and others added 30 commits June 27, 2023 12:01
* Adding free disk space script

* Added lines for free disk space

* Take 2 Adding to each job instead of job_controls

* Added fake space as to include ci-run-unit in this commit message

* Added fake space as to include ci-run-unit in this commit message and to resolve syntax error

* Added fake space as to include ci-run-unit in this commit message and to resolve syntax error

* Copied and pasted from METplus to resolve syntax error ci-run-unit

* Added extra space to trigger tests ci-run-unit

* Added extra space to trigger tests and attempted to fix all syntax error ci-run-unit

* Removing free disk space from check for differences

* Added bogus space ci-run-unit

* Removing unnecessary spaces
…center/MET into bugfix_2578_rotated_latlon_main_v11.1
…n_v11.1

Bugfix 2578 rotated latlon main v11.1
Update UPP website link in 2 places from DTC to EPIC owned website.
Update mode.rst to modify multivar_intensity to multivar_intensity_flag
#2603
* Modified parsing to handle changed webpage format, added command line option to specify the existing stations file to use, general cleanup

* Modified documentation to show the new command line option '-e <file>' for the script build_ndbc_stations_from_web.py

* Stations file created on July 10, 2023.  The web content seems to be changing on a daily basis (drifting bouys most likely)
Co-authored-by: John Halley Gotway <johnhg@ucar.edu>
Co-authored-by: Seth Linden <linden@seneca.rap.ucar.edu>
Co-authored-by: jprestop <jpresto@ucar.edu>
Co-authored-by: Daniel Adriaansen <dadriaan@ucar.edu>
Co-authored-by: John and Cindy <halleygotway@Halleys-Mac-mini.local>
Co-authored-by: Howard Soh <hsoh@seneca.rap.ucar.edu>
Co-authored-by: George McCabe <23407799+georgemccabe@users.noreply.github.com>
Co-authored-by: hsoh-u <hsoh@ucar.edu>
Co-authored-by: MET Tools Test Account <met_test@seneca.rap.ucar.edu>
Co-authored-by: Seth Linden <linden@ucar.edu>
Co-authored-by: lisagoodrich <33230218+lisagoodrich@users.noreply.github.com>
Co-authored-by: davidalbo <dave@ucar.edu>
Co-authored-by: Lisa Goodrich <lisag@ucar.edu>
Co-authored-by: metplus-bot <97135045+metplus-bot@users.noreply.github.com>
Co-authored-by: j-opatz <59586397+j-opatz@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan Vigh <jvigh@ucar.edu>
Fix Python environment issue (#2407)
fix definitions of G172 and G220 based on comments in NOAA-EMC/NCEPLIBS-w3emc#157. (#2406)
fix #2380 develop override (#2382)
fix #2408 develop empty config (#2410)
fix #2390 develop compile zlib (#2404)
fix #2412 develop climo (#2422)
fix #2437 develop convert (#2439)
fix for develop, for #2437, forgot one reference to the search_parent for a dictionary lookup.
fix #2452 develop airnow (#2454)
fix #2449 develop pdf (#2464)
fix #2402 develop sonarqube (#2468)
fix #2426 develop buoy (#2475)
fix 2518 dtypes appf docs (#2519)
fix 2531 compilation errors (#2533)
fix #2531 compilation_errors_configure (#2535)
Co-authored-by: John Halley Gotway <johnhg@ucar.edu>
Co-authored-by: Seth Linden <linden@seneca.rap.ucar.edu>
Co-authored-by: jprestop <jpresto@ucar.edu>
Co-authored-by: Daniel Adriaansen <dadriaan@ucar.edu>
Co-authored-by: John and Cindy <halleygotway@Halleys-Mac-mini.local>
Co-authored-by: Howard Soh <hsoh@seneca.rap.ucar.edu>
Co-authored-by: George McCabe <23407799+georgemccabe@users.noreply.github.com>
Co-authored-by: hsoh-u <hsoh@ucar.edu>
Co-authored-by: MET Tools Test Account <met_test@seneca.rap.ucar.edu>
Co-authored-by: Seth Linden <linden@ucar.edu>
Co-authored-by: lisagoodrich <33230218+lisagoodrich@users.noreply.github.com>
Co-authored-by: davidalbo <dave@ucar.edu>
Co-authored-by: Lisa Goodrich <lisag@ucar.edu>
Co-authored-by: metplus-bot <97135045+metplus-bot@users.noreply.github.com>
Co-authored-by: j-opatz <59586397+j-opatz@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan Vigh <jvigh@ucar.edu>
Co-authored-by: Tracy Hertneky <39317287+hertneky@users.noreply.github.com>
Fix Python environment issue (#2407)
fix definitions of G172 and G220 based on comments in NOAA-EMC/NCEPLIBS-w3emc#157. (#2406)
fix #2380 develop override (#2382)
fix #2408 develop empty config (#2410)
fix #2390 develop compile zlib (#2404)
fix #2412 develop climo (#2422)
fix #2437 develop convert (#2439)
fix for develop, for #2437, forgot one reference to the search_parent for a dictionary lookup.
fix #2452 develop airnow (#2454)
fix #2449 develop pdf (#2464)
fix #2402 develop sonarqube (#2468)
fix #2426 develop buoy (#2475)
fix 2518 dtypes appf docs (#2519)
fix 2531 compilation errors (#2533)
fix #2531 compilation_errors_configure (#2535)
* Removing disable-new-tags rpath option for GNU compilers 12.3 and above as this option produces an unknown option error

* Adding configuration file for narya

* Resolving sed error 'invalid command code m'

* Modified code to determine OS and take action to build NetCDF-CXX and MET accordingly

* Added documentation about the MAKE_ARGS option

* Made suggested changes by George

* Adding back inadvertently removed LIB_Z to LDFLAGS in HDF5 compilation

* Remove unnecessary indentation
Co-authored-by: John Halley Gotway <johnhg@ucar.edu>
Co-authored-by: Seth Linden <linden@seneca.rap.ucar.edu>
Co-authored-by: jprestop <jpresto@ucar.edu>
Co-authored-by: Daniel Adriaansen <dadriaan@ucar.edu>
Co-authored-by: John and Cindy <halleygotway@Halleys-Mac-mini.local>
Co-authored-by: Howard Soh <hsoh@seneca.rap.ucar.edu>
Co-authored-by: George McCabe <23407799+georgemccabe@users.noreply.github.com>
Co-authored-by: hsoh-u <hsoh@ucar.edu>
Co-authored-by: MET Tools Test Account <met_test@seneca.rap.ucar.edu>
Co-authored-by: Seth Linden <linden@ucar.edu>
Co-authored-by: lisagoodrich <33230218+lisagoodrich@users.noreply.github.com>
Co-authored-by: davidalbo <dave@ucar.edu>
Co-authored-by: Lisa Goodrich <lisag@ucar.edu>
Co-authored-by: metplus-bot <97135045+metplus-bot@users.noreply.github.com>
Co-authored-by: j-opatz <59586397+j-opatz@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan Vigh <jvigh@ucar.edu>
Fix Python environment issue (#2407)
fix definitions of G172 and G220 based on comments in NOAA-EMC/NCEPLIBS-w3emc#157. (#2406)
fix #2380 develop override (#2382)
fix #2408 develop empty config (#2410)
fix #2390 develop compile zlib (#2404)
fix #2412 develop climo (#2422)
fix #2437 develop convert (#2439)
fix for develop, for #2437, forgot one reference to the search_parent for a dictionary lookup.
fix #2452 develop airnow (#2454)
fix #2449 develop pdf (#2464)
fix #2402 develop sonarqube (#2468)
fix #2426 develop buoy (#2475)
fix 2518 dtypes appf docs (#2519)
fix 2531 compilation errors (#2533)
fix #2531 compilation_errors_configure (#2535)
…args to control searching all files lists, erroring out, and printing warnings. The default settings maintain existing functionality. Update tc_diag code to suppress errors and warnings due to missing data. Still do need to suppress additional warnings from the vx_data libraries.
…, and use it in the vx_series_data library to control to limit the missing data warning messages from tc-diag.
…of the logger and reset it back to that state when done reading data. Get rid of the search_all_files flag since it'll always be true, even in tc_diag's use of it.
…l need to actually implement logic to handle this in tc_diag.
@JohnHalleyGotway
Copy link
Collaborator Author

JohnHalleyGotway commented Sep 12, 2023

FYI, as expected, differences were flagged in the output of this GHA testing workflow run. After downloading the logs artifact and diff artifact, I see only one difference flagged in comp_dir.log:

COMPARING tc_diag/tc_diag_AL092022_GFSO_2022092400_cyl_grid_parent.nc
file1: /data/output/met_test_truth/tc_diag/tc_diag_AL092022_GFSO_2022092400_cyl_grid_parent.nc
file2: /data/output/met_test_output/tc_diag/tc_diag_AL092022_GFSO_2022092400_cyl_grid_parent.nc
ERROR: NetCDF headers differ:
3,4c3,4
<       track_line = 5 ;
<       time = 5 ;
---
>       track_line = 49 ;
>       time = 22 ;

This TC-Diag NetCDF output file difference is expected.

Feel free to inspect the old/new files are you'd like.

Copy link
Collaborator

@georgemccabe georgemccabe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These changes look good. I see there is a new variable called one_time_per_file_flag which will need to be supported via METplus wrappers and that this is noted in the METplus GitHub issue. I approve.


The TC-Diag tool should be configured to filter the input track data (**-deck**) down to the subset of tracks that correspond to the gridded data files provided (**-data**). The filtered tracks should contain data for only *one initialization time* but may contain tracks for multiple models.

The configuration options listed above are used to filter the input track data down to those that should be processed in the current run. These options are common to multiple MET tools and are described in :numref:`config_options_tc`.

.. code-block:: none
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JohnHalleyGotway There's something weird here. There's an empty code block, but then the lead times that follow are split across multiple lines in a weird way. Should these actually be inside the code block?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, I think I understand -- to be inside the code block, the lead = [ ]; stuff needs to be indented.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jvigh, great catch! Thanks for your review. I pushed a fix for that indent problem.


one_time_per_file_flag = TRUE;

The **one_time_per_file_flag** entry controls the logic for reading data from input files. Users should set this to true or false to describe how data is stored in the gridded input files specified with the **-data** command line option. Set this to true if each input file contains all of the data for a single initialization time and for a single valid time. If the input files contain data for multiple initialization or valid times, or if data for one valid time is spread across multiple files, set this to false.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The -data snippet is spread across two lines in the rendered doc. Is there any way to protect this term from spreading across the line break? I couldn't find out how after a few minutes of searching . . .

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would seem that RTD treats the leading - and the following data as part of separate words when wrapping text. I'm guessing it treats the same as hyphenated words like "check-in" where the - character is a good place to wrap. You'll notice that as you adjust the browser window width, the wrapping changes dynamically.

I reworded the sentence to address this by moving the the -data away from the end of the line when the browser window is at max width. I could switch the formatting from bold -data to this in-line code block -data instead. That'd probably fix it but wouldn't be consistent with the formatting in the rest of the User's Guide.

But we could choose to modify the formatting for all -something command line options to always use that in-line code block formatting... just not in this PR.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, playing around with the formatting of my comment above by modifying the browser window width, formatting as -data does NOT fix it. It still wraps at the - as it would for hyphenated words. Probably not something worth worrying about any more than this.

Copy link
Contributor

@jvigh jvigh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JohnHalleyGotway Overall, things look good. I left a couple single-line comments regarding minor formatting issues. Once these are approved, I approve this PR.

@JohnHalleyGotway JohnHalleyGotway merged commit 7063744 into develop Sep 13, 2023
6 checks passed
@JohnHalleyGotway JohnHalleyGotway deleted the feature_2609_tc_diag_missing_data branch September 13, 2023 16:46
@JohnHalleyGotway JohnHalleyGotway mentioned this pull request Sep 15, 2023
15 tasks
JohnHalleyGotway pushed a commit that referenced this pull request Sep 15, 2023
Co-authored-by: jprestop <jpresto@ucar.edu>
Co-authored-by: Seth Linden <linden@seneca.rap.ucar.edu>
Co-authored-by: John Halley Gotway <johnhg@ucar.edu>
Co-authored-by: Daniel Adriaansen <dadriaan@ucar.edu>
Co-authored-by: John and Cindy <halleygotway@Halleys-Mac-mini.local>
Co-authored-by: rgbullock <bullock@ucar.edu>
Co-authored-by: Randy Bullock <bullock@seneca.rap.ucar.edu>
Co-authored-by: Dave Albo <dave@seneca.rap.ucar.edu>
Co-authored-by: Howard Soh <hsoh@seneca.rap.ucar.edu>
Co-authored-by: George McCabe <23407799+georgemccabe@users.noreply.github.com>
Co-authored-by: hsoh-u <hsoh@ucar.edu>
Co-authored-by: MET Tools Test Account <met_test@seneca.rap.ucar.edu>
Co-authored-by: Seth Linden <linden@ucar.edu>
Co-authored-by: lisagoodrich <33230218+lisagoodrich@users.noreply.github.com>
Co-authored-by: davidalbo <dave@ucar.edu>
Co-authored-by: Lisa Goodrich <lisag@ucar.edu>
Co-authored-by: metplus-bot <97135045+metplus-bot@users.noreply.github.com>
Co-authored-by: j-opatz <59586397+j-opatz@users.noreply.github.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan Vigh <jvigh@ucar.edu>
Co-authored-by: Tracy Hertneky <39317287+hertneky@users.noreply.github.com>
Co-authored-by: David Albo <dave@ucar.edu>
Co-authored-by: Dan Adriaansen <dadriaan@ucar.edu>
fix 2518 dtypes appf docs (#2519)
fix 2531 compilation errors (#2533)
fix #2531 compilation_errors_configure (#2535)
fix #2514 develop clang (#2563)
fix #2575 develop python_convert (#2576)
Fix Python environment issue (#2407)
fix definitions of G172 and G220 based on comments in NOAA-EMC/NCEPLIBS-w3emc#157. (#2406)
fix #2380 develop override (#2382)
fix #2408 develop empty config (#2410)
fix #2390 develop compile zlib (#2404)
fix #2412 develop climo (#2422)
fix #2437 develop convert (#2439)
fix for develop, for #2437, forgot one reference to the search_parent for a dictionary lookup.
fix #2452 develop airnow (#2454)
fix #2449 develop pdf (#2464)
fix #2402 develop sonarqube (#2468)
fix #2426 develop buoy (#2475)
fix 2596 main v11.1 rpath compilation (#2614)
fix #2514 main_v11.1 clang (#2628)
fix #2644 develop percentile (#2647)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: 🏁 Done
Development

Successfully merging this pull request may close these issues.

Refine TC-Diag logic for handling missing data
7 participants