Enhance Multivariate MODE to generate object statistics for each input field requested by the user. #1283

dwfncar · 2020-03-24T15:46:24Z

Describe the Enhancement

I am repurposing this existing issue to better define the multi-variate MODE enhancements that are needed.

While the existing MODEMultivarConfig_default file exists in the repository, it is NOT actually read by Multi-Variate MODE! In all cases, MODEConfig_default is read instead. Need to understand why, and either update MODE to read it in the right circumstances or just remove it from the repo.
If MODEMultivarConfig_default is retained, modify the nc_pairs_flag setting in it to specify it as a dictionary, just like it is specified in MODEConfig_default. This is needed to make the logic of the MODE wrapper clear and consistent.
Make large updates to the multi-variate logic, as described below.

As of MET version 10.1.0, multi-variate MODE generates output NetCDF files named f_super.nc and o_super.nc. However, it does not actually write out any object statistics.
Recommend that it continue to write out the super object information, but write out both fields to the standard MODE NetCDF output file instead (i.e. *_obj.nc).
Add a new config option that is an array of booleans with the same length as the number of forecast and observation fields: multivar_intensity_flag = [ TRUE, TRUE, FALSE ]; (default will be FALSE for all)
After generating the super objects, loop through the multivar_intensity_flag booleans:
- If all multivar_intensity_flag booleans are false, run MODE once on the binary 0/1 super objects to generate geometric object statistics and write them to the output "*_obj.txt" file.
- Otherwise, for each multivar_intensity_flag boolean flag that is true, run MODE once substituting in the raw intensity values for the corresponding fcst/obs fields. The object definition has already been performed, but we're just regenerating the statistics using the intensities for those fields. Write the output to the "*_obj.txt" output file. For any FALSE booleans, the intensity statistics will contain missing values.
- So the "*_obj.txt" may now contain output for multiple fields with different intensity values and potentially total interest values. Need to differentiate between them in the output in the FCST_VAR and OBS_VAR columns. Recommend including the forecast and observation field names in those output columns.
- Be sure to check the nc_pairs_flag option and write intensity values to the NetCDF file, as requested by the user.
- Add a new config option that allows the user to change the forecast and observation field names. Currently the super object output name is ‘super’, but users may be running multivariate MODE on multiple complex events (dryline/blizzard), and it will be beneficial to have a way to differentiate between the output
  - Add multivar_name=""; to change the name of the multivariate mode output as listed in the ascii file output, for example. (default will be "super")
  - For example, if we define multivar_name = "blizzard"; and we request intensity output for visibility, our ascii file(s) should append this to the FCST_VAR and OBS_VAR names e.g. blizzard_vis/vis_blizzard (preferred order?)
    - If intensity isn’t requested for any field, the name would be blizzard for the geometric object statistics?
Other considerations
- Determine whether there should be separate MODE output files for each field where intensity statistics are requested, or all output contained in a single file
  - Do what makes sense, when thinking about parallelization
- No need to keep mode runs from the individual fields - cuts down on processing time
- Keep this info in memory for continuing the mvmode process

Time Estimate

Estimate the amount of work required here.
Issues should represent approximately 1 to 3 days of work.

Sub-Issues

Consider breaking the enhancement down into sub-issues.

Add a checkbox for each sub-issue here.

Relevant Deadlines

List relevant project deadlines here or state NONE.

Funding Source

2700042 WPC

Define the Metadata

Assignee

Select engineer(s) or no engineer required
Select scientist(s) or no scientist required

Labels

Select component(s)
Select priority
Select requestor(s)

Projects and Milestone

Select Repository and/or Organization level Project(s) or add alert: NEED PROJECT ASSIGNMENT label
Select Milestone as the next official version or Future Versions

Define Related Issue(s)

Consider the impact to the other METplus components.

METplus, MET, METdatadb, METviewer, METexpress, METcalcpy, METplotpy
Related to MET Issues:
Add support for Multi-Variate MODE #1184
MODE-Multivar boolean logic from config file #1282
MODE-Multivar processing chain #1284

Enhancement Checklist

See the METplus Workflow for details.

The text was updated successfully, but these errors were encountered:

davidalbo · 2023-05-03T18:29:19Z

A visual would be good @hertneky.

davidalbo · 2023-05-04T22:41:07Z

@hertneky @JohnHalleyGotway .. reality check. Is this correct?

For an individual input:
Run the mode convolution algorithm at the convolution threshold
Run mode convolution again at the merge threshold (when merge flag = BOTH or THRESH), independent of the first run.

So you could for example end up with 2 objects from the first run, but 1 from the second run (that contains both of the first run objects within it.

hertneky · 2023-05-05T20:58:14Z

The first part sounds good to me.

I am having trouble with the last sentence. If 2 super objects are identified in the 1st run, then you would still have 2 super objects in the 2nd run, but they may or may not be merged (identified as a cluster). Am I understanding that correctly?

davidalbo · 2023-05-05T21:36:01Z

Hi @hertneky, This is before the superobject step, it's the individual input data.. The inner ones are the 'simple' objects (is that the right terminology?) that come from using the conv_thresh, the outer one is the simple object that comes from using a more generous threshold, the merge_thresh. For example:

{ name="WIND"; level="Z10"; conv_radius = 5; conv_thresh = >=8.9408; merge_thresh = >=6.7056; merge_flag = THRESH; },

Two objects when >=8.9408, one object when >= 6.7056.

davidalbo · 2023-05-05T21:52:26Z

As opposed to this where the objects get bigger but do not actually mege.

hertneky · 2023-05-08T16:06:09Z

@davidalbo Yes these images are correct for merging. I just want to reiterate that the merge_thresh will only be used for merging. They don't make the super objects bigger in the end. The super objects are the same after the 2nd pass, but may or may not be considered part of a cluster. After the determining the super object clusters using the super_merge_thresh_object, that larger super_merge_thresh_object is not saved in any output.

davidalbo · 2023-05-10T18:27:11Z

@hertneky understood. I've finally gotten the 'super_merge_thresh_object's to get created and I put in some code to output them as netCDF and look at them, and they seem reasonable. I did run into a problem with the first input in your original example. For the forecast the field is PrecipFlag, and you did not specify a merge_thresh or merge_flag. The code uses a default in that case, which is merge_flag=THRESH and merge_thresh = ">=1.25". This was causing problems because then my code tried to make bigger objects using these settings, which makes no sense. I added merge_flag=NONE explicitly for those fields, and that fixed it. I'd have thought the default would be merge_flag=NONE, but... does not seem to be the case.

hertneky · 2023-05-11T16:05:02Z

@davidalbo So when merge_flag is not set, it defaults to THRESH with a value of >=1.25. Is that just for the multivariate case or does that work for single MODE as well? I would have thought the default would be NONE if not specified as well. It doesn't make sense to me to have it default to THRESH with some arbitrary value, but perhaps there was some reason behind that. I don't know.
CSNOW is binary 0/1 and PrecipFlag is categorical. So a merge thresh >=1.25 would seem problematic. We did discuss that if no merge thresh were set for a variable, then the conv_thresh would be used to create the super merge object. I see various scenarios for merge_thresh:

user sets merge_flag to NONE for all variables - no merging of the super objects is performed
user sets merge_flag to THRESH for all variables - merging of super objects is performed using merge_thresh of each field (assumes user has selected a threshold for merge_thresh for each field)
user sets merge_flag for some of the variables, but not all - for variables where merge_flag was not set, set it to the same as the other variables? If using THRESH, set the merge_thresh = conv_thresh for that variable

davidalbo · 2023-05-11T16:13:40Z

@hertneky What you say for at least 2 and 3 is what I've been implementing. Right now I'm at the exact place where I'm trying to do the merging for option 2 or 3. I think I need to be reminded exactly what the merge algorithm should do, and how that compares to what happens for merging in the non-multivar mode situation. This is so I can use existing code as much as possible. I might need a tag up, but you can try a comment if you want.

hertneky · 2023-05-11T16:26:49Z

@davidalbo A tag-up may be needed. So the algorithm..

Uses the same multivar_logic as creating the super-object(s) to create super-merge-object(s)
Super-objects within super-merge-objects are merged (called cluster objects)
- For singular MODE, objects within merge-objects are merged (called cluster objects)
Objects not merged are simple objects
The remaining logic is the same for matching simple/cluster objects (if matching is requested)
The big difference is that with mvMODE, the merge_thresh of multiple variables is used to create the super-merge-object. If this doesn't answer your question and you need something more specific, we can tag-up.
I am also on the hook for creating some diagrams for documentation purposes. I will do that next.

davidalbo · 2023-05-11T16:41:00Z

@hertneky is it true that simple objects produced by mode have unique integer numbers associated with them? It must be true based on the output that gets produced. I need some way to tell if a super object is within a particular super merge object. The code as is right now represents super objects as binary yes/no on a grid, so there is no information about individual objects. If you agree that unique integers must be in there somewhere, I'll start backing up in the design and looking for those integers.

hertneky · 2023-05-11T21:18:02Z

Yes, to visualize this for example, one of the outputs in the nc file is fcst/obs_obj_id. These are IDs assigned to each simple object. If there are 4 objects, then the unique IDs are 1-4 (left image).
For merged objects, the output is called fcst/obs_cluster_id. Objects in a cluster due to merging will share the same ID, single objects not merged have a value of -1 (right image).
Colorbars at bottom show the values. In the right image, the large object was merged with the tiny object next to it an both objects share the ID of 1 and the bottom 2 objects were not merged and have a value of -1.

This output is from the individual MODE run for visibility. This should also be applied to the super objects.

In the object ASCII output, the IDS are also listed (to the right of the images). The 1st 4 lines correspond to the images of the FCST objects. The 1st column is the simple object IDs and the 2nd column is the clusters. Objects 1/2 were clustered (merged), objects 3/4 were not part of a cluster (denoted by CF000).

davidalbo · 2023-05-25T20:02:39Z

do_fcst_convolution();
do_fcst_thresholding();
do_fcst_filtering();
do_fcst_splitting();

davidalbo · 2023-05-30T16:59:58Z

Notes for discussion today may 30:

when merge_flag is not set, it defaults to THRESH with a value of >=1.25. This seems wrong and should be change for both multivar mode and single var mode to... 'no merging'. Correct?
Answer: Keep default as is. When it is multivariate mode, check that eaCh field has merge_flag explicitly set and error exit if not.

For the '2nd pass' (individual inputs masked to be inside super objects), which of these steps can be skipped?
do_fcst_convolution();
do_fcst_thresholding();
do_fcst_filtering();
do_fcst_splitting();
Solution: Only do the splitting. Needed to copy some data into place as is (within the engine code) to get it to work

Did I implement this, which is also part of the 2nd pass, and is it correct?:
Various scenarios for merge_thresh in multivar mode:

user sets merge_flag to NONE for all variables - no merging of the super objects is performed
user sets merge_flag to THRESH for all variables - merging of super objects is performed using merge_thresh of each field (assumes user has selected a threshold for merge_thresh for each field)
user sets merge_flag for some of the variables, but not all - for variables where merge_flag was not set, set it to the same as the other variables? If using THRESH, set the merge_thresh = conv_thresh for that variable

I'd like to talk through the expected behavior for the case where all the intensity flags are set FALSE. It would be a mode-like run for simple objects that have no intensities, they are just binary yes/no.

davidalbo · 2023-05-31T15:44:23Z

@hertneky what about merging when settings are all FALSE for intensities? It can be done and should give the same results as the 'not all FALSE' option. Does it make sense scientifically to merge?

hertneky · 2023-05-31T17:21:33Z

@davidalbo It still makes sense to merge objects, even if no object intensity info is requested. The merging should be the same as if the user requested intensities.

davidalbo · 2023-06-02T19:26:45Z

@JohnHalleyGotway @hertneky I'm modifying the unit tests. With intensity flags set TRUE, I'm getting output file names like the following, all in one output directory:

mode_Fcst_ALPHA(*,*)_Obs_ALPHA(*,*)_300000L_20120410_180000V_060000A_obj.txt

The unit test mode config file has:

fcst = {

   field = [

      {
         name  = "ALPHA";
         level = "(*,*)";
         merge_flag = NONE;
      },

Previously (the way the unit test is set up now) the output files went to numbered subdirectories with file names, for example, this is the same file with the old name:

00/mode_300000L_20120410_180000V_060000A_obj.txt

I'm wondering about (*,*) in file names.

davidalbo · 2023-06-02T19:32:05Z

@JohnHalleyGotway eventually I'm going to want two unit tests, one with [TRUE, TRUE, TRUE] one with [FALSE, FALSE, FALSE]. The expected output files need to be put somewhere for the unit test to be able to compare them to the test output. The output file names and organization has changed, so all of that needs updating. THis is the one piece that's not in the MET repo, is that correct?

davidalbo · 2023-06-05T16:13:53Z

@hertneky Could you have different multivar_logic for forecasts and for obs? Maybe a different number of input fields for obs than forecasts? What would that do to the intensity flags? If intensity flag 2 and 3 are TRUE, right now it is fcst field 2 is compared to obs field 2, fcst field 3 compared to obs field 3. Is it good enough for now to require equal number of fcst and obs inputs, and the same multivar_logic for fcst and obs? And the comparisons as indicated?

…nfig_constants.h. Make each error message consistently begin with 1 newline and end with 2. Use lookup_bool_array() defined in dictionary.cc and remove the redefinition in mode_conf_info.cc

…for internal consistency.

Co-authored-by: John Halley Gotway <johnhg@ucar.edu> Co-authored-by: Dave Albo <dave@seneca.rap.ucar.edu>

hertneky · 2023-07-05T19:25:50Z

Release Acceptance Testing Summary

Version: METplus-5.1.0-rc1
Date: July 5, 2023
Location: cheyenne
Status (PASS/FAIL): FAIL
Description: Ran blizzard test case. Two issues: 1) When setting MODE_MULTIVAR_INTENSITY = FALSE,TRUE,TRUE in the conf, the result is 'WARNING: multivar_frontend() -> empty multivar intensity array, setting to all FALSE' (in the log, I see... export METPLUS_MULTIVAR_INTENSITY="multivar_intensity = [FALSE, TRUE, TRUE];") and 2) Setting MODE_FCST/OBS_MULTIVAR_LEVEL = L0 does not seem to replace the NA in the filename or within the FCST_LEV column in the output (e.g. mode_Fcst_bliz_NA_Obs_bliz_NA_HRRR_vs_ANALYSIS_PrecipFlag_L0_210000L_20210201_210000V_000000A_obj.txt; I do not see this being exported in the log)
Test conducted in: /glade/work/hertneky/tmp/met_rc_test

georgemccabe · 2023-07-05T20:25:42Z

@hertneky, we discussed this on Slack but for traceability on this thread here is what we discovered:

The MET variable name is actually multivar_intensity_flag, not multivar_intensity as described in the MET documentation. PR Bugfix #2235 rename multivar_itensity to multivar_intensity_flag METplus#2236 will fix this bug.
It appears the rename of multivar_units to multivar_level was made after METplus 5.1.0-rc1 was created, so it did not properly set the variable. This can also be tested using the branch from the above PR.

hertneky · 2023-07-06T21:31:01Z

Release Acceptance Testing Summary

Version: METplus-5.1.0-rc1
Date: July 6, 2023
Location: cheyenne
Status (PASS/FAIL): PASS
Description: The issues mentioned in the previous testing summary have been fixed.

dwfncar added the type: task An actionable item of work label Mar 24, 2020

dwfncar self-assigned this Mar 24, 2020

dwfncar assigned dwfncar and lindsayrblank and unassigned dwfncar Mar 24, 2020

dwfncar mentioned this issue Mar 27, 2020

MODE-Multivar boolean logic from config file #1282

Closed

17 tasks

lindsayrblank added component: application code labels Mar 27, 2020

lindsayrblank added this to the MET 9.1 milestone Mar 27, 2020

lindsayrblank mentioned this issue Mar 27, 2020

Multivariate MODE Config File #1290

Closed

17 tasks

JohnHalleyGotway modified the milestones: MET 9.1, MET 10.0 Jun 8, 2020

JohnHalleyGotway assigned rgbullock Jun 12, 2020

JohnHalleyGotway added the alert: NEED CYCLE ASSIGNMENT Need to assign to a release development cycle label Sep 10, 2020

JohnHalleyGotway unassigned rgbullock Sep 10, 2020

JohnHalleyGotway assigned rgbullock and unassigned dwfncar Nov 5, 2020

JohnHalleyGotway removed the alert: NEED CYCLE ASSIGNMENT Need to assign to a release development cycle label Nov 5, 2020

JohnHalleyGotway changed the title ~~MODE-Multivar super objects~~ Enhance MODE-Multivar to run entirely in memory instead of spawning individual processes for each call to MODE. Dec 3, 2020

JohnHalleyGotway changed the title ~~Enhance MODE-Multivar to run entirely in memory instead of spawning individual processes for each call to MODE.~~ Enhance Multivariate MODE to run entirely in memory instead of spawning individual processes for each call to MODE. Dec 3, 2020

This was referenced Dec 3, 2020

Fix Multivariate MODE to log it's output to a single log file. #1593

Closed

Feature 1184 dryline #1594

Closed

davidalbo mentioned this issue Jun 2, 2023

Add support for new multivariate MODE settings dtcenter/METplus#2197

Closed

21 tasks

davidalbo linked a pull request Jun 12, 2023 that will close this issue

Feature #1283 mvmode mods #2569

Merged

15 tasks

JohnHalleyGotway mentioned this issue Jun 12, 2023

Enhancement: Enhance the MODE wrapper to handle updated multivar configuration options dtcenter/METplus#2210

Closed

21 tasks

JohnHalleyGotway added a commit that referenced this issue Jun 16, 2023

Per #1283, ci-skip-all, no real code changes. Just whitespace tweaks …

0e05d4f

…for internal consistency.

JohnHalleyGotway added a commit that referenced this issue Jun 16, 2023

Feature #1283 mvmode mods (#2569)

9f9b4cd

Co-authored-by: John Halley Gotway <johnhg@ucar.edu> Co-authored-by: Dave Albo <dave@seneca.rap.ucar.edu>

JohnHalleyGotway closed this as completed Jun 16, 2023

davidalbo mentioned this issue Jul 20, 2023

Multivariate MODE Improvements #1567

Closed

21 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enhance Multivariate MODE to generate object statistics for each input field requested by the user. #1283

Enhance Multivariate MODE to generate object statistics for each input field requested by the user. #1283

dwfncar commented Mar 24, 2020 •

edited by davidalbo

Loading

davidalbo commented May 3, 2023

davidalbo commented May 4, 2023

hertneky commented May 5, 2023

davidalbo commented May 5, 2023 •

edited

Loading

davidalbo commented May 5, 2023

hertneky commented May 8, 2023

davidalbo commented May 10, 2023

hertneky commented May 11, 2023

davidalbo commented May 11, 2023

hertneky commented May 11, 2023

davidalbo commented May 11, 2023

hertneky commented May 11, 2023

davidalbo commented May 25, 2023

davidalbo commented May 30, 2023 •

edited

Loading

davidalbo commented May 31, 2023

hertneky commented May 31, 2023

davidalbo commented Jun 2, 2023 •

edited

Loading

davidalbo commented Jun 2, 2023

davidalbo commented Jun 5, 2023

hertneky commented Jul 5, 2023 •

edited

Loading

georgemccabe commented Jul 5, 2023 •

edited

Loading

hertneky commented Jul 6, 2023

Enhance Multivariate MODE to generate object statistics for each input field requested by the user. #1283

Enhance Multivariate MODE to generate object statistics for each input field requested by the user. #1283

Comments

dwfncar commented Mar 24, 2020 • edited by davidalbo Loading

Describe the Enhancement

Time Estimate

Sub-Issues

Relevant Deadlines

Funding Source

Define the Metadata

Assignee

Labels

Projects and Milestone

Define Related Issue(s)

Enhancement Checklist

davidalbo commented May 3, 2023

davidalbo commented May 4, 2023

hertneky commented May 5, 2023

davidalbo commented May 5, 2023 • edited Loading

davidalbo commented May 5, 2023

hertneky commented May 8, 2023

davidalbo commented May 10, 2023

hertneky commented May 11, 2023

davidalbo commented May 11, 2023

hertneky commented May 11, 2023

davidalbo commented May 11, 2023

hertneky commented May 11, 2023

davidalbo commented May 25, 2023

davidalbo commented May 30, 2023 • edited Loading

davidalbo commented May 31, 2023

hertneky commented May 31, 2023

davidalbo commented Jun 2, 2023 • edited Loading

davidalbo commented Jun 2, 2023

davidalbo commented Jun 5, 2023

hertneky commented Jul 5, 2023 • edited Loading

Release Acceptance Testing Summary

georgemccabe commented Jul 5, 2023 • edited Loading

hertneky commented Jul 6, 2023

Release Acceptance Testing Summary

dwfncar commented Mar 24, 2020 •

edited by davidalbo

Loading

davidalbo commented May 5, 2023 •

edited

Loading

davidalbo commented May 30, 2023 •

edited

Loading

davidalbo commented Jun 2, 2023 •

edited

Loading

hertneky commented Jul 5, 2023 •

edited

Loading

georgemccabe commented Jul 5, 2023 •

edited

Loading