Update develop-ref after #1734 and #1738 (#1740)
* Start on write netcdf pickle alternative.

* Write dataplane array.

* Start on read of netcdf as pickle alternative.

* Create attribute variables.

* Use global attributes for met_info attrs.

* Add grid structure.

* Read metadata back into met_info.attrs.

* Convert grid.nx and grid.ny to int.

* Rename _name key to name.

* Removed pickle write.

* Fixed write_pickle_dataplane to work for both numpy and xarray.

* Use items() to iterate over key, value attrs.

* Write temporary text file.

* Renamed scripts.

* Changed script names in Makefile.am.

* Replaced pickle with tmp_nc.

* Fixed wrapper script names.

* Test for attrs in met_in.met_data.

* Initial version of read_tmp_point module.

* Added read_tmp_point.py to install list.

* Start on Python3_Script::read_tmp_point.

* Write MPR tmp ascii file.

* Renamed to read_tmp_ascii to use for point and MPR.

* Define Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::import_read_tmp_ascii_py.

* Append MET_BASE/wrappers to sys.path.

* Finished implementation of Python3_Script::import_read_tmp_ascii_py.

* Call Python3_Script::read_tmp_ascii in python_handler.

* Revised python3_script::read_tmp_ascii with call to run, PyRun_String.

* Return PyObject* from Python3_Script::run.

* Restored call to run_python_string for now.

* Per #1429, enhance error message from DataLine::get_item(). (#1682)

* Feature 1429 tc_log second try (#1686)

* Per #1429, enhance error message from DataLine::get_item().

* Per #1429, I realize that the line number actually is readily available in the DataLine class... so include it in the error message.

* Feature 1588 ps_log (#1687)

* Per #1588, updated pair_data_point.h/.cc to add detailed Debug(4) log messages, as specified in the GitHub issue. Do still need to test each of these cases to confirm that the log messages look good.

* Per #1588, switch very detailed interpolation details from debug level 4 to 5.

* Per #1588, remove the Debug(4) log message about duplicate obs since it's been moved up to a higher level.

* Per #1588, add/update detailed log messages when processing point observations for bad data, off the grid, bad topo, big topo diffs, bad fcst value, and duplicate obs.

* #1454 Disabled plot_data_plane_CESM_SSMI_microwave and plot_data_plane_CESM_sea_ice_nc because they are not evenly spaced

* #1454 Moved NC attribute name to nc_utils.h

* #1454 Corrected sanity checking for lat/lon projection based on the percentage of the delta instead of fixed tolerance

* #1454 Corrected data.delta_lon

* #1454 Change back to use diff instead of absolute value of diff

* 454 Deleted instead of commenting out

* Feature 1684 bss and 1685 single reference model (#1689)

* Per #1684, move an instance of the ClimoCDFInfo class into PairBase. Also define derive_climo_vals() and derive_climo_prob() utility functions.

* Add to VxPairDataPoint and VxPairDataEnsemble functions to set the ClimoCDFInfo class.

* Per #1684, update ensemble_stat and point_stat to set the ClimoCDFInfo object based on the contents of the config file.

* Per #1684, update the vx_statistics library and stat_analysis to make calls to the new derive_climo_vals() and derive_climo_prob() functions.

* Per #1684, since cdf_info is a member of PairBase class, need to handle it in the PairDataPoint and PairDataEnsemble assignment and subsetting logic.

* Per #1684, during development, I ran across and then updated this log message.

* Per #1684, working on log messages and figured that the regridding climo data should be moved from Debug(1) to at least Debug(2).

* Per #1684 and #1685, update the logic for the derive_climo_vals() utility function. If only a single climo bin is requested, just return the climo mean. Otherwise, sample the requested number of values.

* Per #1684, just fixing the format of this log message.

* Per #1684, add a STATLine::get_offset() member function.

* Per #1684, update parse_orank_line() logic. Rather than calling NumArray::clear() call NumArray::erase() to preserve allocated memory. Also, instead of parsing ensemble member values by column name, parse them by offset number.

* Per #1684, call EnsemblePairData::extend() when parsing ORANK data to allocate one block of memory instead of bunches of little ones.

* Per #1684 and #1685, add another call to Ensemble-Stat to test computing the CRPSCL_EMP from a single climo mean instead of using the full climo distribution.

* Per #1684 and #1685, update ensemble-stat docs about computing CRPSS_EMP relative to a single reference model.

* Per #1684, need to update Grid-Stat to store the climo cdf info in the PairDataPoint objects.

* Per #1684, remove debug print statements.

* Per #1684, need to set cdf_info when aggregating MPR lines in Stat-Analysis.

* Per #1684 and #1685, update PairDataEnsemble::compute_pair_vals() to print a log message indicating the climo data being used as reference:

For a climo distribution defined by mean and stdev:
DEBUG 3: Computing ensemble statistics relative to a 9-member climatological ensemble.

For a single deterministic reference:
DEBUG 3: Computing ensemble statistics relative to the climatological mean.

* Per #1691, add met-10.0.0-beta4 release notes. (#1692)

* Updated Python documentation

* Per #1694, add VarInfo::magic_str_attr() to construct a field summary string from the name_attr() and level_attr() functions.

* Per #1694, fixing 2 issues here. There was a bug in the computation of the max value. Had a less-than sign that should have been greater-than. Also, switch from tracking data by its magic_str() to simply using VAR_i and VAR_j strings. We *could* have just used the i, j integers directly, but constructing the ij joint histogram integer could have been tricky since we start numbering with 0 instead of 1. i=0, j=1 would result in 01 which is the same as integer of 1. If we do want to switch to integers, we just need to make them 1-based and add +1 all over the place.

* Per #1694, just switching to consistent variable name.

* Just consistent spacing.

* Added python3_script::import_read_tmp_ascii.

* Restored read_tmp_ascii call.

* Added lookup into ascii module.

* Adding files for ReadTheDocs

* Adding .yaml file for ReadTheDocs

* Updated path to requirements.txt file

* Updated path to conf.py file

* Removing ReadTheDocs files and working in separate branch

* Return PyObject* from read_tmp_ascii.

* Put point_data in global namespace.

* Remove temporary ascii file.

* Added tmp_ascii_path.

* Removed read_obs_from_pickle.

* Trying different options for formats (#1702)

* Per #1706, add bugfix to the develop branch. Also add a new job to unit_stat_analysis.xml to test out the aggregation of the ECNT line type. This will add new unit test output and cause the NB to fail. (#1708)

* Feature 1471 python_grid (#1704)

* Per #1471, defined a parse_grid_string() function in the vx_statistics library and then updated vx_data2d_python to call that function. However, this creates a circular dependency because vx_data2d_python now depends on vx_statistics.

* Per #1471, because of the change in dependencies, I had to modify many, many Makefile.am files to link to the -lvx_statistics after -lvx_data2d_python. This is not great, but I didn't find a better solution.

* Per #1471, add a sanity check to make sure the grid and data dimensions actually match.

* Per #1471, add 3 new unit tests to demonstrate setting the python grid as a named grid, grid specification string, or a gridded data file.

* Per #1471, document python grid changes in appendix F.

* Per #1471, just spacing.

* Per #1471, lots of Makefile.am changes to get this code to compile on kiowa. Worryingly, it compiled and linked fine on my Mac laptop but not on kiowa. Must be some large differences in the linker logic.

Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>

* Committing a fix for unit_python.xml directly to the develop branch. We referenced  in a place where it's not defined.

* Add *.dSYM to the .gitignore files in the src and internal_tests directories.

* Replaced tmp netcdf _name attribute with name_str.

* Append user script path to system path.

* Revert "Feature 1319 no pickle" (#1717)

* Fixed typos, added content, and modified release date format

* #1715 Initial release

* #1715 Do not combine if there is no overlap between TQZ and UV records

* #1715 Added pb2nc_compute_pbl_cape

* #1715 Reduced obs_bufr_var. Removed pb_report_type

* #1715 Added a blank line for Error/Warning

* Per #1725, return good status from TrackInfoArray::add() when using an ATCF line to create a new track. (#1726)

* Per #1705, update the threshold node hierarchy by adding a climo_prob() function to determine the climatological probability of a CDP-type threshold. Also update derive_climo_prob() in pair_base.cc to call the new climo_prob() function. (#1724)

* Bugfix 1716 develop perc_thresh (#1722)

* Per #1716, committing changes from Randy Bullock to support floating point percentile thresholds.

* Per #1716, no code changes, just consistent formatting.

* Per #1716, change SFP50 example to SFP33.3 to show an example of using floating point percentile values.

* Update pull_request_template.md

* Feature 1733 exc (#1734)

* Per #1733, add column_exc_name, column_exc_val, init_exc_name, and init_exc_val options to the TCStat config files.

* Per #1733, enhance tc_stat to support the column_exc and init_exc config file and job command filtering options.

* Per #1733, update stat_analysis to support the -column_exc job filtering option. Still need to update documentation and add unit tests.

* Per #1773, update the user's guide with the new config and job command options.

* Per #1733, add call to stat_analysis to exercise -column_str and -column_exc options.

* Per #1733, I ran into a namespace conflict in tc_stat where -init_exc was used to filter by time AND by string value. So I switched to using -init_str_exc instead. And made the corresponding change to -column_str_exc in stat_analysis and tc_stat. Also changed internal variable names to use IncMap and ExcMap to keep the logic clear.

* Per #1733, tc_stat config file updates to switch from column_exc and init_exc to column_str_exc and init_str_exc.

* Per #1733, add tc_stat and stat_analysis jobs to exercise the string filtering options.

* Bugfix 1737 develop little_r (#1739)

* Per #1737, migrate the same fix from main_v9.1 over to the develop branch.

* Per #1737, add another unit test for running ascii2nc with corrupt little_r records.

Co-authored-by: David Fillmore <fillmore.winslow.david@gmail.com>
Co-authored-by: Howard Soh <hsoh@kiowa.rap.ucar.edu>
Co-authored-by: hsoh-u <hsoh@ucar.edu>
Co-authored-by: Julie.Prestopnik <jpresto@ucar.edu>
Co-authored-by: David Fillmore <davidfillmore@users.noreply.github.com>
Co-authored-by: John Halley Gotway <johnhg@kiowa.rap.ucar.edu>
Co-authored-by: MET Tools Test Account <met_test@kiowa.rap.ucar.edu>
8 people authored Mar 31, 2021
1 parent 11cb05d commit 7ad8e22
Showing 19 changed files with 625 additions and 329 deletions.
4 changes: 3 additions & 1 deletion .github/pull_request_template.md
@@ -4,7 +4,9 @@

- [ ] Recommend testing for the reviewer(s) to perform, including the location of input datasets, and any additional instructions:</br>

- [ ] Do these changes include sufficient documentation and testing updates? **[Yes or No]**
- [ ] Do these changes include sufficient documentation updates, ensuring that no errors or warnings exist in the build of the documentation? **[Yes or No]**

- [ ] Do these changes include sufficient testing updates? **[Yes or No]**

- [ ] Will this PR result in changes to the test suite? **[Yes or No]**</br>
If **yes**, describe the new output and/or changes to the existing output:</br>
12 changes: 12 additions & 0 deletions met/data/config/TCStatConfig_default
@@ -111,6 +111,12 @@ column_thresh_val = [];
column_str_name = [];
column_str_val = [];

//
// Stratify by excluding strings in non-numeric data columns.
//
column_str_exc_name = [];
column_str_exc_val = [];

//
// Similar to the column_thresh options above
//
@@ -123,6 +129,12 @@ init_thresh_val = [];
init_str_name = [];
init_str_val = [];

//
// Similar to the column_str_exc options above
//
init_str_exc_name = [];
init_str_exc_val = [];

//
// Stratify by the ADECK and BDECK distances to land.
//
16 changes: 9 additions & 7 deletions met/docs/Users_Guide/config_options.rst
@@ -3748,17 +3748,19 @@ Where "job_name" is set to one of the following:
Job command FILTERING options that may be used only when -line_type
has been listed once. These options take two arguments: the name of the
data column to be used and the min, max, or exact value for that column.
If multiple column eq/min/max/str options are listed, the job will be
If multiple column eq/min/max/str/exc options are listed, the job will be
performed on their intersection:

.. code-block:: none
"-column_min col_name value" e.g. -column_min BASER 0.02
"-column_max col_name value"
"-column_eq col_name value"
"-column_thresh col_name threshold" e.g. -column_thresh FCST '>273'
"-column_str col_name string" separate multiple filtering strings
with commas
"-column_min col_name value" e.g. -column_min BASER 0.02
"-column_max col_name value"
"-column_eq col_name value"
"-column_thresh col_name threshold" e.g. -column_thresh FCST '>273'
"-column_str col_name string" separate multiple filtering strings
with commas
"-column_str_exc col_name string" separate multiple filtering strings
with commas
Job command options to DEFINE the analysis job. Unless otherwise noted,
52 changes: 45 additions & 7 deletions met/docs/Users_Guide/config_options_tc.rst
@@ -517,8 +517,8 @@ For example:

Stratify by performing string matching on non-numeric data columns.
Specify a comma-separated list of columns names and values
to be checked. May add using the "-column_str name string" job command
options.
to be included in the analysis.
May add using the "-column_str name string" job command options.

For example:

@@ -531,6 +531,23 @@ For example:
column_str_name = [];
column_str_val = [];
**column_str_exc_name, column_str_exc_val**

Stratify by performing string matching on non-numeric data columns.
Specify a comma-separated list of columns names and values
to be excluded from the analysis.
May add using the "-column_str_exc name string" job command options.

For example:

| column_str_exc_name = [ "LEVEL" ];
| column_str_exc_val = [ "TD" ];
|
.. code-block:: none
column_str_exc_name = [];
column_str_exc_val = [];
**init_thresh_name, init_thresh_val**

@@ -567,6 +584,23 @@ For example:
init_str_name = [];
init_str_val = [];
**init_str_exc_name, init_str_exc_val**

Just like the column_str_exc options above, but apply the string matching only
when lead = 0. If the lead = 0 string does match, discard the entire track.
May add using the "-init_str_exc name string" job command options.

For example:

| init_str_exc_name = [ "LEVEL" ];
| init_str_exc_val = [ "HU" ];
|
.. code-block:: none
init_str_exc_name = [];
init_str_exc_val = [];
**water_only**

Stratify by the ADECK and BDECK distances to land. Once either the ADECK or
@@ -747,8 +781,10 @@ Where "job_name" is set to one of the following:
"-track_watch_warn name"
"-column_thresh name thresh"
"-column_str name string"
"-column_str_exc name string"
"-init_thresh name thresh"
"-init_str name string"
"-init_str_exc name string"
Additional filtering options that may be used only when -line_type
has been listed only once. These options take two arguments: the name
@@ -758,11 +794,13 @@ Where "job_name" is set to one of the following:

.. code-block:: none
"-column_min col_name value" For example: -column_min TK_ERR 100.00
"-column_max col_name value"
"-column_eq col_name value"
"-column_str col_name string" separate multiple filtering strings
with commas
"-column_min col_name value" For example: -column_min TK_ERR 100.00
"-column_max col_name value"
"-column_eq col_name value"
"-column_str col_name string" separate multiple filtering strings
with commas
"-column_str_exc col_name string" separate multiple filtering strings
with commas
Required Args: -dump_row

4 changes: 2 additions & 2 deletions met/docs/Users_Guide/gsi-tools.rst
@@ -230,7 +230,7 @@ The GSID2MPR tool writes the same set of MPR output columns for the conventional
- PRS_MAX_WGT
- Pressure of the maximum weighting function

The gsid2mpr output may be passed to the Stat-Analysis tool to derive additional statistics. In particular, users should consider running the **aggregate_stat** job type to read MPR lines and compute partial sums (SL1L2), continuous statistics (CNT), contingency table counts (CTC), or contingency table statistics (CTS). Stat-Analysis has been enhanced to parse any extra columns found at the end of the input lines. Users can filter the values in those extra columns using the **-column_thresh** and **-column_str** job command options.
The gsid2mpr output may be passed to the Stat-Analysis tool to derive additional statistics. In particular, users should consider running the **aggregate_stat** job type to read MPR lines and compute partial sums (SL1L2), continuous statistics (CNT), contingency table counts (CTC), or contingency table statistics (CTS). Stat-Analysis has been enhanced to parse any extra columns found at the end of the input lines. Users can filter the values in those extra columns using the **-column_thresh**, **-column_str**, and **-column_str_exc** job command options.

An example of the Stat-Analysis calling sequence is shown below:

@@ -425,7 +425,7 @@ The GSID2MPR tool writes the same set of ORANK output columns for the convention
- TZFND
- d(Tz)/d(Tr)

The gsidens2orank output may be passed to the Stat-Analysis tool to derive additional statistics. In particular, users should consider running the **aggregate_stat** job type to read ORANK lines and ranked histograms (RHIST), probability integral transform histograms (PHIST), and spread-skill variance output (SSVAR). Stat-Analysis has been enhanced to parse any extra columns found at the end of the input lines. Users can filter the values in those extra columns using the **-column_thresh** and **-column_str** job command options.
The gsidens2orank output may be passed to the Stat-Analysis tool to derive additional statistics. In particular, users should consider running the **aggregate_stat** job type to read ORANK lines and ranked histograms (RHIST), probability integral transform histograms (PHIST), and spread-skill variance output (SSVAR). Stat-Analysis has been enhanced to parse any extra columns found at the end of the input lines. Users can filter the values in those extra columns using the **-column_thresh**, **-column_str**, and **-column_str_exc** job command options.

An example of the Stat-Analysis calling sequence is shown below:

15 changes: 8 additions & 7 deletions met/docs/Users_Guide/stat-analysis.rst
@@ -522,13 +522,14 @@ This job command option is extremely useful. It can be used multiple times to sp

.. code-block:: none
-column_min col_name value
-column_max col_name value
-column_eq col_name value
-column_thresh col_name thresh
-column_str col_name string
The column filtering options may be used when the **-line_type** has been set to a single value. These options take two arguments, the name of the data column to be used followed by a value, string, or threshold to be applied. If multiple column_min/max/eq/thresh/str options are listed, the job will be performed on their intersection. Each input line is only retained if its value meets the numeric filtering criteria defined or matches one of the strings defined by the **-column_str** option. Multiple filtering strings may be listed using commas. Defining thresholds in MET is described in :numref:`config_options`.
-column_min col_name value
-column_max col_name value
-column_eq col_name value
-column_thresh col_name thresh
-column_str col_name string
-column_str_exc col_name string
The column filtering options may be used when the **-line_type** has been set to a single value. These options take two arguments, the name of the data column to be used followed by a value, string, or threshold to be applied. If multiple column_min/max/eq/thresh/str/str_exc options are listed, the job will be performed on their intersection. Each input line is only retained if its value meets the numeric filtering criteria defined, matches one of the strings defined by the **-column_str** option, and does not match any of the strings defined by the **-column_str_exc** option. Multiple filtering strings may be listed using commas. Defining thresholds in MET is described in :numref:`config_options`.
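
The retention rule in the paragraph above can be sketched in a few lines of standalone C++. This is an illustration only, not MET's implementation: `Row`, `FilterMap`, and `keep_row` are hypothetical names, and `std::map` and `std::set` stand in for MET's own container classes. The sketch assumes that a row lacking a filtered column fails an include filter and passes an exclude filter, and it matches strings case-sensitively for brevity.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-ins for a parsed STAT line and a table of string filters.
using Row       = std::map<std::string, std::string>;            // column -> value
using FilterMap = std::map<std::string, std::set<std::string>>;  // column -> strings

// Return true when the row passes every include and every exclude filter.
bool keep_row(const Row &row, const FilterMap &inc, const FilterMap &exc) {
   // Include filters (-column_str): the value must be one of the listed strings.
   for(const auto &f : inc) {
      auto it = row.find(f.first);
      if(it == row.end() || f.second.count(it->second) == 0) return false;
   }
   // Exclude filters (-column_str_exc): the value must not be a listed string.
   for(const auto &f : exc) {
      auto it = row.find(f.first);
      if(it != row.end() && f.second.count(it->second) > 0) return false;
   }
   return true;
}
```

Because each filter can only reject a row, listing several options narrows the result to their intersection, which is the behavior the paragraph describes.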

.. code-block:: none
24 changes: 21 additions & 3 deletions met/docs/Users_Guide/tc-stat.rst
@@ -251,7 +251,16 @@ _________________________
column_str_name = [];
column_str_val = [];
The **column_str_name** and **column_str_val** fields stratify by performing string matching on non-numeric data columns. Specify a comma-separated list of columns names and values to be checked. The length of the **column_str_val** should match that of the **column_str_name**. Using the **-column_str name val** option within the job command lines may further refine these selections.
The **column_str_name** and **column_str_val** fields stratify by performing string matching on non-numeric data columns. Specify a comma-separated list of columns names and values to be **included** in the analysis. The length of the **column_str_val** should match that of the **column_str_name**. Using the **-column_str name val** option within the job command lines may further refine these selections.

_________________________

.. code-block:: none
column_str_exc_name = [];
column_str_exc_val = [];
The **column_str_exc_name** and **column_str_exc_val** fields stratify by performing string matching on non-numeric data columns. Specify a comma-separated list of columns names and values to be **excluded** from the analysis. The length of the **column_str_exc_val** should match that of the **column_str_exc_name**. Using the **-column_str_exc name val** option within the job command lines may further refine these selections.

_________________________

@@ -260,7 +269,7 @@ _________________________
init_thresh_name = [];
init_thresh_val = [];
The **init_thresh_name** and **init_thresh_val** fields stratify by applying thresholds to numeric data columns only when lead = 0. If lead =0, but the value does not meet the threshold, discard the entire track. The length of the **init_thresh_val** should match that of the **init_thresh_name**. Using the **-init_thresh name val** option within the job command lines may further refine these selections.
The **init_thresh_name** and **init_thresh_val** fields stratify by applying thresholds to numeric data columns only when lead = 0. If lead = 0, but the value does not meet the threshold, discard the entire track. The length of the **init_thresh_val** should match that of the **init_thresh_name**. Using the **-init_thresh name val** option within the job command lines may further refine these selections.

_________________________

@@ -269,7 +278,16 @@ _________________________
init_str_name = [];
init_str_val = [];
The **init_str_name** and **init_str_val** fields stratify by performing string matching on non-numeric data columns only when lead = 0. If lead =0, but the string does not match, discard the entire track. The length of the **init_str_val** should match that of the **init_str_name**. Using the **-init_str name val** option within the job command lines may further refine these selections.
The **init_str_name** and **init_str_val** fields stratify by performing string matching on non-numeric data columns only when lead = 0. If lead = 0, but the string **does not** match, discard the entire track. The length of the **init_str_val** should match that of the **init_str_name**. Using the **-init_str name val** option within the job command lines may further refine these selections.

_________________________

.. code-block:: none
init_str_exc_name = [];
init_str_exc_val = [];
The **init_str_exc_name** and **init_str_exc_val** fields stratify by performing string matching on non-numeric data columns only when lead = 0. If lead = 0, and the string **does** match, discard the entire track. The length of the **init_str_exc_val** should match that of the **init_str_exc_name**. Using the **-init_str_exc name val** option within the job command lines may further refine these selections.
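
The track-level effect of the init_str_exc options can be sketched as follows. This is a simplified illustration under stated assumptions, not tc_stat's code: `TrackPoint`, `Track`, and `keep_track` are hypothetical names, and a track is reduced to a lead time plus a single string column.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified track point: a lead time plus one string column.
struct TrackPoint {
   int         lead_hours;  // 0 marks the initialization time
   std::string level;       // for example "TD", "TS", or "HU"
};

using Track = std::vector<TrackPoint>;

// Return true when the track survives the exclusion filter, that is, when
// no lead = 0 point carries one of the excluded strings.
bool keep_track(const Track &track, const std::set<std::string> &excluded) {
   for(const auto &p : track) {
      if(p.lead_hours == 0 && excluded.count(p.level) > 0) return false;
   }
   return true;
}
```

In this sketch, a track whose lead = 0 point is "HU" is dropped in full when "HU" is excluded, while a track that only reaches "HU" at a later lead time is kept, because only the lead = 0 value is examined.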

_________________________

4 changes: 4 additions & 0 deletions met/src/basic/vx_config/config_constants.h
@@ -1037,10 +1037,14 @@ static const char conf_key_column_thresh_name[] = "column_thresh_name";
static const char conf_key_column_thresh_val[] = "column_thresh_val";
static const char conf_key_column_str_name[] = "column_str_name";
static const char conf_key_column_str_val[] = "column_str_val";
static const char conf_key_column_str_exc_name[] = "column_str_exc_name";
static const char conf_key_column_str_exc_val[] = "column_str_exc_val";
static const char conf_key_init_thresh_name[] = "init_thresh_name";
static const char conf_key_init_thresh_val[] = "init_thresh_val";
static const char conf_key_init_str_name[] = "init_str_name";
static const char conf_key_init_str_val[] = "init_str_val";
static const char conf_key_init_str_exc_name[] = "init_str_exc_name";
static const char conf_key_init_str_exc_val[] = "init_str_exc_val";
static const char conf_key_water_only[] = "water_only";
static const char conf_key_rirw_track[] = "rirw.track";
static const char conf_key_rirw_time_adeck[] = "rirw.adeck.time";
77 changes: 64 additions & 13 deletions met/src/libcode/vx_analysis_util/stat_job.cc
@@ -172,7 +172,8 @@ void STATAnalysisJob::clear() {
wmo_fisher_stats.clear();

column_thresh_map.clear();
column_str_map.clear();
column_str_inc_map.clear();
column_str_exc_map.clear();

by_column.clear();

@@ -306,7 +307,8 @@ void STATAnalysisJob::assign(const STATAnalysisJob & aj) {
wmo_fisher_stats = aj.wmo_fisher_stats;

column_thresh_map = aj.column_thresh_map;
column_str_map = aj.column_str_map;
column_str_inc_map = aj.column_str_inc_map;
column_str_exc_map = aj.column_str_exc_map;

by_column = aj.by_column;

@@ -507,9 +509,16 @@ void STATAnalysisJob::dump(ostream & out, int depth) const {
thr_it->second.dump(out, depth + 1);
}

out << prefix << "column_str_map ...\n";
for(map<ConcatString,StringArray>::const_iterator str_it = column_str_map.begin();
str_it != column_str_map.end(); str_it++) {
out << prefix << "column_str_inc_map ...\n";
for(map<ConcatString,StringArray>::const_iterator str_it = column_str_inc_map.begin();
str_it != column_str_inc_map.end(); str_it++) {
out << prefix << str_it->first << ": \n";
str_it->second.dump(out, depth + 1);
}

out << prefix << "column_str_exc_map ...\n";
for(map<ConcatString,StringArray>::const_iterator str_it = column_str_exc_map.begin();
str_it != column_str_exc_map.end(); str_it++) {
out << prefix << str_it->first << ": \n";
str_it->second.dump(out, depth + 1);
}
@@ -948,15 +957,27 @@ int STATAnalysisJob::is_keeper(const STATLine & L) const {
//
// column_str
//
for(map<ConcatString,StringArray>::const_iterator str_it = column_str_map.begin();
str_it != column_str_map.end(); str_it++) {
for(map<ConcatString,StringArray>::const_iterator str_it = column_str_inc_map.begin();
str_it != column_str_inc_map.end(); str_it++) {

//
// Check if the current value is in the list for the column
//
if(!str_it->second.has(L.get_item(str_it->first.c_str(), false))) return(0);
}

//
// column_str_exc
//
for(map<ConcatString,StringArray>::const_iterator str_it = column_str_exc_map.begin();
str_it != column_str_exc_map.end(); str_it++) {

//
// Check if the current value is not in the list for the column
//
if(str_it->second.has(L.get_item(str_it->first.c_str(), false))) return(0);
}

//
// For MPR lines, check mask_grid, mask_poly, and mask_sid
//
@@ -1125,7 +1146,10 @@ void STATAnalysisJob::parse_job_command(const char *jobstring) {
column_thresh_map.clear();
}
else if(jc_array[i] == "-column_str" ) {
column_str_map.clear();
column_str_inc_map.clear();
}
else if(jc_array[i] == "-column_str_exc" ) {
column_str_exc_map.clear();
}
else if(jc_array[i] == "-set_hdr" ) {
hdr_name.clear();
@@ -1376,12 +1400,30 @@ void STATAnalysisJob::parse_job_command(const char *jobstring) {
col_value.add_css(jc_array[i+2]);

// If the column name is already present in the map, add to it
if(column_str_map.count(col_name) > 0) {
column_str_map[col_name].add(col_value);
if(column_str_inc_map.count(col_name) > 0) {
column_str_inc_map[col_name].add(col_value);
}
// Otherwise, add a new map entry
else {
column_str_map.insert(pair<ConcatString, StringArray>(col_name, col_value));
column_str_inc_map.insert(pair<ConcatString, StringArray>(col_name, col_value));
}
i+=2;
}
else if(jc_array[i] == "-column_str_exc") {

// Parse the column name and value
col_name = to_upper((string)jc_array[i+1]);
col_value.clear();
col_value.set_ignore_case(1);
col_value.add_css(jc_array[i+2]);

// If the column name is already present in the map, add to it
if(column_str_exc_map.count(col_name) > 0) {
column_str_exc_map[col_name].add(col_value);
}
// Otherwise, add a new map entry
else {
column_str_exc_map.insert(pair<ConcatString, StringArray>(col_name, col_value));
}
i+=2;
}
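
The option-parsing pattern in the hunk above (upper-case the column name, split the comma-separated values, then append to any existing map entry) can be sketched with the standard library alone. The names `StrMap`, `upper`, `split_css`, and `add_filter` are hypothetical and only approximate what `to_upper()` and `StringArray::add_css()` do in the diff; the case-insensitive matching set up by `set_ignore_case(1)` is not modeled.

```cpp
#include <cassert>
#include <cctype>
#include <map>
#include <sstream>
#include <string>
#include <vector>

using StrMap = std::map<std::string, std::vector<std::string>>;

// Upper-case a column name so that "obtype" and "OBTYPE" share one key.
std::string upper(std::string s) {
   for(char &c : s) c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
   return s;
}

// Split a comma-separated string into its non-empty parts.
std::vector<std::string> split_css(const std::string &css) {
   std::vector<std::string> parts;
   std::stringstream ss(css);
   std::string item;
   while(std::getline(ss, item, ',')) {
      if(!item.empty()) parts.push_back(item);
   }
   return parts;
}

// Record one "-column_str_exc name values" option, appending to any
// values already stored for that column.
void add_filter(StrMap &m, const std::string &name, const std::string &css) {
   std::vector<std::string> vals = split_css(css);
   std::vector<std::string> &slot = m[upper(name)];
   slot.insert(slot.end(), vals.begin(), vals.end());
}
```

In this sketch `operator[]` creates an empty entry on first use, so the explicit `count()` check followed by `insert()` seen in the diff collapses into a single append. Whether that simplification would suit MET's own classes is not something the diff shows.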
@@ -2461,14 +2503,23 @@ ConcatString STATAnalysisJob::get_jobstring() const {
}

// column_str
for(map<ConcatString,StringArray>::const_iterator str_it = column_str_map.begin();
str_it != column_str_map.end(); str_it++) {
for(map<ConcatString,StringArray>::const_iterator str_it = column_str_inc_map.begin();
str_it != column_str_inc_map.end(); str_it++) {

for(i=0; i<str_it->second.n(); i++) {
js << "-column_str " << str_it->first << " " << str_it->second[i] << " ";
}
}

// column_str_exc
for(map<ConcatString,StringArray>::const_iterator str_it = column_str_exc_map.begin();
str_it != column_str_exc_map.end(); str_it++) {

for(i=0; i<str_it->second.n(); i++) {
js << "-column_str_exc " << str_it->first << " " << str_it->second[i] << " ";
}
}

// by_column
if(by_column.n() > 0) {
for(i=0; i<by_column.n(); i++)
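
The serialization loops in this hunk write one "flag column value" triple per stored string, so a job string rebuilt from the maps can be parsed again into the same filters. A minimal sketch of that idea follows, using the hypothetical names `StrMap` and `filters_to_jobstring` and standard containers rather than MET's classes.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

using StrMap = std::map<std::string, std::vector<std::string>>;

// Emit one "<flag> <column> <value> " triple per stored filter string.
std::string filters_to_jobstring(const std::string &flag, const StrMap &m) {
   std::ostringstream js;
   for(const auto &entry : m) {
      for(const auto &val : entry.second) {
         js << flag << " " << entry.first << " " << val << " ";
      }
   }
   return js.str();
}
```

Writing each value as its own option, instead of rejoining the values with commas, keeps the output loop simple and relies on the parser appending repeated options for the same column.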