Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature 1905 ens_ctrl #1955

Merged
merged 22 commits into from
Nov 3, 2021
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
22 commits
Select commit Hold shift + click to select a range
3b0d6d2
Per #1905, no real change. Just spacing
JohnHalleyGotway Oct 8, 2021
144ac67
Per #1905, add the -ctrl command line option to ensemble_stat and upd…
JohnHalleyGotway Oct 8, 2021
b39d9de
Per #1905, update the users guide with the -ctrl command line option.
JohnHalleyGotway Oct 8, 2021
ae054b1
Not actually related to #1905, but clarifying the pcp_combine usage s…
JohnHalleyGotway Oct 11, 2021
f2339a3
Avoid inline stars in pcp_combine example that don't render well.
JohnHalleyGotway Oct 11, 2021
52935a9
Per #1905, tweak the docs.
JohnHalleyGotway Oct 11, 2021
e89f122
Per #1905, more tweaks.
JohnHalleyGotway Oct 11, 2021
aeae0f0
Merge branch 'develop' into feature_1905_ens_ctrl
JohnHalleyGotway Oct 20, 2021
8712a4c
Merge branch 'develop' into feature_1905_ens_ctrl
JohnHalleyGotway Oct 25, 2021
b41d73a
tMerge branch 'develop' into feature_1905_ens_ctrl
JohnHalleyGotway Oct 27, 2021
2a63c70
Per #1905, just aligning code.
JohnHalleyGotway Oct 27, 2021
519daad
Per #1905, add a call to ensemble_stat using the -ctrl option. I real…
JohnHalleyGotway Oct 27, 2021
1e25123
Per #1905, enable STDEV output in the NetCDF file since it's computat…
JohnHalleyGotway Oct 27, 2021
e92e04b
Per #1905, add NumArray::mean(), variance(), and stdev() with options…
JohnHalleyGotway Oct 29, 2021
a8e6ca8
Per #1905, keep track of which ensemble member is the control and exc…
JohnHalleyGotway Oct 29, 2021
fa5bed5
Per #1905, store the index of the ensemble control member.
JohnHalleyGotway Oct 29, 2021
2b7952a
Per #1905, store the ensemble control as the first member.
JohnHalleyGotway Oct 29, 2021
2b02ea6
Per #1905, clean up comments.
JohnHalleyGotway Oct 29, 2021
872c50f
Per #1905, define ctrl_index in ensemble_stat.h instead of ensemble_s…
JohnHalleyGotway Oct 29, 2021
07e7d51
Per #1905, correct ssvar logic for skipping the ensemble control member.
JohnHalleyGotway Oct 30, 2021
2992759
Per #1905, revert variable name usage to minimize diffs from develop …
JohnHalleyGotway Nov 1, 2021
a616008
Small spelling correction
j-opatz Nov 2, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
19 changes: 10 additions & 9 deletions met/docs/Users_Guide/ensemble-stat.rst
Original file line number Diff line number Diff line change
Expand Up @@ -69,6 +69,7 @@ The usage statement for the Ensemble Stat tool is shown below:
[-grid_obs file]
[-point_obs file]
[-ens_mean file]
[-ctrl file]
[-obs_valid_beg time]
[-obs_valid_end time]
[-outdir path]
Expand All @@ -92,23 +93,23 @@ Optional arguments for ensemble_stat

4. To produce ensemble statistics using gridded observations, use the **-grid_obs file** option to specify a gridded observation file. This option may be used multiple times if your observations are in several files.

5. To produce ensemble statistics using point observations, use the **-point_obs file** option to specify a NetCDF point observation file. This option may be used multiple times if your observations are in several files.

5. To produce ensemble statistics using point observations, use the **-point_obs file** to specify a NetCDF point observation file. This option may be used multiple times if your observations are in several files.
6. To override the simple ensemble mean value of the input ensemble members for the ECNT, SSVAR, and ORANK line types, the **-ens_mean file** option specifies an ensemble mean model data file. This option replaces the **-ssvar_mean file** option from earlier versions of MET.

7. The **-ctrl file** option specifies an ensemble control member data file. The control member is included in the computation of the ensemble mean but excluded from the spread.

6. To override the simple ensemble mean value of the input ensemble members for the ECNT, SSVAR, and ORANK line types, the **-ens_mean file** specifies an ensemble mean model data file. This option replaces the **-ssvar_mean file** from earlier versions of MET.
8. To filter point observations by time, use **-obs_valid_beg time** in YYYYMMDD[_HH[MMSS]] format to set the beginning of the matching observation time window.

7. To filter point observations by time, use **-obs_valid_beg time** in YYYYMMDD[_HH[MMSS]] format to set the beginning of the matching observation time window.
9. As above, use **-obs_valid_end time** in YYYYMMDD[_HH[MMSS]] format to set the end of the matching observation time window.

8. As above, use **-obs_valid_end time** in YYYYMMDD[_HH[MMSS]] format to set the end of the matching observation time window.
10. Specify the **-outdir path** option to override the default output directory (./).

9. Specify the **-outdir path** option to override the default output directory (./).
11. The **-log** file outputs log messages to the specified file.

10. The **-log** file outputs log messages to the specified file.
12. The **-v level** option indicates the desired level of verbosity. The value of "level" will override the default setting of 2. Setting the verbosity to 0 will make the tool run with no log messages, while increasing the verbosity will increase the amount of logging.

11. The **-v level** option indicates the desired level of verbosity. The value of "level" will override the default setting of 2. Setting the verbosity to 0 will make the tool run with no log messages, while increasing the verbosity will increase the amount of logging.

12. The **-compress level** option indicates the desired level of compression (deflate level) for NetCDF variables. The valid level is between 0 and 9. The value of "level" will override the default setting of 0 from the configuration file or the environment variable MET_NC_COMPRESS. Setting the compression level to 0 will make no compression for the NetCDF output. Lower number is for fast compression and higher number is for better compression.
13. The **-compress level** option indicates the desired level of compression (deflate level) for NetCDF variables. The valid level is between 0 and 9. The value of "level" will override the default setting of 0 from the configuration file or the environment variable MET_NC_COMPRESS. Setting the compression level to 0 will make no compression for the NetCDF output. Lower number is for fast compression and higher number is for better compression.

An example of the ensemble_stat calling sequence is shown below:

Expand Down
2 changes: 1 addition & 1 deletion met/docs/Users_Guide/reformat_grid.rst
Original file line number Diff line number Diff line change
Expand Up @@ -74,7 +74,7 @@ Required arguments for the pcp_combine
Optional arguments for pcp_combine
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

3. The **-field string** option defines the data to be extracted from the input files. Use this option when processing fields other than **APCP** or non-GRIB files. This option may be used multiple times and output will be created for each.
3. The **-field string** option defines the data to be extracted from the input files. Use this option when processing fields other than **APCP** or non-GRIB files. It can be used multiple times and output will be created for each. In general, the field string should include the **name** and **level** of the requested data and be enclosed in single quotes. It is processed as an inline configuration file and may also include data filtering, censoring, and conversion options. For example, use **-field ‘name=”ACPCP”; level=”A6”; convert(x)=x/25.4;’** to read 6-hourly accumulated convective precipitation from a GRIB file and convert from millimeters to inches.

4. The **-name list** option is a comma-separated list of output variable names which override the default choices. If specified, the number of names must match the number of variables to be written to the output file.

Expand Down
53 changes: 53 additions & 0 deletions met/src/basic/vx_util/num_array.cc
Original file line number Diff line number Diff line change
Expand Up @@ -1157,6 +1157,59 @@ double NumArray::wmean_fisher(const NumArray &wgt) const

}


////////////////////////////////////////////////////////////////////////


double NumArray::variance(int skip_index) const

{

if(n() == 0) return ( bad_data_double );

int j, count;
double s, s_sq, var;

s = s_sq = 0.0;
count = 0;

for(j=0; j<n(); j++) {
if(is_bad_data(e[j]) || j == skip_index) continue;
s += e[j];
s_sq += e[j]*e[j];
count++;
}

// Check for slightly negative precision error
if(count > 1) {
var = (s_sq - s*s/(double) count)/((double) (count - 1));
if(is_eq(var, 0.0)) var = 0.0;
}
else {
var = bad_data_double;
}

return(var);

}


////////////////////////////////////////////////////////////////////////


double NumArray::stdev(int skip_index) const

{

double v = variance(skip_index);

if ( !is_bad_data(v) ) v = sqrt(v);

return(v);

}


////////////////////////////////////////////////////////////////////////

//
Expand Down
4 changes: 4 additions & 0 deletions met/src/basic/vx_util/num_array.h
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,7 @@
#include <iostream>

#include "concat_string.h"
#include "is_bad_data.h"


////////////////////////////////////////////////////////////////////////
Expand Down Expand Up @@ -107,6 +108,9 @@ class NumArray {
double mean_sqrt() const;
double mean_fisher() const;

double variance(int skip_index = bad_data_int) const;
double stdev(int skip_index = bad_data_int) const;

double wmean(const NumArray &) const;
double wmean_sqrt(const NumArray &) const;
double wmean_fisher(const NumArray &) const;
Expand Down
60 changes: 45 additions & 15 deletions met/src/libcode/vx_statistics/pair_data_ensemble.cc
Original file line number Diff line number Diff line change
Expand Up @@ -108,6 +108,7 @@ void PairDataEnsemble::clear() {

n_ens = 0;
n_pair = 0;
ctrl_index = bad_data_int;
skip_const = false;
skip_ba.clear();

Expand All @@ -121,6 +122,7 @@ void PairDataEnsemble::clear() {

esum_na.clear();
esumsq_na.clear();
esumn_na.clear();

mn_na.clear();
mn_oerr_na.clear();
Expand Down Expand Up @@ -170,6 +172,7 @@ void PairDataEnsemble::extend(int n) {
var_plus_oerr_na.extend (n);
esum_na.extend (n);
esumsq_na.extend (n);
esumn_na.extend (n);
mn_na.extend (n);
mn_oerr_na.extend (n);

Expand Down Expand Up @@ -222,6 +225,7 @@ void PairDataEnsemble::assign(const PairDataEnsemble &pd) {
ign_na = pd.ign_na;
pit_na = pd.pit_na;
n_pair = pd.n_pair;
ctrl_index = pd.ctrl_index;
skip_const = pd.skip_const;
skip_ba = pd.skip_ba;

Expand All @@ -231,6 +235,7 @@ void PairDataEnsemble::assign(const PairDataEnsemble &pd) {

esum_na = pd.esum_na;
esumsq_na = pd.esumsq_na;
esumn_na = pd.esumn_na;

mn_na = pd.mn_na;
mn_oerr_na = pd.mn_oerr_na;
Expand Down Expand Up @@ -290,12 +295,14 @@ void PairDataEnsemble::add_ens_var_sums(int i_obs, double v) {
if(i_obs >= esum_na.n()) {
esum_na.add(0.0);
esumsq_na.add(0.0);
esumn_na.add(0.0);
}

// Track sums of the raw ensemble member values
if(!is_bad_data(v)) {
esum_na.inc(i_obs, v);
esumsq_na.inc(i_obs, v*v);
esumn_na.inc(i_obs, 1);
}

return;
Expand Down Expand Up @@ -416,17 +423,17 @@ void PairDataEnsemble::compute_pair_vals(const gsl_rng *rng_ptr) {
else {

// Compute the variance of the unperturbed ensemble members
var_unperturbed = compute_variance(esum_na[i], esumsq_na[i], v_na[i]);
var_unperturbed = compute_variance(esum_na[i], esumsq_na[i], esumn_na[i]);
var_na.add(var_unperturbed);

// Process the observation error information.
ObsErrorEntry * e = (has_obs_error() ? obs_error_entry[i] : 0);
if(e) {

// Compute perturbed ensemble mean and variance
cur_ens.compute_mean_variance(mean, var_perturbed);
mn_oerr_na.add(mean);
var_oerr_na.add(var_perturbed);
// Compute perturbed ensemble mean and variance
// Exclude the control member from the variance
mn_oerr_na.add(cur_ens.mean());
var_oerr_na.add(cur_ens.variance(ctrl_index));

// Compute the variance plus observation error variance.
var_plus_oerr_na.add(var_unperturbed +
Expand Down Expand Up @@ -467,8 +474,12 @@ void PairDataEnsemble::compute_pair_vals(const gsl_rng *rng_ptr) {
crps_emp_na.add(compute_crps_emp(o_na[i], cur_ens));
crpscl_emp_na.add(compute_crps_emp(o_na[i], cur_clm));

// Ensemble mean and standard deviation
// Exclude the control member from the standard deviation
mean = cur_ens.mean();
stdev = cur_ens.stdev(ctrl_index);

// Store Gaussian CRPS stats
cur_ens.compute_mean_stdev(mean, stdev);
crps_gaus_na.add(compute_crps_gaus(o_na[i], mean, stdev));
crpscl_gaus_na.add(compute_crps_gaus(o_na[i], cmn_na[i], csd_na[i]));
ign_na.add(compute_ens_ign(o_na[i], mean, stdev));
Expand Down Expand Up @@ -613,7 +624,7 @@ struct ssvar_bin_comp {

void PairDataEnsemble::compute_ssvar() {
int i, j;
double mn, var;
double var;
ssvar_bin_map bins;
NumArray cur;

Expand Down Expand Up @@ -644,21 +655,22 @@ void PairDataEnsemble::compute_ssvar() {
if(skip_ba[i]) continue;

// Store ensemble values for the current observation
for(j=0, cur.erase(); j<n_ens; j++) cur.add(e_na[j][i]);

// Compute standard deviation
cur.compute_mean_variance(mn, var);
// Exclude the control member from the variance
for(j=0, cur.erase(); j<n_ens; j++) {
if(j == ctrl_index) continue;
cur.add(e_na[j][i]);
}

// Build a variance point
ens_ssvar_pt pt;
pt.var = var;
pt.var = cur.variance();
pt.f = mn_na[i];
pt.o = o_na[i];
pt.w = wgt_na[i];

// Determine the bin for the current point and add it to the list
// Bins are defined starting at 0 and are left-closed, right-open
j = floor(var/ssvar_bin_size);
j = floor(pt.var/ssvar_bin_size);
string ssvar_min = str_format("%.5e", j*ssvar_bin_size).contents();
if( !bins.count(ssvar_min) ){
ssvar_pt_list pts;
Expand Down Expand Up @@ -801,7 +813,7 @@ PairDataEnsemble PairDataEnsemble::subset_pairs_obs_thresh(const SingleThresh &o
//
// Exclude from subset:
// sid_sa, lat_na, lon_na, x_na, y_na, vld_ta, lvl_ta, elv_ta,
// o_qc_sa, esum_na, esumsq_na
// o_qc_sa, esum_na, esumsq_na, esumn_na

pd.wgt_na.add(wgt_na[i]);
pd.o_na.add(o_na[i]);
Expand Down Expand Up @@ -1667,7 +1679,10 @@ void VxPairDataEnsemble::add_ens(int member, bool mn, Grid &gr) {
else {

// Track unperturbed ensemble variance sums
pd[i][j][k].add_ens_var_sums(l, fcst_v);
// Exclude the control member from the variance
if(member != pd[i][j][k].ctrl_index) {
pd[i][j][k].add_ens_var_sums(l, fcst_v);
}

// Apply observation error perturbation, if requested
if(obs_error_info->flag) {
Expand Down Expand Up @@ -1781,6 +1796,21 @@ void VxPairDataEnsemble::calc_obs_summary() {

////////////////////////////////////////////////////////////////////////

void VxPairDataEnsemble::set_ctrl_index(int index) {

for(int i=0; i < n_msg_typ; i++){
for(int j=0; j < n_mask; j++){
for(int k=0; k < n_interp; k++){
pd[i][j][k].ctrl_index = index;
}
}
}

return;
}

////////////////////////////////////////////////////////////////////////

void VxPairDataEnsemble::set_skip_const(bool tf) {

for(int i=0; i < n_msg_typ; i++){
Expand Down
4 changes: 4 additions & 0 deletions met/src/libcode/vx_statistics/pair_data_ensemble.h
Original file line number Diff line number Diff line change
Expand Up @@ -90,6 +90,7 @@ class PairDataEnsemble : public PairBase {

int n_ens; // Number of ensemble members
int n_pair; // Number of valid pairs, n_obs - sum(skip_ba)
int ctrl_index; // Index of the control member
bool skip_const; // Skip cases where the observation and
// all ensemble members are constant
BoolArray skip_ba; // Flag for each observation [n_obs]
Expand All @@ -106,6 +107,7 @@ class PairDataEnsemble : public PairBase {

NumArray esum_na; // Sum of unperturbed ensemble values [n_obs]
NumArray esumsq_na; // Sum of unperturbed ensemble squared values [n_obs]
NumArray esumn_na; // Count of ensemble values [n_obs]

NumArray mn_na; // Ensemble mean value [n_obs]
NumArray mn_oerr_na; // Mean of perturbed members [n_obs]
Expand Down Expand Up @@ -293,6 +295,8 @@ class VxPairDataEnsemble {

void calc_obs_summary();

void set_ctrl_index(int);

void set_skip_const(bool);
};

Expand Down
Loading