-
Notifications
You must be signed in to change notification settings - Fork 10
Spatial consistency test resistant
The sct_resistant
function offers a more outlier-resistant spatial consistency test (SCT) compared to the standard sct
, though it typically operates at a slower pace.
In this method, the p observed values are represented by a p-vector, values
, and their geographical positions by points (latitude, longitude, and elevation). Users can select which observations to analyze using the obs_to_check
vector, where a '1' indicates checking the corresponding observation and a '0' signifies using it without verification.
Like sct
, sct_resistant
employs a moving window approach across the area of interest. It identifies several 'centroid' observations, around which inner and outer circles are established. This approach allows for localized customization of the SCT, with many parameters being dependent on location.
An important aspect of SCT is determining a background or a preliminary guess for each observation. This is achieved by considering all observations within an outer circle, with several options available:
- VerticalProfile. Estimates the background based on observation elevations, optimized for temperature profiles in complex terrains, as detailed in Frei (2014).
- VerticalProfileTheilSen. Uses Theil-Sen linear regression (as per Wilks (2019)) to estimate the background, which is more resistant to data outliers than the VerticalProfile method.
- MeanOuterCircle. Sets a constant background based on the mean value of observations within the outer circle.
- MedianOuterCircle. Uses the median value of observations within the outer circle as a constant background.
- External. Allows users to provide their own background values.
The output includes a quality flag p-vector for each observation and diagnostics for any observations flagged as unreliable, such as the z-score, in line with the core SCT algorithm described subsequently.
flag | description |
---|---|
-999 | missing flag (observation not checked) |
0 | good observation |
1 | bad observation |
11 | isolated observation, it is the only observation inside the inner circle |
12 | isolated observation, less than num_min_outer observations inside outer circle |
Defining the Range of Valid and Admissible Values
Thesct_resistant
function necessitates defining two distinct ranges around each observed value: the valid value range (values_minv
, values_maxv
) and the admissible value range (values_mina
, values_maxa
). Users must determine the minimum (valid range) and maximum (admissible range) levels of uncertainty for an independent estimate of a given observation. Consequently, if the discrepancies between observations and independent estimates fall within the valid value range in a predefined area, the observations are deemed reliable. Conversely, significant deviations beyond the admissible value range from the independent estimate cast doubt on the observed value. This concept is depicted in the figures Figure for the cross-section and Figure for the scheme.
To illustrate, consider setting ranges for temperature. A valid value range might be ±1°C around observed values, indicating that an independent estimate within this margin is acceptable. In contrast, an admissible value range of ±20°C implies that any independent estimate beyond this threshold potentially signifies an unreliable observation.
NOTE: Users must individually set the ranges for each observation using the "observations" vector. in the context of the example above, for the admissible values range, they should define values_mina
as "observations - 20" and values_maxa
as "observations + 20". A similar method should be used for determining the range of valid values.
This algorithm represents an iterative approach to assess the quality of observations, adjusting for potential outliers and varying conditions across different locations. It emphasizes the importance of local context in evaluating observation quality, combining statistical analysis with spatial considerations.
The primary algorithm for sct_resistant, along with the core SCT algorithm, is outlined as follows (see the Figures as references):
Main Algorithm
- Initialization:
- Define
num_iterations
for the number of sweeps through all observations. - In the first iteration, only flag bad observations. If none are found, exit the loop.
- Define
- Loop Over Observations:
- For each observation:
- Determine if it's a suitable centroid for the concentric circles based on its status (neither previously flagged as good nor bad).
- Select observations within the outer circle (p_out, with p_out greater or equal to
num_min_outer
and lesser or equal tonum_max
) and inner circle (p_in). - Choose observations within the inner circle not yet tested (p_test).
- Check for isolated locations, which cannot be reliably tested and are thus flagged with specific codes. Conditions are 0 observations inside the inner circle OR p_out <
num_min
and the codes returned are 11 and 12. - Calculate or assign background values for p_out observations.
- If observations and backgrounds for all p_test are close enough, flag them as good. The condition is that the background at each location to test is within the corresponding range of valid values.
- Apply the SCT-core: flag the worst observation as bad or all as good based on statistics from p_in.
- For each observation:
- Two Additional Iterations are performed:
- First iteration focuses on observations yet to be decided. Consider as centroids all the observations yet to be decided (this may happen when we reached
num_iterations
of the previous loop). - Second iteration revisits bad observations, potentially reconsidering them as good based on new information. Consider as centroids all bad observations and use good observations only. Bad observations can be saved if they pass the final round.
- First iteration focuses on observations yet to be decided. Consider as centroids all the observations yet to be decided (this may happen when we reached
SCT-core algorithm
- Correlation Estimation:
- Use an analytical function to estimate error correlations at nearby locations.
- Determine horizontal and vertical de-correlation lengths using adaptive estimates and predefined scales.
kth_closest_obs_horizontal_scale
determines the horizontal de-correlation length as the average distance between locations in the outer circle and their k-th nearest observation. This value is confined within the predefinedmin_horizontal_scale
andmax_horizontal_scale
. The vertical de-correlation length is set byvertical_scale
.
- Loop Over Inner Circle Observations:
- Compute analysis and cross-validation analysis for each observation.
- Collect statistics of the variable chi = square.root[(observation - analysis) * (observation - cv-analysis)]. Only include values where the cross-validation analysis falls within the range of admissible values.
- Analysis:
- If in basic mode, use chi as the z-score.
- Otherwise, compute normalized deviations z = (chi - median( chi )) / interquartile_range(chi).
- If any p_test observation has a z value outside thresholds, flag the worst as bad.
- Otherwise, flag all as good.
Similar to spatial consistency test, but with some additional parameters.
Parameter | Type | Unit | Description |
---|---|---|---|
max_horizontal_scale | float | m | Maximum horizontal decorrelation length |
kth_closest_obs_horizontal_scale | int | Number of closest observations to consider in the adaptive estimation of the horizontal decorrelation length | |
value_mina | vec | ou | Minimum admissible value |
value_maxa | vec | ou | Maximum admissible value |
value_minv | vec | ou | Minimum valid value |
value_maxv | vec | ou | Maximum valid value |
ou = Unit of the observation
# R code
obs_to_check <- rep(1, npoints)
background_values <- rep(0, npoints)
background_elab_type <- "MedianOuterCircle"
num_min_outer <- 3
num_max_outer <- 10
inner_radius <- 20000
outer_radius <- 50000
num_iterations <- 10
num_min_prof <- 1
min_elev_diff <- 100
min_horizontal_scale <- 250
max_horizontal_scale <- 100000
kth_closest_obs_horizontal_scale <- 2
vertical_scale <- 200
tpos <- rep(1,N) * 16
tneg <- rep(1,N) * 16
eps2 <- rep(1,N) * 0.5
values_mina <- precip_obs - 20
values_maxa <- precip_obs + 20
values_minv <- precip_obs - 1
values_maxv <- precip_obs + 1
debug <- T
basic <- T
points <- Points(lats, lons, elevs)
res <- sct_resistant( points, precip_obs, obs_to_check, background_values,
background_elab_type, num_min_outer, num_max_outer,
inner_radius, outer_radius, num_iterations, num_min_prof,
min_elev_diff, min_horizontal_scale, max_horizontal_scale,
kth_closest_obs_horizontal_scale, vertical_scale,
values_mina, values_maxa, values_minv, values_maxv,
eps2, tpos, tneg, debug, basic)
# flags
print( res[[1]])
# scores
print( res[[2]])
Copyright © 2019-2023 Norwegian Meteorological Institute