Performance evaluation for forecasting modeling with spatiotemporal structures in data

Yujun Zhou and Kathy Baylis

Abstract The growth in available geospatial data, along with the rise of machine learning methods, has led to numerous spatiotemporal forecasting applications that address real-world problems such as deforestation, pollution, and food security. Choosing the right performance evaluation procedure matters for generating accurate and trustworthy out-of-sample predictions. However, when observations in the training and testing data are spatially and temporally dependent, the assumption that the testing set is independent of the training set is violated. As a result, model performance evaluated using cross-validation (CV) and out-of-sample (OOS) testing can be over-optimistic. In this study, we show how CV and OOS performance change when we adjust for different types of spatiotemporal correlation in both simulated data and real-world panel data. We also show how the performance evaluation process affects model selection, leading it to prefer overfitted models. Lastly, we propose and compare solutions such as blocking and clustering to improve performance evaluation procedures in both simulated and real-world data with spatiotemporal structures.

JEL classifications: C33, C52, C53

Keywords: Machine-learning, Spatial autocorrelation, Spatial cross-validation
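
To make the blocking idea in the abstract concrete, here is a minimal sketch of spatially blocked fold assignment. It is an illustration only, not the authors' implementation: the function names (`spatial_block_folds`, `train_test_indices`), the square-grid definition of a block, and the synthetic coordinates are all assumptions made for this example. The point it shows is that observations in the same spatial block always land in the same fold, so nearby (and likely correlated) observations are never split between training and testing.

```python
import random


def spatial_block_folds(coords, block_size, n_folds, seed=0):
    """Assign each observation to a CV fold, keeping whole spatial blocks together.

    coords: list of (x, y) locations, one per observation (hypothetical input).
    block_size: side length of the square grid cells used as blocks (assumption).
    Returns (folds, blocks): the fold index and the block id of each observation.
    """
    # Map each observation to the grid cell (block) that contains it.
    blocks = [(int(x // block_size), int(y // block_size)) for x, y in coords]
    # Shuffle the distinct blocks, then deal them out to folds round-robin.
    unique_blocks = sorted(set(blocks))
    random.Random(seed).shuffle(unique_blocks)
    block_to_fold = {b: i % n_folds for i, b in enumerate(unique_blocks)}
    folds = [block_to_fold[b] for b in blocks]
    return folds, blocks


def train_test_indices(folds, test_fold):
    """Split observation indices into training and testing sets for one fold."""
    train = [i for i, f in enumerate(folds) if f != test_fold]
    test = [i for i, f in enumerate(folds) if f == test_fold]
    return train, test


# Synthetic example: 200 random locations on a 10 x 10 square.
rng = random.Random(42)
coords = [(rng.uniform(0, 10), rng.uniform(0, 10)) for _ in range(200)]
folds, blocks = spatial_block_folds(coords, block_size=2.5, n_folds=4)
train_idx, test_idx = train_test_indices(folds, test_fold=0)
```

A random (non-blocked) split would instead scatter neighboring observations across training and testing sets, which is the setting in which the abstract argues CV performance can look over-optimistic. For panel data, the same idea could be extended by also holding out contiguous time periods.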
