An R implementation of a new covariates selection method in Dynamic Regression Models.

anaezquerro/dynamic-arimax

Welcome to the dynamic-arimax repository!

This is an R implementation of a new covariate selection method for Dynamic Regression Models. This proposal was published in:

The PDF files of both proceedings are included in this repository, in the proceedings/ folder. We recommend reading at least one of them before using this code, in order to understand the mathematics behind our method.

If you have a suggestion or find an issue with our code, please contact us so that we can fix it and improve our implementation.

Table of Contents

  1. Installation
  2. Structure of the module
  3. Documentation and examples
  4. Simulation results
  5. Acknowledgements

Installation

In order to use this implementation and run all files, the following prerequisites are needed:

  • R
  • RStudio IDE
  • rtools

Once R, the RStudio IDE and rtools have been installed, you can run installation.R to automatically install all required R packages, and test-installation.R to check that all libraries were installed correctly.
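As a rough illustration of what such an installation check can look like, the sketch below reports which packages from a list are not installed. The helper name `missing_packages` and the package list are assumptions made for this example; the actual contents of test-installation.R may differ.

```r
# Hypothetical helper (not taken from test-installation.R):
# returns the names of the packages that are not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Assumed package list, based on the tools this README mentions
# (forecast::auto.arima(), adf.test, Plotly); the real list may be longer.
required <- c("forecast", "tseries", "plotly")
not_found <- missing_packages(required)

if (length(not_found) == 0) {
  message("All packages are installed.")
} else {
  message("Missing packages: ", paste(not_found, collapse = ", "))
}
```

Using `requireNamespace()` checks availability without attaching the packages, so the check does not alter the search path of the session.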

Structure of the module

  • auto-fit.R: Automatic fitting of ARIMA or ARIMAX models. It uses the forecast::auto.arima() function and iteratively removes non-significant coefficients.
  • auto-select.R: Implementation of the method that selects the covariates and their respective correlation lags. For more information, we suggest reading the proceedings attached to the repository.
  • forecasting.R: Forecasting with dynamic regression models, once a model has been fitted with the selection function.
  • plot-tools.R: Script for fancy Plotly graphics.
  • documentation/: Folder where documentation examples are provided.
  • data/: Datasets needed to run examples in EXAMPLES.md.
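To give an idea of what "iteratively removing non-significant coefficients" can mean in practice, here is a minimal sketch using only base R. It is not the code in auto-fit.R: the function names `coef_pvalues` and `prune_arima` are hypothetical, the model order is given by hand instead of being chosen by forecast::auto.arima(), and significance is judged with a simple Wald test at an assumed level of 0.05.

```r
# Wald-type p-values for the free (non-fixed) coefficients of an arima() fit.
coef_pvalues <- function(fit) {
  if (!any(fit$mask)) return(numeric(0))   # every coefficient is fixed
  est <- fit$coef[fit$mask]
  se <- sqrt(diag(fit$var.coef))
  p <- 2 * pnorm(-abs(est / se))
  names(p) <- names(est)
  p
}

# Refit the model, fixing the least significant coefficient to zero,
# until all remaining coefficients are significant at level alpha.
prune_arima <- function(y, order, alpha = 0.05) {
  # AR + MA coefficients, plus an intercept when no differencing is applied.
  n_coef <- order[1] + order[3] + as.integer(order[2] == 0)
  fixed <- rep(NA_real_, n_coef)           # NA = estimate, 0 = removed
  repeat {
    fit <- arima(y, order = order, fixed = fixed,
                 transform.pars = FALSE, method = "ML")
    p <- coef_pvalues(fit)
    if (length(p) == 0 || all(p <= alpha, na.rm = TRUE)) return(fit)
    worst <- names(p)[which.max(p)]
    fixed[names(fit$coef) == worst] <- 0
  }
}

# Usage on a simulated AR(1) series, deliberately over-specified as ARMA(1,1).
set.seed(123)
y <- arima.sim(model = list(ar = 0.7), n = 300)
fit <- prune_arima(y, order = c(1, 0, 1))
print(fit)
```

Fixing a coefficient to zero through the `fixed` argument keeps the model order unchanged, which makes it easy to drop one term at a time.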

Documentation and examples

Detailed documentation of the code, with usage examples, is available in the Documentation.html file.

Simulation results

The following tables report metrics obtained by simulating $M=100$ scenarios in which a dependent variable was artificially constructed from randomly generated time series (each of which can be modeled by an ARIMA model). Specifically, in each scenario:

  1. Seven time series were randomly generated. Six of them were used as the set of covariates, $\mathcal{X} = \{X_t^{(1)}, \dots, X_t^{(6)}\}$, and the remaining one as the residuals of the model, $\eta_t$.
  2. Random lags $r_i \in [0, 6]$, for $i = 1, \dots, 6$, were selected for each covariate, as well as regression coefficients $\beta_0, \dots, \beta_3$.
  3. The dependent variable $Y_t$ was constructed from the first three covariates via the DR model formula:

$$ Y_t = \beta_0 + \beta_1 X_{t-r_1}^{(1)} + \beta_2 X_{t-r_2}^{(2)} + \beta_3 X_{t-r_3}^{(3)} + \eta_t$$
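One scenario of this kind can be sketched in base R as follows. The series length, the ARMA(1,1) form of the generated series and the ranges of the coefficients are assumptions made for this example; the exact simulation settings are described in the proceedings.

```r
set.seed(42)
n <- 200        # assumed length of the dependent variable
max_lag <- 6    # lags r_i are drawn from 0..6

# Assumed generator: a stationary ARMA(1,1) series with random coefficients.
simulate_series <- function(n) {
  as.numeric(arima.sim(model = list(ar = runif(1, -0.8, 0.8),
                                    ma = runif(1, -0.8, 0.8)), n = n))
}

# Six covariates (with max_lag extra observations for lagging) and eta_t.
X <- replicate(6, simulate_series(n + max_lag))   # (n + max_lag) x 6 matrix
eta <- simulate_series(n)

lags <- sample(0:max_lag, 6, replace = TRUE)      # r_1, ..., r_6
beta <- runif(4, -2, 2)                           # beta_0, ..., beta_3

# Element t of the result holds x at time t - r, for t = 1, ..., n.
lagged <- function(x, r) x[(max_lag - r + 1):(max_lag - r + n)]

# Y_t = beta_0 + sum_{i=1}^{3} beta_i * X_{t - r_i}^{(i)} + eta_t
Y <- beta[1] + eta
for (i in 1:3) Y <- Y + beta[i + 1] * lagged(X[, i], lags[i])
```

Covariates 4 to 6 never enter `Y`, so a correct selection procedure should add the first three covariates and leave the other three out.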

We tested our selection method with different configurations:

  • With different stationarity tests (via the auto.arima() function or the Dickey-Fuller test).
  • With different information criteria (AIC, BIC or AICc).
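For reference, the three information criteria can be computed from the log-likelihood of a fitted model. The sketch below does this for a base R `arima()` fit; the helper name `info_criteria` is hypothetical and the formulas are the standard textbook definitions, not code taken from this repository.

```r
# Hypothetical helper: AIC, BIC and AICc of a model fitted with stats::arima().
info_criteria <- function(fit) {
  ll <- fit$loglik
  k <- sum(fit$mask) + 1    # free coefficients + innovation variance
  n <- fit$nobs             # number of observations used in the fit
  aic <- -2 * ll + 2 * k
  c(AIC  = aic,
    BIC  = -2 * ll + log(n) * k,
    AICc = aic + 2 * k * (k + 1) / (n - k - 1))
}

# Usage on a simulated AR(1) series.
set.seed(1)
y <- arima.sim(model = list(ar = 0.5), n = 150)
fit <- arima(y, order = c(1, 0, 0))
ic <- info_criteria(fit)
print(ic)
```

BIC penalizes each extra parameter more heavily than AIC once $\log(n) > 2$, and AICc adds a small-sample correction that vanishes as $n$ grows.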

For more detailed information about the simulation procedure, please read the proceedings included in the repository.

Results when $\eta_t \sim \text{ARMA}(p,q)$ is stationary

  • Percentage of correctly added covariates to the model (true positive):

| Stationarity test | AIC | BIC | AICc |
|---|---|---|---|
| adf.test | 97.66% | 97.66% | 97.66% |
| auto.arima | 98.33% | 98.33% | 98.33% |

  • Percentage of incorrectly added covariates to the model (false positive):

| Stationarity test | AIC | BIC | AICc |
|---|---|---|---|
| adf.test | 3.66% | 1.33% | 3.66% |
| auto.arima | 3.66% | 1.33% | 3.66% |

  • Percentage of correctly not added covariates to the model (true negative):

| Stationarity test | AIC | BIC | AICc |
|---|---|---|---|
| adf.test | 96.33% | 98.66% | 96.33% |
| auto.arima | 96.33% | 98.66% | 96.33% |

  • Percentage of incorrectly not added covariates to the model (false negative):

| Stationarity test | AIC | BIC | AICc |
|---|---|---|---|
| adf.test | 2.33% | 2.33% | 2.33% |
| auto.arima | 1.66% | 1.66% | 1.66% |

Results when $\eta_t \sim \text{ARIMA}(p,d,q)$ is non-stationary

  • Percentage of correctly added covariates to the model (true positive):

| Stationarity test | AIC | BIC | AICc |
|---|---|---|---|
| adf.test | 93.33% | 93.33% | 93.33% |
| auto.arima | 94.33% | 94.66% | 95.33% |

  • Percentage of incorrectly added covariates to the model (false positive):

| Stationarity test | AIC | BIC | AICc |
|---|---|---|---|
| adf.test | 4.33% | 0.30% | 4.33% |
| auto.arima | 5.00% | 1.33% | 5.00% |

  • Percentage of correctly not added covariates to the model (true negative):

| Stationarity test | AIC | BIC | AICc |
|---|---|---|---|
| adf.test | 95.00% | 98.66% | 95.00% |
| auto.arima | 94.66% | 99.66% | 95.66% |

  • Percentage of incorrectly not added covariates to the model (false negative):

| Stationarity test | AIC | BIC | AICc |
|---|---|---|---|
| adf.test | 6.66% | 6.66% | 6.66% |
| auto.arima | 4.66% | 5.33% | 4.66% |
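The four percentages reported in these tables can be computed from the selection outcomes of the simulated scenarios. The sketch below is one possible way to do it, assuming that true positives and false negatives are normalized by the number of covariates that really belong to the model, and false positives and true negatives by the number that do not. The function name `selection_rates` and the toy data are hypothetical.

```r
# selected, truth: logical matrices with one row per scenario and one column
# per covariate. truth marks the covariates that really enter the model.
selection_rates <- function(selected, truth) {
  100 * c(true_positive  = mean(selected[truth]),
          false_negative = mean(!selected[truth]),
          false_positive = mean(selected[!truth]),
          true_negative  = mean(!selected[!truth]))
}

# Toy example with 2 scenarios and 6 covariates (the first 3 are relevant).
truth <- matrix(rep(c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE), each = 2),
                nrow = 2)
selected <- truth
selected[1, 3] <- FALSE   # scenario 1 misses a relevant covariate
selected[2, 5] <- TRUE    # scenario 2 adds an irrelevant covariate

rates <- selection_rates(selected, truth)
print(round(rates, 2))
```

With this normalization, the true positive and false negative rates add up to 100%, and so do the false positive and true negative rates.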

Acknowledgements

We thank Banco Santander for the scholarships awarded in 2021/2022, which supported the research behind this proposal, and the MODES research group of the University of A Coruña.
