BTtest

Are you analyzing a panel data set and want to determine whether the cross-sectional units share a linear trend as well as any $I(1)$ or $I(0)$ dynamics?

Conveniently test for the number and type of common factors in large nonstationary panels using the routine by Barigozzi & Trapani (2022).

Installation

Always stay up-to-date with the development version (0.10.3) of BTtest from GitHub with:

# install.packages('devtools')
devtools::install_github('Paul-Haimerl/BTtest')
library(BTtest)

The stable version (0.10.3) is available on CRAN:

install.packages('BTtest')

Data

The BTtest package includes a function that automatically simulates a panel with common nonstationary trends:

set.seed(1)
# Simulate a DGP containing one factor with a linear drift (r1 = 1, d1 = 1 -> drift = TRUE)
# that also follows an I(1) process (d2 = 1 -> drift_I1 = TRUE), one zero-mean I(1) factor
# (r2 = 1 -> r_I1 = 2, since drift_I1 = TRUE adds the drifting factor to the I(1) count)
# and two zero-mean I(0) factors (r3 = 2 -> r_I0 = 2)
X <- sim_DGP(N = 100, n_Periods = 200, drift = TRUE, drift_I1 = TRUE, r_I1 = 2, r_I0 = 2)

For specifics on the DGP, I refer to Barigozzi & Trapani (2022, sec. 5).
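To give a rough idea of what such a panel looks like, the following base R sketch builds a toy factor panel by hand. It is a simplified stand-in and not the DGP implemented in sim_DGP(): the drift of 0.5, the white-noise I(0) factors, the Gaussian loadings and all object names (X_toy, f_drift, ...) are assumptions made purely for illustration.

```r
# Illustrative stand-in only; this is NOT the DGP used by sim_DGP()
set.seed(1)
n_periods <- 200  # T
n_units <- 100    # N

# One I(1) factor with drift, one zero-mean I(1) factor, two I(0) factors
f_drift <- cumsum(0.5 + rnorm(n_periods))            # random walk with drift (0.5 is an assumed value)
f_i1 <- cumsum(rnorm(n_periods))                     # driftless random walk
f_i0 <- matrix(rnorm(n_periods * 2), n_periods, 2)   # white-noise factors
factors <- cbind(f_drift, f_i1, f_i0)                # T x 4

# Factor loadings and idiosyncratic noise (both standard normal by assumption)
loadings <- matrix(rnorm(n_units * ncol(factors)), n_units, ncol(factors))  # N x 4
noise <- matrix(rnorm(n_periods * n_units), n_periods, n_units)

# T x N panel: common component plus noise
X_toy <- factors %*% t(loadings) + noise
dim(X_toy)
#> [1] 200 100
```

The resulting matrix has time periods in rows and cross-sectional units in columns.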

The Barigozzi & Trapani (2022) test

To run the test, the user only needs to pass a $T \times N$ data matrix X and specify an upper limit on the number of factors (r_max), a significance level (alpha) and whether to use a less (BT1 = TRUE) or more conservative (BT1 = FALSE) eigenvalue scaling scheme:

BTresult <- BTtest(X = X, r_max = 10, alpha = 0.05, BT1 = TRUE)
print(BTresult)
#> r_1_hat r_2_hat r_3_hat 
#>       1       1       2

BT1 = TRUE tends to identify more factors than BT1 = FALSE. The differences between the two schemes quickly vanish when the panel includes more than 200 time periods (Barigozzi & Trapani, 2022, sec. 5; Trapani, 2018, sec. 3).
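One way to see the effect of the scaling scheme on a given panel is to run the test under both settings and put the results side by side. The sketch below uses only the calls shown in this README; the object names (res_bt1_true, res_bt1_false) are hypothetical, and the block is skipped when the package is not installed.

```r
# Runs only if BTtest is installed
if (requireNamespace("BTtest", quietly = TRUE)) {
  set.seed(1)
  X <- BTtest::sim_DGP(N = 100, n_Periods = 200, drift = TRUE, drift_I1 = TRUE, r_I1 = 2, r_I0 = 2)

  # Reset the seed before each call so both runs use the same random draws
  set.seed(1)
  res_bt1_true <- BTtest::BTtest(X = X, r_max = 10, alpha = 0.05, BT1 = TRUE)
  set.seed(1)
  res_bt1_false <- BTtest::BTtest(X = X, r_max = 10, alpha = 0.05, BT1 = FALSE)

  # One row per scaling scheme, one column per estimated factor count
  print(rbind(BT1_TRUE = res_bt1_true, BT1_FALSE = res_bt1_false))
}
```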

BTtest returns a vector indicating (i) the existence of a factor subject to a linear trend ($r_1$), (ii) the number of zero-mean $I(1)$ factors ($r_2$) and (iii) the number of zero-mean $I(0)$ factors ($r_3$). Note that only one factor with a linear trend can be identified.
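As a small worked example, the elements of the returned vector can be combined into summary counts. To stay runnable without the package, the sketch hard-codes the values from the output printed above; the names bt_hat, n_total, n_nonstationary and has_trend are hypothetical.

```r
# Values copied from the printed output above; in practice, use the object returned by BTtest()
bt_hat <- c(r_1_hat = 1, r_2_hat = 1, r_3_hat = 2)

# Total number of common factors
n_total <- sum(bt_hat)

# Nonstationary factors: the trending factor plus the zero-mean I(1) factors
n_nonstationary <- unname(bt_hat["r_1_hat"] + bt_hat["r_2_hat"])

# TRUE if a factor with a linear trend was detected
has_trend <- unname(bt_hat["r_1_hat"] == 1)

n_total
#> [1] 4
n_nonstationary
#> [1] 2
```

For the simulated panel, these counts match the DGP described in the Data section: one trending factor, one zero-mean I(1) factor and two zero-mean I(0) factors.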

The test statistic is constructed from R draws of an i.i.d. standard normal random variable. Consequently, the test results are nondeterministic and may vary slightly between executions, particularly when R is small. In practical applications, this randomness can be eliminated by setting a random seed with set.seed() before invoking BTtest().
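A minimal sketch of this reproducibility check, using only the calls shown in this README, could look as follows. The seed value 123 and the object names (run_a, run_b) are arbitrary choices for illustration, and the block is skipped when the package is not installed.

```r
# Runs only if BTtest is installed
if (requireNamespace("BTtest", quietly = TRUE)) {
  set.seed(1)
  X <- BTtest::sim_DGP(N = 100, n_Periods = 200, drift = TRUE, drift_I1 = TRUE, r_I1 = 2, r_I0 = 2)

  # Fix the seed immediately before each call to the test
  set.seed(123)
  run_a <- BTtest::BTtest(X = X, r_max = 10, alpha = 0.05, BT1 = TRUE)
  set.seed(123)
  run_b <- BTtest::BTtest(X = X, r_max = 10, alpha = 0.05, BT1 = TRUE)

  # Check whether both runs give the same estimates
  print(identical(run_a, run_b))
}
```

Calling set.seed() once at the top of a script serves the same purpose as long as the script is always executed in the same order.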

The Bai (2004) Integrated Information Criterion

An alternative way of estimating the total number of factors in a nonstationary panel is to use the Integrated Information Criteria by Bai (2004). The package also contains a function to easily evaluate this measure:

IPCresult <- BaiIPC(X = X, r_max = 10)
print(IPCresult)
#> IPC_1 IPC_2 IPC_3 
#>     2     2     2

References

  • Bai, J. (2004). Estimating cross-section common stochastic trends in nonstationary panel data. Journal of Econometrics, 122(1), 137-183. DOI: 10.1016/j.jeconom.2003.10.022
  • Barigozzi, M., & Trapani, L. (2022). Testing for common trends in nonstationary large datasets. Journal of Business & Economic Statistics, 40(3), 1107-1122. DOI: 10.1080/07350015.2021.1901719
  • Trapani, L. (2018). A randomized sequential procedure to determine the number of factors. Journal of the American Statistical Association, 113(523), 1341-1349. DOI: 10.1080/01621459.2017.1328359