
1. Package Motivation

Trent Henderson edited this page Apr 7, 2021 · 2 revisions

Highly comparative time-series analysis is a data-driven and largely field-agnostic approach to temporal data. Pioneered through Ben Fulcher's MATLAB toolbox hctsa and associated publications on feature-based time-series analysis (such as this paper and this paper), the approach has proven effective at separating signal from noise, classifying groups, and performing regression tasks. These performance gains likely occur for several reasons:

  • Working in feature space is generally much more computationally efficient than working in measurement space - enabling a more diverse range of algorithms and statistical models to be fit to the extracted features
  • Feature space can reveal dynamical and nonlinear relationships between statistical processes that may not be detectable in the measurement space - enabling a deeper and potentially more sophisticated understanding of the empirical structure of, and similarity between, time series
  • Dimension reduction techniques generalise well to the feature space - enabling methods such as Principal Components Analysis and t-SNE to reveal patterns across groups of features and to support effective data visualisation

Since MATLAB is proprietary software, a major barrier to broader adoption of highly comparative methods is the limited availability of tools. This package, catch22, aims to bridge some of this gap by providing convenience functions for users of R, a free and open-source language. These functions automatically calculate a subset of 22 features from the broader hctsa toolbox that have been shown to be effective and minimally redundant.
