catch22: CAnonical Time-series CHaracteristics

Feature Overview Table

A high-level summary table of all catch22 features.


Last updated 11 months ago

Note: All catch22 features are statistical properties of the z-scored time series - they aim to focus on the properties of the time-ordering of the data and are insensitive to the raw values in the time series.
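To illustrate the note above, here is a minimal sketch (not the catch22 implementation) of why a statistic computed on the z-scored series is insensitive to shifting and rescaling the raw values. The helper names `zscore` and `lag1_autocorr` are hypothetical, and the use of the population standard deviation for normalisation is an assumption made for this example.

```python
import math

def zscore(x):
    """Return a z-scored copy of x (zero mean, unit variance)."""
    n = len(x)
    mean = sum(x) / n
    # Population standard deviation (an assumption made for this sketch).
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / n)
    if std == 0:
        raise ValueError("a constant series cannot be z-scored")
    return [(v - mean) / std for v in x]

def lag1_autocorr(z):
    """Lag-1 autocorrelation of an already z-scored series (illustrative statistic)."""
    n = len(z)
    return sum(z[i] * z[i + 1] for i in range(n - 1)) / n

raw = [2.0, 4.0, 3.0, 7.0, 6.0, 9.0, 8.0, 12.0]
rescaled = [1000.0 + 50.0 * v for v in raw]  # same ordering, different units

a = lag1_autocorr(zscore(raw))
b = lag1_autocorr(zscore(rescaled))
# The two values agree up to floating-point rounding.
print(math.isclose(a, b, rel_tol=1e-9, abs_tol=1e-12))
```

The statistic depends only on the time-ordering of the values, so changing the units or offset of the raw series leaves it unchanged.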

In the following table, we give the original feature name (from the Lubba et al. (2019) paper) and a shorter name more suitable for use in feature descriptions. Features are also (loosely) categorised into broader conceptual groupings.

Features are listed in the order in which they (and their associated values) are returned by catch22; e.g., the first feature returned (#1) is always DN_HistogramMode_5.

| # | Feature name | Short name | Category | Description |
|---|---|---|---|---|
| 1 | DN_HistogramMode_5 | mode_5 | Distribution shape | 5-bin histogram mode |
| 2 | DN_HistogramMode_10 | mode_10 | Distribution shape | 10-bin histogram mode |
| 3 | DN_OutlierInclude_p_001_mdrmd | outlier_timing_pos | Extreme event timing | Positive outlier timing |
| 4 | DN_OutlierInclude_n_001_mdrmd | outlier_timing_neg | Extreme event timing | Negative outlier timing |
| 5 | first1e_acf_tau | acf_timescale | Linear autocorrelation | First 1/e crossing of the ACF |
| 6 | firstMin_acf | acf_first_min | Linear autocorrelation | First minimum of the ACF |
| 7 | SP_Summaries_welch_rect_area_5_1 | low_freq_power | Linear autocorrelation | Power in lowest 20% frequencies |
| 8 | SP_Summaries_welch_rect_centroid | centroid_freq | Linear autocorrelation | Centroid frequency |
| 9 | FC_LocalSimple_mean3_stderr | forecast_error | Simple forecasting | Error of 3-point rolling mean forecast |
| 10 | FC_LocalSimple_mean1_tauresrat | whiten_timescale | Incremental differences | Change in autocorrelation timescale after incremental differencing |
| 11 | MD_hrv_classic_pnn40 | high_fluctuation | Incremental differences | Proportion of high incremental changes in the series |
| 12 | SB_BinaryStats_mean_longstretch1 | stretch_high | Symbolic | Longest stretch of above-mean values |
| 13 | SB_BinaryStats_diff_longstretch0 | stretch_decreasing | Symbolic | Longest stretch of decreasing values |
| 14 | SB_MotifThree_quantile_hh | entropy_pairs | Symbolic | Entropy of successive pairs in symbolized series |
| 15 | CO_HistogramAMI_even_2_5 | ami2 | Nonlinear autocorrelation | Histogram-based automutual information (lag 2, 5 bins) |
| 16 | CO_trev_1_num | trev | Nonlinear autocorrelation | Time reversibility |
| 17 | IN_AutoMutualInfoStats_40_gaussian_fmmi | ami_timescale | Linear autocorrelation structure | First minimum of the AMI function |
| 18 | SB_TransitionMatrix_3ac_sumdiagcov | transition_variance | Symbolic | Transition matrix column variance |
| 19 | PD_PeriodicityWang_th001 | periodicity | Linear autocorrelation structure | Wang's periodicity metric |
| 20 | CO_Embed2_Dist_tau_d_expfit_meandiff | embedding_dist | Other | Goodness of exponential fit to embedding distance distribution |
| 21 | SC_FluctAnal_2_rsrangefit_50_1_logi_prop_r1 | rs_range | Self-affine scaling | Rescaled range fluctuation analysis (low-scale scaling) |
| 22 | SC_FluctAnal_2_dfa_50_1_2_logi_prop_r1 | dfa | Self-affine scaling | Detrended fluctuation analysis (low-scale scaling) |
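Because the features are listed in a fixed order, a plain vector of 22 values can be labelled by position. The sketch below hard-codes the short names from the table above; `label_features` is a hypothetical helper rather than part of any catch22 package, and the example values are placeholders, not real feature outputs.

```python
# Short names in the order given in the table above (#1 to #22).
SHORT_NAMES = [
    "mode_5", "mode_10", "outlier_timing_pos", "outlier_timing_neg",
    "acf_timescale", "acf_first_min", "low_freq_power", "centroid_freq",
    "forecast_error", "whiten_timescale", "high_fluctuation", "stretch_high",
    "stretch_decreasing", "entropy_pairs", "ami2", "trev",
    "ami_timescale", "transition_variance", "periodicity", "embedding_dist",
    "rs_range", "dfa",
]

def label_features(values, names=SHORT_NAMES):
    """Pair a positional vector of feature values with the table's short names."""
    if len(values) != len(names):
        raise ValueError(f"expected {len(names)} values, got {len(values)}")
    return dict(zip(names, values))

# Placeholder values (0.0, 1.0, ...), used only to show the positional mapping.
placeholder = [float(i) for i in range(22)]
labelled = label_features(placeholder)
print(labelled["mode_5"])  # value at position #1
print(labelled["dfa"])     # value at position #22
```

Checking the vector length up front guards against silently mislabelling values, for example when a 24-feature output is passed by mistake.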


Catch24 (Catch22 + Mean + Std)

In some cases, where the location and spread of the raw time-series values may be relevant to class differences, two simple distributional moment features, the mean and standard deviation, can be added (using the catch24 flag in the software implementations). This results in 24 features being calculated: the 22 catch22 features plus the mean and standard deviation.

| # | Feature name | Short name | Description |
|---|---|---|---|
| 23 | DN_Mean | mean | Mean |
| 24 | DN_Spread_Std | std | Standard deviation |
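As a rough illustration of the two extra catch24 features, the sketch below computes the mean and standard deviation of the raw (not z-scored) series. Whether the software implementations normalise the standard deviation by n - 1 (sample) or n (population) is not stated here; the sample form is assumed, and `mean_and_std` is a hypothetical helper.

```python
import statistics

def mean_and_std(x):
    """Return (mean, std) of a raw series, assuming the sample (n - 1) standard deviation."""
    return statistics.fmean(x), statistics.stdev(x)

raw = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mean, std = mean_and_std(raw)
print(mean)  # 5.0
print(std)   # square root of 32/7 under the sample-std assumption
```

These two values carry exactly the information that z-scoring discards, which is why they are only informative when computed on the raw series.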


Feature Dependencies

Although the selection framework used to generate the catch22 feature set included a step to reduce redundancy, it was not designed to generate an independent set of features.

Pairwise Spearman Correlations

Below is an example of the generic non-independence of features. We have plotted the Spearman correlation coefficient between all pairs of features, quantifying the similarity of their outputs across a diverse range of 1000 empirical time series:

  • We find a large cluster of features sensitive to the autocorrelation of a time series.

  • We also find a small cluster of two highly correlated features, DN_HistogramMode_5 and DN_HistogramMode_10, which measure the mode of the z-scored time-series distribution using different numbers of bins.

This dependency structure should be kept in mind when interpreting the results of catch22 analyses: Does your dataset exhibit any of these generic dependencies, or some unique dependencies?

PC Loadings

Below is a similar plot, but with colour overlaid according to the weights onto the first three principal components:

Broadly,

  • The first two principal components capture different aspects of the autocorrelation structure.

  • The third principal component captures different aspects of the distribution symmetry.
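The pairwise Spearman correlations discussed under Feature Dependencies can be reproduced on your own data. The following is a minimal pure-Python sketch, not the code used to produce the plots: `average_ranks`, `pearson`, and `spearman` are hypothetical helpers, and the two feature vectors are made-up values standing in for the outputs of two features across six time series.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)

def spearman(a, b):
    """Spearman correlation: Pearson correlation of the rank-transformed inputs."""
    return pearson(average_ranks(a), average_ranks(b))

# Made-up outputs of two features across six time series (illustration only).
feature_a = [0.1, 0.4, 0.35, 0.8, 0.9, 0.6]
feature_b = [1.0, 2.5, 2.0, 7.0, 9.5, 4.0]  # monotonically related to feature_a
print(spearman(feature_a, feature_b))  # 1.0 for a perfectly monotonic relationship
```

Applying `spearman` to every pair of feature columns in a (time series × features) matrix gives a correlation matrix that can be compared with the generic dependencies described above.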