Feature Overview Table
A high-level summary table of all catch22 features.
Note: All catch22 features are statistical properties of the z-scored time series. They focus on properties of the time-ordering of the data and are insensitive to the raw values in the time series.
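The effect of z-scoring can be illustrated with a minimal sketch. It is not taken from the catch22 code: the helper `zscore` and the example values are hypothetical, and the use of the sample standard deviation is an assumption. The sketch shows that two series differing only by a shift and a positive rescaling become identical after z-scoring, which is why features computed on the z-scored series cannot distinguish them.

```python
import math

def zscore(x):
    """Return a z-scored copy of x (zero mean, unit standard deviation)."""
    n = len(x)
    mean = sum(x) / n
    # Sample standard deviation (n - 1 denominator); an assumption of this sketch.
    std = math.sqrt(sum((v - mean) ** 2 for v in x) / (n - 1))
    return [(v - mean) / std for v in x]

raw = [2.0, 4.0, 3.0, 7.0, 5.0, 6.0]
# Same time-ordering, different units: rescaled by 10 and shifted by 100.
rescaled = [10.0 * v + 100.0 for v in raw]

z_raw = zscore(raw)
z_rescaled = zscore(rescaled)

# The two z-scored series agree up to floating-point error.
same = all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(z_raw, z_rescaled))
print(same)
```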
In the following table, we give the original feature name (from the Lubba et al. (2019) paper), and a shorter name more suitable for use in feature descriptions. Features are also (loosely) categorised into broader conceptual groupings.
Each feature is listed in the order in which it, along with its associated value, is returned by catch22. For example, the first feature returned (#1) is always DN_HistogramMode_5.
| # | Feature name | Short name | Description |
| --- | --- | --- | --- |
| 1 | DN_HistogramMode_5 | mode_5 | 5-bin histogram mode |
| 2 | DN_HistogramMode_10 | mode_10 | 10-bin histogram mode |
| 3 | DN_OutlierInclude_p_001_mdrmd | outlier_timing_pos | Positive outlier timing |
| 4 | DN_OutlierInclude_n_001_mdrmd | outlier_timing_neg | Negative outlier timing |
| 5 | first1e_acf_tau | acf_timescale | |
| 6 | firstMin_acf | acf_first_min | First minimum of the ACF |
| 7 | SP_Summaries_welch_rect_area_5_1 | low_freq_power | Power in lowest 20% frequencies |
| 8 | SP_Summaries_welch_rect_centroid | centroid_freq | Centroid frequency |
| 9 | FC_LocalSimple_mean3_stderr | forecast_error | Error of 3-point rolling mean forecast |
| 10 | FC_LocalSimple_mean1_tauresrat | whiten_timescale | Change in autocorrelation timescale after incremental differencing |
| 11 | MD_hrv_classic_pnn40 | high_fluctuation | Proportion of high incremental changes in the series |
| 12 | SB_BinaryStats_mean_longstretch1 | stretch_high | Longest stretch of above-mean values |
| 13 | SB_BinaryStats_diff_longstretch0 | stretch_decreasing | Longest stretch of decreasing values |
| 14 | SB_MotifThree_quantile_hh | entropy_pairs | Entropy of successive pairs in symbolized series |
| 15 | CO_HistogramAMI_even_2_5 | ami2 | Histogram-based automutual information (lag 2, 5 bins) |
| 16 | CO_trev_1_num | trev | Time reversibility |
| 17 | IN_AutoMutualInfoStats_40_gaussian_fmmi | ami_timescale | First minimum of the AMI function |
| 18 | SB_TransitionMatrix_3ac_sumdiagcov | transition_variance | Transition matrix column variance |
| 19 | PD_PeriodicityWang_th001 | periodicity | Wang's periodicity metric |
| 20 | CO_Embed2_Dist_tau_d_expfit_meandiff | embedding_dist | Goodness of exponential fit to embedding distance distribution |
| 21 | SC_FluctAnal_2_rsrangefit_50_1_logi_prop_r1 | rs_range | Rescaled range fluctuation analysis (low-scale scaling) |
| 22 | SC_FluctAnal_2_dfa_50_1_2_logi_prop_r1 | dfa | Detrended fluctuation analysis (low-scale scaling) |
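Because the features are always returned in this fixed order, a vector of 22 values can be labelled by position. The following is a minimal sketch of that idea: the names are copied from the table, while the helper `label_features` and the placeholder values are hypothetical and do not come from any catch22 implementation.

```python
# Full and short feature names in return order, copied from the table.
FEATURE_NAMES = [
    "DN_HistogramMode_5", "DN_HistogramMode_10",
    "DN_OutlierInclude_p_001_mdrmd", "DN_OutlierInclude_n_001_mdrmd",
    "first1e_acf_tau", "firstMin_acf",
    "SP_Summaries_welch_rect_area_5_1", "SP_Summaries_welch_rect_centroid",
    "FC_LocalSimple_mean3_stderr", "FC_LocalSimple_mean1_tauresrat",
    "MD_hrv_classic_pnn40", "SB_BinaryStats_mean_longstretch1",
    "SB_BinaryStats_diff_longstretch0", "SB_MotifThree_quantile_hh",
    "CO_HistogramAMI_even_2_5", "CO_trev_1_num",
    "IN_AutoMutualInfoStats_40_gaussian_fmmi",
    "SB_TransitionMatrix_3ac_sumdiagcov", "PD_PeriodicityWang_th001",
    "CO_Embed2_Dist_tau_d_expfit_meandiff",
    "SC_FluctAnal_2_rsrangefit_50_1_logi_prop_r1",
    "SC_FluctAnal_2_dfa_50_1_2_logi_prop_r1",
]
SHORT_NAMES = [
    "mode_5", "mode_10", "outlier_timing_pos", "outlier_timing_neg",
    "acf_timescale", "acf_first_min", "low_freq_power", "centroid_freq",
    "forecast_error", "whiten_timescale", "high_fluctuation", "stretch_high",
    "stretch_decreasing", "entropy_pairs", "ami2", "trev", "ami_timescale",
    "transition_variance", "periodicity", "embedding_dist", "rs_range", "dfa",
]

def label_features(values, short=False):
    """Pair 22 feature values with their names, relying on the fixed order."""
    names = SHORT_NAMES if short else FEATURE_NAMES
    if len(values) != len(names):
        raise ValueError(f"expected {len(names)} values, got {len(values)}")
    return dict(zip(names, values))

# Placeholder values standing in for real feature outputs.
placeholder = [float(i) for i in range(1, 23)]
labelled = label_features(placeholder, short=True)
print(labelled["mode_5"], labelled["dfa"])
```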
Catch24 (Catch22 + Mean + Std)
In some cases, where the scale and spread of the raw time-series values may be relevant to class differences, two simple distributional moment features can be added (using the catch24 flag in the software implementations). This results in 24 features being calculated: the 22 catch22 features plus the mean and standard deviation.
| # | Feature name | Short name | Description |
| --- | --- | --- | --- |
| 23 | DN_Mean | mean | Mean |
| 24 | DN_Spread_Std | std | Standard deviation |
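The two extra features only carry information when computed on the raw values, since a z-scored series has mean 0 and standard deviation 1 by construction. The sketch below illustrates the resulting 24-value layout. The helper `append_moments` and all values are hypothetical, and the choice of the sample standard deviation is an assumption rather than a statement about the software implementations.

```python
import statistics

def append_moments(catch22_values, raw_series):
    """Return 24 values: the 22 given values, then mean and std of the raw series."""
    if len(catch22_values) != 22:
        raise ValueError("expected 22 catch22 values")
    mean = statistics.fmean(raw_series)
    # Sample standard deviation (n - 1 denominator); assumed, not confirmed here.
    std = statistics.stdev(raw_series)
    return list(catch22_values) + [mean, std]

# Placeholder values standing in for real catch22 outputs.
placeholder_values = [0.0] * 22
raw_series = [101.0, 102.0, 103.0, 104.0, 105.0]

catch24_values = append_moments(placeholder_values, raw_series)
print(len(catch24_values), catch24_values[-2], catch24_values[-1])
```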
Feature Dependencies
Although the selection framework used to generate the catch22 feature set included a step to reduce redundancy, it was not designed to generate an independent set of features.
Pairwise Spearman Correlations
Below is an example of the generic non-independence of features. We have plotted the Spearman correlation coefficient between all pairs of features, quantifying the similarity of their outputs across a diverse range of 1000 empirical time series:
We find a large cluster of features sensitive to the autocorrelation of a time series.
We also find a small cluster of two highly correlated features, DN_HistogramMode_5 and DN_HistogramMode_10, which measure the mode of the z-scored time-series distribution using different numbers of bins.
This dependency structure should be kept in mind when interpreting the results of catch22 analyses: does your dataset exhibit any of these generic dependencies, or some unique dependencies of its own?
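To check the dependencies in your own dataset, a pairwise Spearman correlation matrix can be computed from a table of feature outputs (one row per time series, one column per feature). The following is a minimal sketch using NumPy on synthetic data: the helpers `rank` and `spearman_matrix` are hypothetical, the three columns merely stand in for feature outputs, and ties are broken by position rather than averaged, which is only adequate for continuous-valued features.

```python
import numpy as np

def rank(column):
    """Ranks 0..n-1 of a 1-D array (ties broken by position, not averaged)."""
    order = np.argsort(column)
    ranks = np.empty(len(column), dtype=float)
    ranks[order] = np.arange(len(column))
    return ranks

def spearman_matrix(features):
    """features: (n_series, n_features) array -> (n_features, n_features) matrix."""
    ranked = np.column_stack(
        [rank(features[:, j]) for j in range(features.shape[1])]
    )
    # Spearman correlation is the Pearson correlation of the ranks.
    return np.corrcoef(ranked, rowvar=False)

rng = np.random.default_rng(0)
base = rng.normal(size=1000)
features = np.column_stack([
    base,                    # synthetic feature A
    np.exp(base),            # monotonic transform of A: rank order is preserved
    rng.normal(size=1000),   # unrelated synthetic feature
])

rho = spearman_matrix(features)
print(np.round(rho, 2))
```

Columns that are monotonic functions of one another, like the first two here, have a Spearman correlation of 1 even though their raw values differ, which is why a rank-based measure suits features on very different scales.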
PC Loadings
Below is a similar plot, but with colour overlaid according to the weights onto the first three principal components:
Broadly,
The first two principal components capture different aspects of the autocorrelation structure.
The third principal component captures different aspects of the distribution symmetry.
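Principal component weights of this kind can be obtained from a feature matrix with a singular value decomposition. The sketch below uses NumPy on synthetic data. The helper `pca_loadings` is hypothetical, and standardising each feature column first is an assumption, since the preprocessing behind the plot is not described here.

```python
import numpy as np

def pca_loadings(features, n_components=3):
    """Return (loadings, explained) for the leading principal components.

    loadings has shape (n_components, n_features): one weight per feature.
    explained holds the fraction of total variance captured by each component.
    """
    # Standardise each feature column so that no feature dominates by scale.
    centred = features - features.mean(axis=0)
    scaled = centred / centred.std(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(scaled, full_matrices=False)
    explained = singular_values**2 / np.sum(singular_values**2)
    return vt[:n_components], explained[:n_components]

rng = np.random.default_rng(0)
features = rng.normal(size=(1000, 5))
# Make the second synthetic feature nearly a copy of the first.
features[:, 1] = features[:, 0] + 0.1 * rng.normal(size=1000)

loadings, explained = pca_loadings(features)
print(np.round(loadings, 2))
print(np.round(explained, 2))
```

In this synthetic case the first component loads mainly on the two correlated columns, mirroring how a cluster of correlated features shows up as shared weights on one principal component.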