# Feature Overview Table

**Note**: All *catch22* features are statistical properties of the ***z*****-scored** time series: they aim to capture properties of the time-ordering of the data and are insensitive to the location and scale of the raw values in the time series.
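
As a minimal sketch of why *z*-scoring removes sensitivity to raw values, the snippet below z-scores a series and a shifted, rescaled copy of it and checks that the two results coincide. The helper name `z_score`, the example values, and the use of the population standard deviation are illustrative assumptions, not taken from the *catch22* source code.

```python
from statistics import fmean, pstdev


def z_score(series):
    """Rescale a series to zero mean and unit standard deviation."""
    mu = fmean(series)
    # Assumption: population standard deviation; the reference code may normalise differently.
    sigma = pstdev(series)
    if sigma == 0:
        raise ValueError("cannot z-score a constant series")
    return [(x - mu) / sigma for x in series]


# Illustrative values only.
raw = [2.0, 4.0, 3.0, 7.0, 5.0, 6.0]
# Same temporal pattern, expressed in different units and with an offset.
rescaled = [10.0 * x + 100.0 for x in raw]

z_raw = z_score(raw)
z_rescaled = z_score(rescaled)

# Largest pointwise difference between the two z-scored series.
max_gap = max(abs(a - b) for a, b in zip(z_raw, z_rescaled))
print(max_gap < 1e-9)
```

Any feature computed from `z_raw` would therefore take the same value on `z_rescaled`, up to floating-point error.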

In the following table, we give the original feature name (from the Lubba et al. (2019) [paper](https://time-series-features.gitbook.io/catch22/welcome-to-catch22/citing-catch22)), and a shorter name more suitable for use in feature descriptions. Features are also (loosely) categorised into broader conceptual groupings.

Features are listed in the order in which *catch22* returns them, each with its associated value. For example, the first feature returned (#1) is always `DN_HistogramMode_5`.

<table data-full-width="false"><thead><tr><th width="69">#</th><th width="188">Feature name</th><th width="219">Short name</th><th width="150">Category</th><th>Description</th></tr></thead><tbody><tr><td>1</td><td><code>DN_HistogramMode_5</code></td><td><code>mode_5</code></td><td><a href="distribution-shape"><mark style="color:orange;">Distribution shape</mark></a></td><td>5-bin histogram mode</td></tr><tr><td>2</td><td><code>DN_HistogramMode_10</code></td><td><code>mode_10</code></td><td><a href="distribution-shape"><mark style="color:orange;">Distribution shape</mark></a></td><td>10-bin histogram mode</td></tr><tr><td>3</td><td><code>DN_OutlierInclude_p_001_mdrmd</code></td><td><code>outlier_timing_pos</code></td><td><a href="extreme-event-timing"><mark style="color:orange;">Extreme event timing</mark></a></td><td>Positive outlier timing</td></tr><tr><td>4</td><td><code>DN_OutlierInclude_n_001_mdrmd</code></td><td><code>outlier_timing_neg</code></td><td><a href="extreme-event-timing"><mark style="color:orange;">Extreme event timing</mark></a></td><td>Negative outlier timing</td></tr><tr><td>5</td><td><code>first1e_acf_tau</code></td><td><code>acf_timescale</code></td><td><a href="linear-autocorrelation-structure"><mark style="color:orange;">Linear autocorrelation</mark></a></td><td>First <span class="math">1/e</span> crossing of the ACF</td></tr><tr><td>6</td><td><code>firstMin_acf</code></td><td><code>acf_first_min</code></td><td><a href="linear-autocorrelation-structure"><mark style="color:orange;">Linear autocorrelation</mark></a></td><td>First minimum of the ACF</td></tr><tr><td>7</td><td><code>SP_Summaries_welch_rect_area_5_1</code></td><td><code>low_freq_power</code></td><td><a href="linear-autocorrelation-structure"><mark style="color:orange;">Linear autocorrelation</mark></a></td><td>Power in lowest 20% frequencies </td></tr><tr><td>8</td><td><code>SP_Summaries_welch_rect_centroid</code></td><td><code>centroid_freq</code></td><td><a 
href="linear-autocorrelation-structure"><mark style="color:orange;">Linear autocorrelation</mark></a></td><td>Centroid frequency</td></tr><tr><td>9</td><td><code>FC_LocalSimple_mean3_stderr</code></td><td><code>forecast_error</code></td><td><a href="simple-forecasting"><mark style="color:orange;">Simple forecasting</mark></a></td><td>Error of 3-point rolling mean forecast</td></tr><tr><td>10</td><td><code>FC_LocalSimple_mean1_tauresrat</code></td><td><code>whiten_timescale</code></td><td><a href="incremental-differences"><mark style="color:orange;">Incremental differences</mark></a></td><td>Change in autocorrelation timescale after incremental differencing</td></tr><tr><td>11</td><td><code>MD_hrv_classic_pnn40</code></td><td><code>high_fluctuation</code></td><td><a href="incremental-differences"><mark style="color:orange;">Incremental differences</mark></a></td><td>Proportion of high incremental changes in the series</td></tr><tr><td>12</td><td><code>SB_BinaryStats_mean_longstretch1</code></td><td><code>stretch_high</code></td><td><a href="symbolic"><mark style="color:orange;">Symbolic</mark></a></td><td>Longest stretch of above-mean values</td></tr><tr><td>13</td><td><code>SB_BinaryStats_diff_longstretch0</code></td><td><code>stretch_decreasing</code></td><td><a href="symbolic"><mark style="color:orange;">Symbolic</mark></a></td><td>Longest stretch of decreasing values</td></tr><tr><td>14</td><td><code>SB_MotifThree_quantile_hh</code></td><td><code>entropy_pairs</code></td><td><a href="symbolic"><mark style="color:orange;">Symbolic</mark></a></td><td>Entropy of successive pairs in symbolized series</td></tr><tr><td>15</td><td><code>CO_HistogramAMI_even_2_5</code></td><td><code>ami2</code></td><td><a href="nonlinear-autocorrelation"><mark style="color:orange;">Nonlinear autocorrelation</mark></a></td><td>Histogram-based automutual information (lag 2, 5 bins)</td></tr><tr><td>16</td><td><code>CO_trev_1_num</code></td><td><code>trev</code></td><td><a 
href="nonlinear-autocorrelation"><mark style="color:orange;">Nonlinear autocorrelation</mark></a></td><td>Time reversibility</td></tr><tr><td>17</td><td><code>IN_AutoMutualInfoStats_40_gaussian_fmmi</code></td><td><code>ami_timescale</code></td><td><a href="linear-autocorrelation-structure"><mark style="color:orange;">Linear autocorrelation structure</mark></a></td><td>First minimum of the AMI function</td></tr><tr><td>18</td><td><code>SB_TransitionMatrix_3ac_sumdiagcov</code></td><td><code>transition_variance</code></td><td><a href="symbolic"><mark style="color:orange;">Symbolic</mark></a></td><td>Transition matrix column variance</td></tr><tr><td>19</td><td><code>PD_PeriodicityWang_th001</code></td><td><code>periodicity</code></td><td><a href="linear-autocorrelation-structure"><mark style="color:orange;">Linear autocorrelation structure</mark></a></td><td>Wang's periodicity metric</td></tr><tr><td>20</td><td><code>CO_Embed2_Dist_tau_d_expfit_meandiff</code></td><td><code>embedding_dist</code></td><td><a href="other"><mark style="color:orange;">Other</mark></a></td><td>Goodness of exponential fit to embedding distance distribution</td></tr><tr><td>21</td><td><code>SC_FluctAnal_2_rsrangefit_50_1_logi_prop_r1</code></td><td><code>rs_range</code></td><td><a href="self-affine-scaling"><mark style="color:orange;">Self-affine scaling</mark></a></td><td>Rescaled range fluctuation analysis (low-scale scaling)</td></tr><tr><td>22</td><td><code>SC_FluctAnal_2_dfa_50_1_2_logi_prop_r1</code></td><td><code>dfa</code></td><td><a href="self-affine-scaling"><mark style="color:orange;">Self-affine scaling</mark></a></td><td>Detrended fluctuation analysis (low-scale scaling)</td></tr></tbody></table>
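
To make the first two rows of the table concrete, here is a rough sketch of a histogram-mode feature: z-score the series, bin it into equal-width bins spanning its range, and return the centre of the fullest bin. The function name `histogram_mode`, the binning details, and the tie-breaking rule (first fullest bin) are assumptions made for illustration; the reference implementation may differ in these details.

```python
from statistics import fmean, pstdev


def histogram_mode(series, n_bins=5):
    """Centre of the fullest equal-width bin of the z-scored series (illustrative sketch)."""
    mu, sigma = fmean(series), pstdev(series)
    if sigma == 0:
        raise ValueError("histogram mode is undefined for a constant series")
    z = [(x - mu) / sigma for x in series]

    lo, hi = min(z), max(z)
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for value in z:
        # The maximum value sits on the last bin edge, so clamp it into the final bin.
        idx = min(int((value - lo) / width), n_bins - 1)
        counts[idx] += 1

    # Assumption: ties are broken by taking the first fullest bin.
    best = max(range(n_bins), key=counts.__getitem__)
    return lo + (best + 0.5) * width


# Illustrative values only: most of the mass sits at the low end, with one large value.
series = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 2.0, 3.0, 10.0]
mode_5 = histogram_mode(series, n_bins=5)
mode_10 = histogram_mode(series, n_bins=10)
print(mode_5, mode_10)
```

Because the series is z-scored first, applying the function to `[3 * x + 7 for x in series]` gives the same result up to floating-point error.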

***

## *Catch24* (*Catch22* + Mean + Std)

When the scale and spread of the raw time-series values may be relevant to class differences, two simple distributional moment features can be added using the `catch24` flag in the software implementations. This yields 24 features: the *catch22* features plus the **mean** and **standard deviation**.

<table data-full-width="false"><thead><tr><th width="136">#</th><th>Feature name</th><th>Short name</th><th>Description</th></tr></thead><tbody><tr><td>23</td><td><mark style="color:green;"><code>DN_Mean</code></mark></td><td><code>mean</code></td><td>Mean</td></tr><tr><td>24</td><td><mark style="color:green;"><code>DN_Spread_Std</code></mark></td><td><code>std</code></td><td>Standard deviation</td></tr></tbody></table>
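
The sketch below illustrates what the two additional features contribute: unlike features of the z-scored series, they change when the raw series is shifted or rescaled. The helper `catch24_extras` is hypothetical, the example values are made up, and the use of the sample (n - 1) standard deviation is an assumption rather than a statement about the reference implementation.

```python
from statistics import fmean, stdev


def catch24_extras(series):
    """Mean and standard deviation of the raw (not z-scored) series."""
    return {
        "DN_Mean": fmean(series),
        # Assumption: sample (n - 1) standard deviation.
        "DN_Spread_Std": stdev(series),
    }


# Illustrative values only.
raw = [2.0, 4.0, 3.0, 7.0, 5.0, 6.0]
shifted = [x + 100.0 for x in raw]   # same shape, different offset
scaled = [10.0 * x for x in raw]     # same shape, different scale

extras_raw = catch24_extras(raw)
extras_shifted = catch24_extras(shifted)
extras_scaled = catch24_extras(scaled)

# The mean tracks the offset; the standard deviation tracks the scale.
print(extras_raw, extras_shifted, extras_scaled)
```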

***

## Feature Dependencies

Although the selection framework used to generate the *catch22* feature set included a step to reduce redundancy, it was not designed to generate an independent set of features.

### Pairwise Spearman Correlations

The plot below illustrates this non-independence. It shows the Spearman correlation coefficient between all pairs of features, quantifying the similarity of their outputs across a diverse set of [1000 empirical time series](https://doi.org/10.6084/m9.figshare.5436136.v9):

<figure><img src="https://650896658-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F3Or28XkZfNq0bJ4X4zXE%2Fuploads%2Fo1AEiZXZl7Lo1WOBb0dY%2Fimage.png?alt=media&#x26;token=54142c5b-c526-414d-b2c4-f971fba70f9f" alt=""><figcaption></figcaption></figure>

* We find a large cluster of features sensitive to the autocorrelation of a time series.
* We also find a small cluster of two highly correlated features, `DN_HistogramMode_5` and `DN_HistogramMode_10`, which measure the mode of the z-scored time-series distribution using different numbers of bins.

This dependency structure should be kept in mind when interpreting the results of *catch22* analyses: *Does your dataset exhibit any of these generic dependencies, or some unique dependencies?*
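
One way to check for such dependencies in your own data is to compute the Spearman correlation between feature columns, i.e. the Pearson correlation of their ranks. The sketch below does this with the standard library only. The feature values are made-up placeholders (one entry per time series), and the helper names are hypothetical.

```python
from math import sqrt
from statistics import fmean


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = fmean(x), fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up feature values for five time series (placeholders, not real results).
features = {
    "mode_5": [-0.4, 0.1, 0.9, -1.2, 0.3],
    "mode_10": [-0.5, 0.2, 1.1, -1.0, 0.25],
    "acf_timescale": [3.0, 12.0, 1.0, 7.0, 2.0],
}
names = list(features)
corr = {(a, b): spearman(features[a], features[b]) for a in names for b in names}

for a in names:
    print(a, [round(corr[(a, b)], 2) for b in names])
```

Comparing the resulting matrix with the generic structure shown above indicates whether a dataset follows the same dependencies or has its own.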

### PC Loadings

Below is a similar plot, with colour overlaid according to each feature's loadings on the first three principal components:

<figure><img src="https://650896658-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F3Or28XkZfNq0bJ4X4zXE%2Fuploads%2FSBB2yUJCJMJ6xk2rEKEA%2Fimage.png?alt=media&#x26;token=580f839d-19cb-46b0-bb74-2aeb3ee5a54e" alt=""><figcaption></figcaption></figure>

Broadly,

* The first two principal components capture different aspects of the autocorrelation structure.
* The third principal component captures different aspects of the distribution symmetry.
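
For readers who want to see how loadings of this kind can be obtained, the sketch below estimates the loadings of the first principal component as the leading eigenvector of the feature correlation matrix, using power iteration. It covers only the first component, uses made-up feature columns, and is not the procedure used to produce the figure above; all function names are hypothetical.

```python
from math import sqrt
from statistics import fmean, pstdev


def standardise(column):
    """Z-score one feature column across time series."""
    mu, sigma = fmean(column), pstdev(column)
    return [(v - mu) / sigma for v in column]


def correlation_matrix(columns):
    """Pearson correlation matrix of a list of feature columns."""
    z = [standardise(c) for c in columns]
    n = len(z[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / n for zj in z] for zi in z]


def first_pc_loadings(columns, n_iter=200):
    """Leading eigenvector of the correlation matrix, found by power iteration."""
    c = correlation_matrix(columns)
    p = len(c)
    # Start from a uniform unit vector; this fails only if it is
    # orthogonal to the leading eigenvector.
    v = [1.0 / sqrt(p)] * p
    for _ in range(n_iter):
        w = [sum(c[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


# Made-up columns: the first two are perfectly correlated, the third is
# uncorrelated with both.
columns = [
    [-2.0, -1.0, 0.0, 1.0, 2.0],
    [-4.0, -2.0, 0.0, 2.0, 4.0],
    [1.0, -2.0, 2.0, -2.0, 1.0],
]
loadings = first_pc_loadings(columns)
print([round(x, 3) for x in loadings])
```

In this toy case the first component loads equally on the two correlated features and not at all on the third, mirroring how a cluster of correlated features dominates a leading component.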

***
