catch22: CAnonical Time-series CHaracteristics

Extreme event timing

The DN_OutlierInclude features measure the timing of extreme events relative to the start and end of the time series.

catch22 contains two features based on the DN_OutlierInclude function in hctsa:

  • outlier_timing_pos (the hctsa feature DN_OutlierInclude_p_001_mdrmd), i.e., the mdrmd output from running DN_OutlierInclude(x_z,'pos',0.01) in hctsa.

  • outlier_timing_neg (the hctsa feature DN_OutlierInclude_n_001_mdrmd), i.e., the mdrmd output from running DN_OutlierInclude(x_z,'neg',0.01) in hctsa.

What these features do

These features involve the following steps:

  1. z-score the input time series.

  2. Initialise an equally spaced set of thresholds, from 0 to the maximum value of the time series in the case of outlier_timing_pos (or from 0 to the minimum value of the time series in the case of outlier_timing_neg). In this way, a set of increasingly 'extreme' deviations from the mean (either above or below the mean) is analysed across the loop in Step (3).

  3. At each threshold set in Step (2):

    1. Determine the time points at which the time series is 'over-threshold'.

    2. Compute the median index of all such over-threshold time points, as rmd.

    3. For interpretation, and to appropriately compare time series of different lengths, linearly rescale rmd such that a median right in the middle of the time series, at index N/2, maps to 0, a value at the end of the time series, at index N, maps to 1, and a value at the start of the time series, at index 1, maps to -1.

  4. The output statistic is the median of the rescaled rmd values across all thresholds.
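The steps above can be sketched in a few lines of Python. This is an illustrative sketch using only the standard library, not the catch22 or hctsa implementation: the function name outlier_timing_sketch, the threshold step of 0.01 (taken from the 0.01 argument in the hctsa calls above), the use of the sample standard deviation for z-scoring, and the exact rescaling rmd/(N/2) - 1 are assumptions. Any additional handling the real code may apply (for example, discarding thresholds with very few over-threshold points) is omitted.

```python
import statistics


def zscore(x):
    """Step 1: z-score the series (sample standard deviation is an assumption).

    A constant series has zero standard deviation and is not handled here.
    """
    mu = statistics.fmean(x)
    sd = statistics.stdev(x)
    return [(v - mu) / sd for v in x]


def outlier_timing_sketch(x, sign="pos", inc=0.01):
    """Hypothetical sketch of outlier_timing_pos / outlier_timing_neg."""
    z = zscore(x)
    if sign == "neg":
        # Flip the sign so that negative deviations are treated as positive.
        z = [-v for v in z]
    n = len(z)

    # Step 2: equally spaced thresholds from 0 up to the maximum value.
    n_steps = int(max(z) / inc)

    rescaled = []
    for k in range(n_steps + 1):
        thr = k * inc
        # Step 3.1: 1-based indices of over-threshold time points.
        idx = [i + 1 for i, v in enumerate(z) if v >= thr]
        if not idx:
            break  # guards against floating-point overshoot of the maximum
        # Step 3.2: median index of the over-threshold points.
        rmd = statistics.median(idx)
        # Step 3.3: rescale so N/2 -> 0, N -> 1, and index 1 -> roughly -1.
        rescaled.append(rmd / (n / 2) - 1)

    # Step 4: median of the rescaled values across all thresholds.
    return statistics.median(rescaled)


if __name__ == "__main__":
    late_burst = [0.0] * 90 + [5.0] * 10
    early_burst = [5.0] * 10 + [0.0] * 90
    print(outlier_timing_sketch(late_burst, "pos"))   # positive, near 1
    print(outlier_timing_sketch(early_burst, "pos"))  # negative, near -1
```

Flipping the sign of the z-scored series for the 'neg' case keeps a single code path for both features, since deviations below the mean then become deviations above it.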


What it measures

These statistics measure whether over-threshold events (either positive or negative deviations from the mean) tend to be positioned near the start of the time series (output values near -1), are approximately equally likely to occur anywhere in the time series (output values near 0), or are more likely to be near the end of the time series (output values near 1). These features thus capture something related to the stationarity of over-threshold events.

To give an intuition, below we plot some examples of rmd at a fixed threshold (80% of the maximum positive deviation) for the case of outlier_timing_pos (note that the full statistic takes the median of rmd across a range of thresholds, as described above).

Consider these examples:

Time series that have extreme events (red dots, relative to the threshold, shown as a dashed red line) distributed similarly across time will yield values close to zero for this statistic (vertical blue line). For example:

Time series like these, for which large deviations from the mean tend to occur nearer to the end of the time series, will have values closer to 1:

Time series like these, for which large deviations from the mean tend to occur nearer the start of the time series, will have values nearer to -1:
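The three cases described above can also be reproduced with small synthetic series. The sketch below computes the rescaled rmd at a single fixed threshold (80% of the maximum positive deviation) rather than the full statistic. The function name rmd_at_fraction, the toy series, and the rescaling rmd/(N/2) - 1 are assumptions made for illustration and are not part of catch22.

```python
import statistics


def rmd_at_fraction(x, fraction=0.8):
    """Rescaled median index of points at or above fraction * max (hypothetical helper)."""
    mu = statistics.fmean(x)
    sd = statistics.stdev(x)  # sample standard deviation (assumption)
    z = [(v - mu) / sd for v in x]
    n = len(z)

    # Fixed threshold at a fraction of the largest positive deviation.
    thr = fraction * max(z)

    # 1-based indices of the over-threshold time points.
    idx = [i + 1 for i, v in enumerate(z) if v >= thr]

    # Rescale so that N/2 maps to 0 and N maps to 1.
    return statistics.median(idx) / (n / 2) - 1


# Extreme events spread evenly through the series.
spread = [0.0] * 100
for position in (9, 49, 89):
    spread[position] = 5.0

# Extreme events concentrated at the end, and at the start.
late = [0.0] * 97 + [4.5, 5.0, 4.0]
early = [4.0, 5.0, 4.5] + [0.0] * 97

if __name__ == "__main__":
    print(rmd_at_fraction(spread))  # close to 0
    print(rmd_at_fraction(late))    # close to 1
    print(rmd_at_fraction(early))   # close to -1
```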




Last updated 1 year ago
