Compiling binaries

Some external code packages require compiled binary code to be used. Compilation of the mex code is handled by compile_mex as part of the install script, but the TISEAN package binaries need to be compiled separately in the command line.

Compiling mex code

Many of the operations (especially those from external code packages) rely on mex functions (pieces of code written in C or Fortran) that need to be compiled to run natively on a given system architecture. To ensure that as many operations as possible run successfully on your data, you should compile these mex functions for your system. This requires working compilers (e.g., gcc, g++) to be installed on your system, which can be configured using mex -setup (cf. doc mex for more information).

Once mex is set up, the mex functions used in the time-series code repository can be compiled by navigating to the Toolboxes directory and then running compile_mex.
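
Before running mex -setup, it can help to confirm from the command line that compilers are actually installed. The following is a minimal sketch rather than part of the hctsa install process: the helper name missing_tools is hypothetical, gcc and g++ are the examples named above, and gfortran is an assumed Fortran compiler that may differ on your system.

```shell
#!/bin/sh
# Print the name of each requested tool that is NOT found on the PATH.
# Prints nothing when every tool is available.
missing_tools() {
    for tool in "$@"; do
        # command -v is the POSIX way to test whether a command can be resolved
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf '%s\n' "$tool"
        fi
    done
}

# gcc and g++ are named in the text above; gfortran is an assumption
missing=$(missing_tools gcc g++ gfortran)
if [ -n "$missing" ]; then
    # $missing is left unquoted so the names are joined on one line
    echo "Compilers not found on PATH:" $missing
else
    echo "All listed compilers were found."
fi
```

If any compiler is reported as missing, install it with your system's package manager before running mex -setup and compile_mex.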

Compiling the TISEAN binaries

Some operations rely on the TISEAN nonlinear time-series analysis package, which Matlab accesses via the terminal using system commands. The TISEAN binaries therefore cannot be installed from within Matlab, and must instead be installed from the command line. If you are running Linux or Mac, we will assume that you are familiar with the command line, while those running Windows will require an alternative method to install TISEAN, as explained below.

Installing TISEAN on Linux or Mac

In the command line (not within Matlab), navigate to the Toolboxes/Tisean_3.0.1 directory of the repository, then run the following chain of commands in order: ./configure, then make clean, then make, and finally make install.

This should install the TISEAN binaries in your ~/bin/ directory (you can instead install into a system-wide directory, /usr/bin, for example, by running ./configure --prefix=/usr). Additional information about the TISEAN installation process is provided on the TISEAN website.

If installation was successful, you should be able to access the newly compiled binaries from the command line, e.g., typing the command which poincare should return the path to the TISEAN function poincare. Otherwise, you should check that the install directory is in your system path, e.g., by adding the line export PATH=$PATH:$HOME/bin to your ~/.bash_profile (and running source ~/.bash_profile to update).
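
The check described above can also be scripted. The following is a minimal sketch, assuming the default ~/bin install location; the helper name path_contains is hypothetical, and it compares directory names literally (so a PATH entry written with a trailing slash would not match).

```shell
#!/bin/sh
# Return 0 if directory $1 is listed in the PATH-style string $2 (default: $PATH).
path_contains() {
    dir=$1
    path_list=${2:-$PATH}
    # Wrap both sides in colons so only whole entries match
    case ":$path_list:" in
        *":$dir:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# $HOME/bin is the default TISEAN install location described above
if path_contains "$HOME/bin"; then
    echo "$HOME/bin is already on your PATH"
else
    echo "Add this line to ~/.bash_profile:"
    echo 'export PATH=$PATH:$HOME/bin'
fi
```

If you installed TISEAN with a custom --prefix, pass the corresponding bin directory to path_contains instead of $HOME/bin.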

The path where TISEAN is installed will also have to be in Matlab’s environment path, which is added by startup.m, assuming that the binaries are stored in ~/bin. The startup.m code also adds the DYLD_LIBRARY_PATH, which is also required for TISEAN to function properly.

If you choose to use a custom location for the TISEAN binaries that is not in the default Matlab system path (getenv('PATH') in Matlab), then you will have to add this path manually. You can test that Matlab can see the TISEAN binaries by typing, for example, the following into Matlab:

!which poincare

If Matlab’s system paths are set up correctly, this command should return the path to your compiled TISEAN binary, poincare.

Installing TISEAN on Windows

If you are running Matlab from Windows, you will need a mechanism for Matlab to call system commands and find compiled TISEAN binaries. There are two options:

  1. Install Cygwin on your machine. Cygwin provides a Linux distribution-like environment on Windows. Use this environment to compile and install TISEAN (as per the instructions above for Linux or Mac), which will require it to have C and fortran compilers installed. Matlab will then also need to be launched from Cygwin, using the command: matlab &. This instance of Matlab should then be able to call system commands through Cygwin, including the ability to access the TISEAN binaries.

  2. Sacrifice operations that rely on TISEAN. In total, TISEAN-based operations account for approximately 300 operations in the operation library. Although they provide important, well-tested implementations of nonlinear time-series analysis methods, it's not the end of the world if you decide it's too much trouble to install and are ok to miss out on these methods (see below on how to explicitly remove them from a computed library).

Ignoring TISEAN functions

If you decide not to use functions from the TISEAN package, you should initialize your dataset with the TISEAN functions removed. You could do this by removing them from your INP_ops.txt file when initializing your dataset, or you could remove them from your initialized hctsa dataset by filtering on the 'tisean' keyword.

For example, to filter a local Matlab hctsa file (e.g., HCTSA.mat), you can use the following: TS_LocalClearRemove('raw','ops',TS_GetIDs('tisean','raw','ops'),true);, which will remove all operations with the 'tisean' keyword from the hctsa dataset in HCTSA.mat.

[If you are using a mySQL database to store the results of your hctsa calculations, TISEAN functions can be removed from the database as follows: SQL_ClearRemove('ops',SQL_GetIDs('ops',0,'tisean',{}),true)].

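External links referenced in this section: TISEAN nonlinear time-series analysis package; TISEAN website; Cygwin; C and fortran compilers for Cygwin.

Commands for installing TISEAN on Linux or Mac (run from Toolboxes/Tisean_3.0.1):

    $ ./configure
    $ make clean
    $ make
    $ make install

Line to add to ~/.bash_profile if the install directory is not in your system path:

    export PATH=$PATH:$HOME/bin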

Structure of the hctsa framework

Overview

The hctsa framework consists of three basic objects containing relevant metadata:

  1. Master Operations specify pieces of code (Matlab functions) and their inputs to be computed. Taking in a single time series, master operations can generate a large number of outputs as a Matlab structure, each of which can be identified with a single operation (or 'feature').

  2. Operations (or 'features') each provide a single number summarizing some measure of structure in a time series. In hctsa, each operation links to an output from a piece of evaluated code (a master operation).

  3. Time series are univariate and uniformly sampled time-ordered measurements.

These three different objects are summarized in the table further below, which has a column for each of Master Operation, Operation, and Time Series.

In the example in that table, a master operation specifies the code to run, CO_AutoCorr(x,1:5,'TimeDomain'), which outputs the autocorrelation of the input time series (x) at lags 1, 2, ..., 5. Each operation (or 'feature') is a single number that draws on this set of outputs, for example, the autocorrelation at lag 1, which is named AC_1.

In the hctsa framework, master operations, operations, and time series are stored as tables that contain all of their associated keywords and metadata (and actual time-series data in the case of time series).

For a given hctsa analysis, the user must specify a set of code to evaluate (master operations), their associated individual outputs to measure (operations), and a set of time series to evaluate the features on (time series).

We provide a default library of over 7700 operations (derived from approximately 1000 unique master operations). This can be customized, and additional pieces of code can also be added to the repository.

The results of a hctsa analysis

Having specified a set of master operations, operations, and time series, the results of computing these functions on the time-series data are stored in three matrices, TS_DataMat, TS_Quality, and TS_CalcTime:

  • TS_DataMat is an n x m data matrix containing the results of applying m operations to the n time series.

  • TS_Quality is an n x m matrix containing quality labels for each operation output (coding different outputs such as errors or NaNs). Quality labels are described in the section below.

HCTSA .mat files

Each HCTSA*.mat file includes the tables described above: for TimeSeries (corresponding to the rows of the TS_ matrices), Operations (corresponding to columns of the TS_ matrices), and MasterOperations, corresponding to the code evaluated to compute the operations. In addition, the results are stored as above: TS_DataMat, TS_Quality, and TS_CalcTime.

Quality labels

Quality labels are used to indicate when operations take non-real values, or when fatal errors were encountered. Quality labels are stored in the Quality column of the Results table in the mySQL database, and in local Matlab files as the TS_Quality matrix.

When the quality label is nonzero, this indicates that a special-valued output occurred. In this case, the output value of the operation is set to zero, as a convention, and the quality label codes the special output value, as listed in the table of quality labels (0 to 7) further below.

Overview of an hctsa analysis

At its core, hctsa analysis involves computing a library of time-series analysis features (which we call operations) on a time-series dataset.

The basic sequence of a Matlab-based hctsa analysis is to:

  1. Initialize a HCTSA.mat file, which contains all of the information about the set of time series and operations in your analysis, as well as the results of applying all operations to all time series, using TS_Init.

  2. Compute these operations on your time-series data using TS_Compute. The results are structured in the local HCTSA.mat file containing matrices (that store the results of the computations) and tables (that store information about the time-series data and operations), as described in the section on the structure of the hctsa framework.

  3. Analyze the results. After the computation is complete, a range of processing, analysis, and plotting functions are provided to understand and interpret the results.

Example 1: Compute a feature vector for a time series

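As a quick check of your operation library, you can compute the full default code library on a time-series data vector (a column vector of real numbers): given a data vector x (e.g., a random time series, x = randn(500,1)), run featVector = TS_CalculateFeatureVector(x,false).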

Example 2: Analyze a time-series dataset

Suppose you have a time-series dataset to analyze. You first generate a formatted INP_ts.mat input file containing your time series data and associated name and keyword labels, as described elsewhere in this documentation. You then initialize an hctsa calculation using the default library of features by running TS_Init('INP_ts.mat').

This generates a local file, HCTSA.mat, containing the associated metadata for your time series, as well as information about the full time-series feature library (Operations) and the set of functions and code to call to evaluate them (MasterOperations), as described in the section on the structure of the hctsa framework.

Next you want to evaluate the code on all of the time series in your dataset. For this you can simply run TS_Compute or, for larger datasets, use a script that regularly saves back to the local file (cf. sample_runscript_matlab).

Having run your calculations, you may then want to label your data using the keywords you provided (in the case that you have labeled groups of time series) by running TS_LabelGroups, and then normalize and filter the data using the default sigmoidal transformation by running TS_Normalize.

A range of visualization scripts are then available to analyze the results. To plot the reordered data matrix, run TS_Cluster (which computes a reordering of data and features) followed by TS_PlotDataMatrix.

To inspect a low-dimensional representation of the data, run TS_PlotLowDim.

To determine which features are best at classifying the labeled groups of time series in your dataset, run TS_TopFeatures.

Each of these functions can be run with a range of input settings.

The third results matrix described in the structure of the hctsa framework:

  • TS_CalcTime is an n x m matrix containing calculation times for each operation output. Note that the calculation time stored is for the corresponding master operation.

Summary of the three objects in the hctsa framework:

              Master Operation                   Operation         Time Series
    Summary:  Code and inputs to execute         Single feature    Univariate data
    Example:  CO_AutoCorr(x,1:5,'TimeDomain')    AC_1              [1.2, 33.7, -0.1, ...]

Quality labels:

    Quality label   Description
    0               No problems with calculation. Output was a real number.
    1               A fatal error was encountered.
    2               Output of the code was NaN.
    3               Output of the code was Inf.
    4               Output of the code was -Inf.
    5               Output had a non-zero imaginary component.
    6               Output was empty (e.g., []).
    7               Field specified for this operation did not exist in the master operation output structure.

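Commands used in Examples 1 and 2, in order (the final steps draw on a range of processing, analysis, and plotting functions):

    % Example 1
    x = randn(500,1); % A random time series
    featVector = TS_CalculateFeatureVector(x,false); % compute the default feature vector for x

    % Example 2
    TS_Init('INP_ts.mat'); % initialize HCTSA.mat from the input file
    TS_Compute; % evaluate all operations on all time series
    TS_LabelGroups; % label groups of time series using their keywords
    TS_Normalize; % normalize and filter the data
    TS_Cluster; % compute a reordering of data and features
    TS_PlotDataMatrix; % plot the data matrix
    TS_PlotLowDim; % inspect a low-dimensional representation of the data
    TS_TopFeatures; % find the features that best classify the labeled groups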

Installing and setting up

The hctsa package can be used completely within Matlab, allowing users to analyse time-series datasets quickly and easily. Here we will focus on this Matlab-based use of the software, but note that, for larger datasets requiring distributed computing set-ups, or datasets that may grow with time, hctsa can also be linked to a mySQL database, as described in a dedicated chapter.

Installing the hctsa package

The simplest way to get the hctsa package up and running is to run the install script, which adds the required paths to dependent time-series packages (toolboxes), and compiles mex binaries to work on your system architecture. Once this one-off installation step is complete, you're ready to go! (NB: to include additional functions from the TISEAN nonlinear time-series analysis package, you'll also need to compile TISEAN routines).

After installation, future use of the package can begin by opening Matlab, navigating to the hctsa package, and then loading the paths required by the hctsa package by running the startup script.