Working with a mySQL database
When running large-scale hctsa computations, it can be useful to set up a mySQL database for time series, operations, and the computation results, and have many Matlab instances (running on different nodes of a compute cluster, for example) communicate directly with the database.
The hctsa software comes with this (optional) functionality, allowing a more powerful, distributed way to compute and store results of a large-scale computation.
This chapter outlines the steps involved in setting up, and running hctsa computations using a linked mySQL database.
Installing the hctsa code package to work with a mySQL database
The hctsa package requires some preliminary set up to work with a mySQL database, described here:
Installation of mySQL, either locally, or on an accessible server.
Setting up Matlab with a mySQL java connector (done by running the
install_jconnector
script in the Database directory, and then restarting Matlab).
After the database is set up, and the packages required by hctsa are installed (by running the install
script), linking to a mySQL database can be done by running the install_database
script, which:
Sets up Matlab to be able to communicate with the mySQL server and creates a new database to store Matlab calculations in, described here.
This section contains additional details about each of these steps.
Note that the above steps are one-off installation steps; once the software is installed and compiled, a typical workflow will simply involve opening Matlab, running the startup
script (which adds all paths required for the hctsa software), and then working within Matlab from any desired directory.
Adding a time-series dataset
Once installed using our default library of operations, the typical next step is to add a dataset of time series to the database using the SQL_Add
command. Custom master operations and operations can also be added, if required.
Computation, processing, and analysis
After installing the software and importing a time-series dataset to a mySQL database, the process by which data is retrieved from the database to local Matlab files (using SQL_Retrieve
), feature sets computed within Matlab (using TS_Compute
), and computed data stored back in the database (SQL_store
) is described in detail here.
After the computation is complete for a time-series dataset, a range of processing, analysis, and plotting functions are also provided with the software, as described here.
Last updated