Working with a mySQL database
When running large-scale hctsa computations, it can be useful to set up a mySQL database for time series, operations, and the computation results, and have many Matlab instances (running on different nodes of a compute cluster, for example) communicate directly with the database.
The hctsa software comes with this (optional) functionality, allowing a more powerful, distributed way to compute and store results of a large-scale computation.
This chapter outlines the steps involved in setting up and running hctsa computations using a linked mySQL database.
Installing the hctsa code package to work with a mySQL database
The hctsa package requires some preliminary setup to work with a mySQL database, described here:
Installation of mySQL, either locally, or on an accessible server.
Setting up Matlab with a mySQL Java connector (done by running the install_jconnector script in the Database directory, and then restarting Matlab).
After the database is set up, and the packages required by hctsa are installed (by running the install script), linking to a mySQL database can be done by running the install_database script, which:
Sets up Matlab to be able to communicate with the mySQL server and creates a new database to store Matlab calculations in, described here.
This section contains additional details about each of these steps.
Note that the above steps are one-off installation steps; once the software is installed and compiled, a typical workflow will simply involve opening Matlab, running the startup script (which adds all paths required for the hctsa software), and then working within Matlab from any desired directory.
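The one-off installation and the typical session that follows can be sketched as below. The script names (install, install_jconnector, install_database, startup) and the Database directory are taken from the text above; the variable hctsaDir and its path are hypothetical placeholders, and the assumption that the scripts are run from these particular directories is ours rather than something the text states.

```matlab
% --- One-off installation (sketch; hctsaDir is a hypothetical path) ---
hctsaDir = '/path/to/hctsa';        % assumption: location of your hctsa copy

cd(hctsaDir);
install;                            % installs the packages required by hctsa

cd(fullfile(hctsaDir,'Database'));
install_jconnector;                 % sets up the mySQL Java connector
% Restart Matlab here before continuing.

install_database;                   % links Matlab to the mySQL server and
                                    % creates the database for hctsa results

% --- Typical session after installation ---
cd(hctsaDir);
startup;                            % adds all paths required by hctsa
% You can now work from any desired directory.
```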
Adding a time-series dataset
Once hctsa is installed with the default library of operations, the typical next step is to add a dataset of time series to the database using the SQL_Add command. Custom master operations and operations can also be added, if required.
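As a rough illustration, adding a dataset and custom operations might look like the following. The input file names are hypothetical, and the identifier strings ('ts', 'mops', 'ops') and the two-argument call form are assumptions about the SQL_Add interface; check the function's help text for the exact usage in your version.

```matlab
% Sketch only: file names are hypothetical placeholders.

% Add time series listed in an input file to the database
SQL_Add('ts','INP_my_timeseries.txt');

% Optionally add custom master operations, then custom operations
SQL_Add('mops','INP_my_mops.txt');
SQL_Add('ops','INP_my_ops.txt');
```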
Computation, processing, and analysis
After installing the software and importing a time-series dataset to a mySQL database, computation proceeds in three steps: data is retrieved from the database to local Matlab files (using SQL_Retrieve), feature sets are computed within Matlab (using TS_Compute), and the computed data is stored back in the database (using SQL_store). This process is described in detail here.
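The retrieve, compute, and store cycle could be run in blocks along the following lines. This is a minimal sketch: the ID range, the block size, and the argument forms passed to SQL_Retrieve (a vector of time-series IDs and 'all' for operations) are assumptions, and calling TS_Compute and SQL_store without arguments assumes they act on the local file that SQL_Retrieve writes.

```matlab
% Sketch only: ID range and block size are illustrative assumptions.
firstID   = 1;
lastID    = 100;
blockSize = 20;

for startID = firstID:blockSize:lastID
    tsIDs = startID:min(startID + blockSize - 1, lastID);

    SQL_Retrieve(tsIDs,'all');   % database -> local Matlab file
    TS_Compute;                  % compute features within Matlab
    SQL_store;                   % local results -> database
end
```

Working in small blocks keeps each local file manageable and lets several Matlab instances process different ID ranges against the same database.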
After the computation is complete for a time-series dataset, a range of processing, analysis, and plotting functions are also provided with the software, as described here.