# FAQ

> ### *How many SPIs should I measure for my dataset?*

When starting out, we recommend that users work with a smaller subset of the available SPIs first, to get a sense of computation times and of working with the output in a lower-dimensional space. Users have the option to pass in a customised configuration *.yaml* file, as described in the [creating a reduced SPI set documentation](/pyspi/installing-and-using-pyspi/usage/advanced-usage/creating-a-reduced-spi-set.md).

Alternatively, we provide two pre-defined subsets of SPIs that can serve as good starting points: *sonnet* and *fast*. The *sonnet* subset includes 14 SPIs selected to represent the 14 modules identified through hierarchical clustering in the [original paper](https://www.nature.com/articles/s43588-023-00519-x). To retain as many SPIs as possible while minimising computation time, we also offer a *fast* option that omits the most computationally expensive SPIs. Either SPI subset can be toggled by setting the corresponding flag in the [*Calculator*](/pyspi/information-about-pyspi/api-reference/pyspi.calculator.calculator.md)*()* function call as follows:

```python
from pyspi.calculator import Calculator

data = ...  # your dataset
calc = Calculator(dataset=data, subset="sonnet")  # or subset="fast"
```

***

> ### *What pre-processing steps are applied to my data?*

There are two pre-processing steps that can be applied to your raw multivariate time series (MTS) dataset before computing SPIs:

(1) **Detrend:** Detrend each time series in the dataset individually along the time dimension using the [SciPy detrend function](https://docs.scipy.org/doc/scipy/reference/generated/scipy.signal.detrend.html) with default settings. If enabled, detrending is always applied to the dataset **before z-score normalisation**.

(2) **Z-score normalise:** Normalise each time series in the dataset individually along the time dimension using the [SciPy zscore function](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.zscore.html).

By **default**, when instantiating a [`Calculator()`](/pyspi/information-about-pyspi/api-reference/pyspi.calculator.calculator.md) object with your dataset, *pyspi* will normalise each time series (each representing one process in an MTS dataset) individually along the time axis.
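To make the order of operations concrete, here is a minimal pure-Python sketch of the two steps applied to each process independently. It is not *pyspi*'s implementation: `linear_detrend`, `zscore`, and `preprocess` are hypothetical helpers written for illustration. They assume that the SciPy defaults correspond to a linear least-squares detrend and a z-score using the population standard deviation.

```python
from statistics import fmean, pstdev

def linear_detrend(series):
    """Subtract the least-squares straight-line fit from a 1-D series."""
    t = range(len(series))
    t_mean = fmean(t)
    y_mean = fmean(series)
    # slope and intercept of the least-squares line through (t, series)
    denom = sum((ti - t_mean) ** 2 for ti in t)
    slope = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, series)) / denom
    intercept = y_mean - slope * t_mean
    return [yi - (slope * ti + intercept) for ti, yi in zip(t, series)]

def zscore(series):
    """Rescale a 1-D series to zero mean and unit (population) standard deviation."""
    mu = fmean(series)
    sigma = pstdev(series)
    return [(yi - mu) / sigma for yi in series]

def preprocess(mts, detrend, normalise):
    """Apply the optional steps to each process (row) independently, detrending first."""
    out = []
    for series in mts:
        if detrend:
            series = linear_detrend(series)
        if normalise:
            series = zscore(series)
        out.append(list(series))
    return out

# toy dataset: two processes (rows), five time points each
mts = [[1.0, 2.0, 4.0, 3.0, 5.0],
       [10.0, 8.0, 9.0, 7.0, 6.0]]
processed = preprocess(mts, detrend=True, normalise=True)
```

After both steps, every row of `processed` has zero mean and unit standard deviation, and any linear trend has been removed before the rescaling.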

If you would like to specify which pre-processing steps to include or exclude, you can pass the corresponding flag for each operation when instantiating a [`Calculator()`](/pyspi/information-about-pyspi/api-reference/pyspi.calculator.calculator.md). Here are some examples of how you can skip either or both operations:

```python
# skip detrending, keep z-scoring
calc = Calculator(dataset=data, detrend=False)

# skip normalisation (z-scoring), keep detrending
calc = Calculator(dataset=data, normalise=False, detrend=True)

# disable both detrending and normalisation
calc = Calculator(dataset=data, normalise=False, detrend=False)
```

After successfully instantiating a Calculator object, a summary of the pre-processing steps will be displayed for verification before computing SPIs. Here is an example of the output when explicitly setting the detrending step to `False`:

```
216 SPI(s) were successfully initialised.

[1/2] Skipping detrending of the dataset...
[2/2] Normalising (z-scoring) the dataset...
```

***

> ### *How long does pyspi take to run?*

This depends on the size of your multivariate time series (MTS) data: both the number of processes and the number of time points (observations). In general, we recommend that users first run *pyspi* on a small representative sample from their dataset to assess time and computing requirements, and then scale up accordingly. The amount of time also depends on the feature set you’re using, whether it’s the full set of all SPIs or a reduced set (like *sonnet* or *fast* described above).
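A pilot run of this kind can be sketched with a small timing harness. The example below uses only the Python standard library and random data. `pairwise_covariance` is a hypothetical stand-in workload and `time_pilot` is a hypothetical helper; neither is part of *pyspi*. In practice you would swap the workload for a function that builds a `Calculator` on a sample of your own data and calls `compute()`.

```python
import random
import time

def pairwise_covariance(mts):
    """Hypothetical stand-in workload: covariance for every ordered pair of processes."""
    n_proc, n_time = len(mts), len(mts[0])
    means = [sum(row) / n_time for row in mts]
    return [
        [sum((a - means[i]) * (b - means[j]) for a, b in zip(mts[i], mts[j])) / n_time
         for j in range(n_proc)]
        for i in range(n_proc)
    ]

def time_pilot(workload, sizes, seed=0):
    """Time `workload` on random data for each (n_processes, n_timepoints) pair."""
    rng = random.Random(seed)
    timings = {}
    for n_proc, n_time in sizes:
        mts = [[rng.gauss(0.0, 1.0) for _ in range(n_time)] for _ in range(n_proc)]
        start = time.perf_counter()
        workload(mts)
        timings[(n_proc, n_time)] = time.perf_counter() - start
    return timings

# vary the number of time points at 2 processes, then the number of processes at 100 time points
sizes = [(2, 100), (2, 1000), (5, 100), (10, 100)]
timings = time_pilot(pairwise_covariance, sizes)
for (n_proc, n_time), seconds in timings.items():
    print(f"{n_proc} processes x {n_time} time points: {seconds:.4f} s")
```

Timing a few small sizes along each axis separately gives a rough sense of how the cost grows before committing to a full run.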

To give users a sense of how long *pyspi* takes to run, we ran a series of experiments on a high-performance computing cluster with 2 cores, 2 MPI, and 40 GB of memory. We ran *pyspi* on simulated NumPy arrays with either a fixed number of processes (2) or a fixed number of time points (100) to see how timing scales with the array size. Here are the results:

<figure><img src="/files/o5YWZOy6YrwgecmpWot4" alt=""><figcaption><p><strong>The time to run </strong><em><strong>pyspi</strong></em><strong> scales with the number of time points (left) or number of processes (right).</strong> </p></figcaption></figure>

We note that computation times for the *sonnet* and *fast* subsets are roughly equivalent, whereas the full set of SPIs requires increasingly large amounts of time as the time series length increases (left). As the number of processes increases (right), the computation time for the full set of SPIs grows with a slope similar to that of the *sonnet* and *fast* subsets.

Here are the timing values for each condition, which can help users estimate the computation time requirements for their dataset:

<figure><img src="/files/0m6yn3kHh8z3HPsktK22" alt=""><figcaption></figcaption></figure>

***

> ### *How can I contribute to pyspi?*

Contributions play a vital role in the continual development and enhancement of pyspi, a project built and enriched through community collaboration. By participating in this project, you are contributing to the broader community and helping shape the future of this package.

Code is not the only way to contribute to *pyspi*. Reviewing pull requests, answering questions and helping others troubleshoot, organising and teaching tutorials, and improving documentation are all priceless contributions to the project. For further details on how you can contribute to the project, as well as general guidelines for our contributors, please refer to [Contributing to pyspi](/pyspi/development/development/contributing-to-pyspi.md).

***

> ### ***Do I need to normalise my dataset before applying pyspi?***

When passing your dataset into the Calculator object, *pyspi* will automatically **z-score** (normalise) along the *time* axis by default (see [API reference](/pyspi/information-about-pyspi/api-reference/pyspi.data.data.md) for Data object). This means that you can supply raw values to the Calculator object without having to normalise the dataset as a pre-processing step.

If you do not wish for *pyspi* to z-score your data, or you would like more control over how your data is pre-processed, you can pass the `normalise=False` flag to the `Calculator` when instantiating.

```python
from pyspi.calculator import Calculator
import numpy as np

# your dataset
data = ... 

# instantiate a Calculator object as usual and set the normalise flag
calc = Calculator(dataset=data, normalise=False)

# disable both detrending and normalisation
calc = Calculator(dataset=data, detrend=False, normalise=False)
```

***

> ### ***Can I distribute pyspi calculations across a cluster?***

If you have access to a portable batch system (PBS) cluster and are processing MTS with many processes (or are analysing many MTS), then you may find the [*pyspi* distribute](https://github.com/DynamicsAndNeuralSystems/pyspi-distribute) repository helpful. Each job contains one calculator object that is associated with one MTS. To get started with running *pyspi* jobs on a PBS-type cluster, follow our [guide to distributing calculations on a cluster](/pyspi/installing-and-using-pyspi/usage/advanced-usage/distributing-calculations-on-a-cluster.md).

***

> ### ***How can I cite pyspi in my work?***

If you used *pyspi* in your work, it would be greatly appreciated if you [cite the original authors](/pyspi/welcome-to-pyspi/citing-pyspi.md). Feel free to star our [GitHub repository](https://github.com/DynamicsAndNeuralSystems/pyspi) if you find our package useful, as this also helps to increase awareness in the time-series analysis community.

***

> ### ***Can I run pyspi on my operating system?***

*pyspi* is designed with cross-platform compatibility in mind and can be run on various operating systems, ensuring a wide range of users have access to *pyspi* and all of its features. Specifically, *pyspi* currently supports:

* **macOS (Python >= 3.9)**
* **Windows (Python >= 3.8)**
* **Linux (Python >= 3.8)**

In all cases, ensure that you have the required version of Python installed, as *pyspi* is a Python-based package. We actively monitor and work on compatibility issues that may arise with new updates to these operating systems. Users are encouraged to report any compatibility issues they encounter on our [GitHub issues page](https://github.com/DynamicsAndNeuralSystems/pyspi/issues), helping us improve *pyspi* for all users.
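As a quick check before installing, the following sketch compares the running interpreter against the minimum versions listed above. It uses only the standard library. `MIN_PYTHON` and `meets_minimum` are hypothetical names introduced for this example and are not part of *pyspi*.

```python
import platform
import sys

# Minimum Python versions per operating system, as listed above.
# Keys follow the names returned by platform.system().
MIN_PYTHON = {
    "Darwin": (3, 9),   # macOS
    "Windows": (3, 8),
    "Linux": (3, 8),
}

def meets_minimum(system, version):
    """Return True if `version` (major, minor) satisfies the listed minimum for `system`."""
    minimum = MIN_PYTHON.get(system)
    if minimum is None:
        return False  # operating system is not in the list above
    return tuple(version[:2]) >= minimum

current_system = platform.system()
current_version = sys.version_info[:2]
label = f"Python {current_version[0]}.{current_version[1]} on {current_system}"
if meets_minimum(current_system, current_version):
    print(f"{label}: meets the listed minimum")
else:
    print(f"{label}: below the listed minimum, or operating system not listed")
```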

***

> ### ***Are there examples showcasing a complete pipeline using pyspi?***

Yes, we currently provide two notebooks with examples of complete pipelines using *pyspi*. These notebooks are available in the [Usage Examples](/pyspi/installing-and-using-pyspi/usage/walkthrough-tutorials.md) section. If you want to share a notebook with additional pipelines or specific use cases, please feel free to contact us.

***

> ### ***How can I save my results from pyspi?***

Once you have computed the SPIs for your dataset, the results will be stored in the calculator object. We recommend saving the calculator as a [`.pkl`](https://docs.python.org/3/library/pickle.html) file using the [dill library in Python](https://pypi.org/project/dill/). To get started, you will need to install the dill package: <mark style="color:red;">`pip install dill`</mark>.

1. ***Saving*** a calculator

```python
from pyspi.calculator import Calculator
import numpy as np
import dill

# compute the SPIs as usual
data = np.load('../pyspi/data/forex.npy').T
calc = Calculator(dataset=data, subset='fast')
calc.compute()

# save the calculator object as a .pkl
with open('saved_calculator_name.pkl', 'wb') as f:
    dill.dump(calc, f)
```

2. ***Loading*** a calculator

```python
import dill

# specify the location of your saved calculator .pkl file
loc = '../pyspi/saved_calculator_name.pkl'

with open(loc, 'rb') as f:
    calc = dill.load(f)

# now access the calculator object as usual
calc.table  # full table of computed SPIs
calc.table['cov_EmpiricalCovariance']  # results for a single SPI
```

***

