CNS*2020 Tutorial T7: Characterizing neural dynamics using highly comparative time-series analysis

Question during tutorial: Would this be helpful for data where you say want to focus on something like cross-frequency coupling etc.? Have you done any applications in this way?

Can one use hctsa in a case where we have two clusters of time series that we already know differ in some specific features, but we would like to find out whether there are other features that can describe the differences between these two clusters?

Can we have access to slides and video of the tutorial afterward?


Great, thanks very much!

Thank you for the tutorial! If I remember correctly, you mentioned during the talk that you are working on a multivariate time series version of hctsa. Do you have any suggestions for using the current version to analyse multivariate time series?


Since you mentioned that hctsa only works for univariate problems, how do you put in time-series of multiple EEG electrodes or multiple fMRI voxels from a single session recording to the toolbox?

Dear @Jaber_Al_Nahian thanks for the question and welcome to neurostars!

1—You can specify the classification algorithm in the cfnParams structure; you can see the options in GiveMeCfn. Because there is already complexity in embedding time series in a high-dimensional feature space, we have tried to keep the classifiers simple (to avoid overfitting, and for interpretability). You can also use OutputToCSV to export the hctsa data and work in other environments (like python). If you go this route, feel free to share your python workflow here.

2—hctsa is designed to extract features from a univariate time series. We are currently designing and implementing a multivariate version. In the meantime, you can concatenate the univariate features of each component of your system (e.g., using a reduced feature set like catch22 to avoid a massive dimensionality explosion), and perhaps add some simple pairwise dependence measures to summarize the multivariate structure.
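As a rough illustration of the concatenation idea above, here is a minimal python sketch (not part of hctsa; all function names and the stand-in features are hypothetical, standing in for a real reduced set like catch22):

```python
import numpy as np

def channel_features(x):
    """Stand-in for a reduced univariate feature set (e.g., catch22):
    a few simple summaries per channel, chosen here for illustration only."""
    x = np.asarray(x, dtype=float)
    ac1 = np.corrcoef(x[:-1], x[1:])[0, 1]  # lag-1 autocorrelation
    return np.array([x.mean(), x.std(), ac1])

def multivariate_feature_vector(X):
    """X: (n_channels, n_timepoints). Concatenate per-channel features,
    then append upper-triangle pairwise Pearson correlations as simple
    pairwise dependence measures."""
    per_channel = np.concatenate([channel_features(ch) for ch in X])
    C = np.corrcoef(X)
    iu = np.triu_indices(X.shape[0], k=1)
    return np.concatenate([per_channel, C[iu]])

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 500))   # e.g., 4 EEG channels of 500 samples
fv = multivariate_feature_vector(X)
print(fv.shape)                     # 4 channels x 3 features + 6 pairwise = (18,)
```

In practice you would swap the stand-in summaries for a principled feature set, and could use richer pairwise measures (coherence, phase-based measures, etc.) in place of plain correlation.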

3—Can you confirm which toolboxes you have (and which Matlab version you’re using)? If you have Matlab 2020 it tries to use confusionchart, and otherwise tries plotconfusion (which requires the Deep Learning Toolbox). See lines 214–222 of TS_Classify.

4—Use OutputToCSV, which gives you csv files corresponding to a given HCTSA calculation that you can analyze however you please.

Hope this helps, and good luck!

Ben


Most users are within universities with a Matlab license, so I haven’t come across issues with this (and the main analysis code is licensed non-commercial anyway). But there are a couple of solutions if this comes up:

  • Use an alternative (e.g., native R or native python) feature-extraction tool, such as those listed here. These have far fewer features.
  • Use Matlab temporarily to derive a reduced set of useful features for your problem, and then implement them yourself (or find non-Matlab implementations). This pipeline is demonstrated (and implemented) in catch22, and we currently have new reduced sets in development. The goal is to code them in C so they can be used efficiently from any programming language.

Hi @Poe thanks for the question, and welcome to neurostars!

Yes, hctsa is designed for supervised learning, but you can leverage its ability to extract thousands of features from time series for a wide variety of other applications.

You could run clustering, or investigate low-dimensional projections of your data, etc. using the data from hctsa.

Some generic versions are implemented within hctsa (e.g., TS_PlotLowDim, which computes and visualizes a low-dimensional projection, and TS_Cluster, which does hierarchical clustering), but you can implement your own pipelines in Matlab working from the HCTSA .mat files, or export to .csv (using OutputToCSV) and work in your own favorite environment.
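If you do go the export route, a low-dimensional projection of an hctsa feature matrix takes only a few lines. A minimal numpy sketch (the matrix here is random stand-in data, and the CSV filename in the comment is hypothetical):

```python
import numpy as np

# Hypothetical: a (time series x features) matrix; in practice you might
# load an OutputToCSV export, e.g. np.loadtxt("my_hctsa_export.csv", delimiter=",")
rng = np.random.default_rng(1)
F = rng.standard_normal((100, 50))  # 100 time series, 50 features

# z-score each feature so no single feature dominates the projection
Fz = (F - F.mean(axis=0)) / F.std(axis=0)

# PCA via SVD: project onto the top two principal components
U, s, Vt = np.linalg.svd(Fz, full_matrices=False)
proj = Fz @ Vt[:2].T                # (100, 2) low-dimensional embedding
print(proj.shape)
```

You could then scatter-plot `proj` colored by your cluster labels, or feed `Fz` to any clustering routine you like.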

Hope this helps—good luck :slight_smile:

Ben

Hi @fskinner welcome to neurostars!
Cross-frequency coupling is a very specific analysis that doesn’t really fit with hctsa. That said, you can be creative with the hctsa feature set (which goes far beyond decompositions into frequency bands), and may come up with some different features that capture multivariate dependencies. We have some work on this in development.
Best,
Ben

Thanks for the question @roxana.zeraati and welcome!

Yes—you can do this by labeling the clusters and just running the pipeline as I described (e.g., in the tutorial with EEG data). You could see if the features you know should come up do come up (or add your implementations to the hctsa feature set so that they are guaranteed to come up).
The default analyses (like TS_TopFeatures) will do this, and you can compare where the features you expect to come up sit amongst the other possible features.
You could also do multivariate learning on the feature space (e.g., start with your baseline features and test whether any features, in combination with the baseline features, can significantly improve the classification of the two groups). In the simplest case you could just compare your baseline set of B features with new sets of B+1 features, each adding a single candidate feature. You can then test for a significant (e.g., cross-validated) improvement in performance of each B+1 set, and assess significance relative to null/random features.
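The B-versus-B+1 comparison could be sketched like this (a minimal, self-contained python illustration on synthetic data, using a deliberately simple nearest-centroid classifier; all names and the data are hypothetical, and a real analysis would add a proper null distribution over random features):

```python
import numpy as np

def cv_accuracy(F, y, k=5, seed=0):
    """k-fold cross-validated accuracy of a nearest-centroid classifier."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    folds = np.array_split(idx, k)
    correct = 0
    for fold in folds:
        train = np.setdiff1d(idx, fold)
        # class centroids estimated from the training fold only
        centroids = {c: F[train][y[train] == c].mean(axis=0)
                     for c in np.unique(y)}
        for i in fold:
            pred = min(centroids, key=lambda c: np.linalg.norm(F[i] - centroids[c]))
            correct += (pred == y[i])
    return correct / len(y)

# Synthetic example: an uninformative baseline set of B=3 features,
# plus one candidate feature that carries real class information
rng = np.random.default_rng(2)
y = np.repeat([0, 1], 50)
baseline = rng.standard_normal((100, 3))
candidate = (y + 0.3 * rng.standard_normal(100))[:, None]

acc_base = cv_accuracy(baseline, y)
acc_plus = cv_accuracy(np.hstack([baseline, candidate]), y)
print(acc_base, acc_plus)
```

Repeating the `acc_plus` computation for each candidate feature, and comparing against the accuracies obtained by appending random (shuffled) features, gives a simple significance test for whether a feature adds information beyond the baseline set.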
Just an idea—but good luck!

Yes, slides have been posted to the sched event, or are here.
Will send the video recording to CNS today, and hopefully it will be up on their youtube channel soon.

Hi @gmschroe
Thanks for the question and welcome to the neurostars community!
See some suggestions in my response to @Jaber_Al_Nahian below.
A simple analysis would concatenate the feature sets corresponding to each univariate element of your system (or do some dimensionality reduction to capture the unique univariate dynamical signatures), and perhaps add some multivariate features. How best to formulate it depends on your problem, and on what question you’re asking that relates to finding good statistics of univariate dynamics (which is what hctsa is designed to do).
Hope this helps, thanks for the interest in our work, and good luck!
Ben


Hi @ben.fulcher! Thanks for the tutorial, and all the discussion (and the music!).

I do understand the difficulties of moving away from Matlab. There’s a whole ecosystem around it now (neuroimaging is also highly Matlab-reliant, much more so than comp-neuro), and moving individual tools away is hard because of all the dependencies that also need to be moved. However, I do think it’s worth a start, so I’m very happy to hear that there is some effort to move to non-Matlab platforms.

I do get this response from lots of folks, but it’s only true for universities in developed countries, not for developing countries. Also, individuals not affiliated with universities are totally left out, because individual Matlab licenses are quite expensive. So Matlab prevents tools from being truly open (rather than just open to academics at well-funded universities).

Great. I’ve already added catch22 to our NeuroFedora queue, so we’ll have that included soon. I’ll also file a ticket upstream to request that it be tested with GNU Octave so that users can use it there without needing Matlab.

If an hctsa user could test it out with GNU Octave too, that would perhaps be a start. (As an example, the SPM toolkit for neuroimaging supports both Matlab and GNU Octave: https://en.wikibooks.org/wiki/SPM/Octave#GNU_Octave)

ok - good to know. Thank you.


Thank you!

On the multivariate side, I can definitely see a need for comparing the performance of different graph theory/network measures. There are a variety of different ways to both define connectivity (e.g., correlation, coherence, measures based on phase, directed measures…) and evaluate various network properties, such as node centrality (node strength, eigenvector centrality, participation coefficient, etc.). As you noted in your tutorial for other measures, most papers use only a few definitions of connectivity and a few network measures, at most, making comparisons difficult (I’m guilty of this practice, myself!). Additionally, there is a lot of overlap in the information that different measures capture, as well as a need to compare the performance of these network measures to simpler, univariate ones.

Out of curiosity, are you planning to include network analysis in your multivariate approach, or are you aware of any other researchers who are undertaking a similar comparative approach in this domain?


Thanks @gmschroe — yes, what you describe is exactly what we are in the process of developing at the moment! The closest existing thing is work done by a colleague of mine back during my PhD, who took a similar philosophy to the network-analysis literature (high-throughput network analysis), but it was somehow never taken to completion or published beyond this brief article: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.702.955&rep=rep1&type=pdf#page=5
By next year we should have a full framework enabling comparison across the kinds of statistics you can derive from a multivariate time-series dataset (and this one will be in python) :slight_smile: Do follow up (e.g., by email) if you have specific applications in mind from your own work :slight_smile:


Thanks @sanjayankur31,

All of our reduced sets are coded in C, with wrappers for any language, and the new multivariate version of hctsa that we’ve been developing now is being done completely in python.

Perhaps they don’t get in touch with me (or, more likely, don’t know about hctsa), but I’ve only ever heard from one university researcher who wanted to use hctsa but didn’t have Matlab access (at an Austrian research institute), which makes it hard to tell whether this is a major issue for researchers in practice. But I take your point in principle (and this is much clearer now than it was 12 years ago, when hctsa development began), which is why new software developed in our lab is done in open-source environments (python so far).

The catch22 repo contains Matlab code, but this is only for testing the C implementations against the original Matlab files; its functionality is based on C files, with instructions to compile and run from R, python, etc. (i.e., the Matlab files in the repo are not required for computing the 22 features in any environment).

hctsa may be runnable in Octave (it was some years ago, when that Austrian researcher worked with it, cf. pyopy), but with such a reduced feature set (given the lack of access to toolboxes, etc.) that I’m not sure it’s a good substitute. The interesting part of hctsa is that it only needs temporary access to Matlab (for the initial computation of features), after which you can export the results to .csv, say, determine the features that are useful for you, and recode them as needed. We hope that through this process we will start accumulating efficiently coded, high-performing features in the spirit of the catch22 feature set (we are currently developing more such sets).

Thanks for your comments and feedback :slight_smile:


Also FYI: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0220061
