This topic is to organize questions and feedback for Tutorial T7.
I have some queries regarding the HCTSA package:
- How can I use machine learning algorithms other than the linear SVM with the TS_Classify command?
- Can I perform a multivariate time series analysis using the HCTSA package?
- I am having a problem generating a confusion matrix using the TS_Classify command. It just opens a blank figure in Matlab. How can I overcome this problem?
- How can I export the extracted features and use them as a feature dataset?
@ben.fulcher would it be OK for the audience to take screenshots during the session to share over social media and the conference album please?
A general question (with my Free/Open Source Software (FOSS) hat on): are there any plans to move away from Matlab (which is proprietary and thus not accessible to users perhaps outside universities) to Python/R/Julia? Or perhaps to support/test with GNU Octave, which is Matlab compatible?
Edit: I see that it uses other Matlab toolboxes, so perhaps this is not as trivial as one would hope.
From my understanding of the hctsa toolbox, it seems that only classification/regression algorithms are included (e.g. linear SVM), which require some knowledge of the initial dataset fed as input. How can we utilise the toolbox when the time series data is unlabelled and the study involves clustering (e.g. unsupervised learning algorithms such as K-means or mean-shift clustering)?
Question during tutorial: Would this be helpful for data where you say want to focus on something like cross-frequency coupling etc.? Have you done any applications in this way?
Can one use hctsa for a case where we have two clusters of time series that we already know differ in some specific features, but we would like to find out whether other features can also describe the differences between these two clusters?
Can we have access to slides and video of the tutorial afterward?
Great, thanks very much!
Thank you for the tutorial! If I remember correctly, you mentioned during the talk that you are working on a multivariate time series version of hctsa. Do you have any suggestions for using the current version to analyse multivariate time series?
Since you mentioned that hctsa only works for univariate problems, how do you feed time series from multiple EEG electrodes or multiple fMRI voxels from a single session recording into the toolbox?
Dear @Jaber_Al_Nahian thanks for the question and welcome to neurostars!
1. You can specify the classification algorithm in the cfnParams structure. You can see the options in GiveMeCfn. Because embedding in a high-dimensional feature space already brings complexity, we have tried to keep the classifiers simple (to avoid overfitting, and for interpretability). You can also use OutputToCSV and use the hctsa data in other environments (like python). [If you go this route, feel free to share your python workflow here.]
2. hctsa is designed to extract features from a univariate time series. We are currently designing and implementing a multivariate version. In the meantime, you can concatenate the univariate features of each component of your system (e.g., using a reduced set of features, like catch22, to avoid a massive dimensionality explosion), and perhaps add some simple pairwise dependence measures to summarize the multivariate structure.
3. Can you confirm what toolboxes you have (and what Matlab version you’re using)? If you have Matlab2020 it tries to use confusionchart, and otherwise tries to use plotconfusion (requires the Deep Learning toolbox). See lines 214–222 of TS_Classify.
4. Use OutputToCSV, which gives you csv files corresponding to a given HCTSA calculation that you can analyze however you please.
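For anyone taking the export route into python, here is a minimal sketch of what the first steps could look like. The file layout (a purely numeric, comma-separated matrix with one row per time series and one column per feature) and all helper names are assumptions for illustration, not part of hctsa, so check the files your hctsa version actually writes. The classifier is a plain nearest-centroid rule with leave-one-out cross-validation, standing in for whichever classifier you prefer.

```python
import io

import numpy as np


def load_feature_matrix(source):
    """Read a numeric CSV (rows = time series, columns = features).

    `source` may be a file path or a file-like object. The layout is an
    assumption about the exported file, not a documented hctsa format.
    """
    return np.atleast_2d(np.genfromtxt(source, delimiter=","))


def clean_and_zscore(X):
    """Drop columns with NaN/Inf entries or zero variance, then z-score."""
    X = X[:, np.all(np.isfinite(X), axis=0)]
    X = X[:, X.std(axis=0) > 0]
    # For brevity the z-score uses all rows; for a strict estimate,
    # compute it on the training rows inside the cross-validation loop.
    return (X - X.mean(axis=0)) / X.std(axis=0)


def loo_nearest_centroid_accuracy(X, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier."""
    labels = np.asarray(labels)
    classes = np.unique(labels)
    n = len(labels)
    correct = 0
    for i in range(n):
        train = np.arange(n) != i  # hold out row i
        centroids = np.array(
            [X[train & (labels == c)].mean(axis=0) for c in classes]
        )
        nearest = classes[np.argmin(np.linalg.norm(centroids - X[i], axis=1))]
        correct += int(nearest == labels[i])
    return correct / n


# Synthetic stand-in for an exported matrix: 6 series, 3 features, with
# one constant column and one NaN entry (e.g., a feature that failed).
demo_csv = io.StringIO(
    "0.1,5.0,1.0\n"
    "0.2,5.0,NaN\n"
    "0.0,5.0,1.2\n"
    "2.1,5.0,0.9\n"
    "1.9,5.0,1.1\n"
    "2.0,5.0,1.0\n"
)
demo_labels = ["a", "a", "a", "b", "b", "b"]

X = clean_and_zscore(load_feature_matrix(demo_csv))
accuracy = loo_nearest_centroid_accuracy(X, demo_labels)
print(X.shape, accuracy)
```

With real data you would pass the path of the exported data matrix to `load_feature_matrix` and take the group labels from the accompanying time-series information file.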
Hope this helps, and good luck!
Ben
Most users are within Universities with a Matlab license, so I haven’t come across issues with this (and the main analysis code is licensed non-commercial anyway). But there are a couple of solutions if this comes up:
- Use an alternative (e.g., native R or native python) feature-extraction tool, such as those listed here. These have far fewer features.
- Use Matlab temporarily to derive a reduced set of useful features for your problem, and then implement them (or find non-Matlab implementations). This pipeline is demonstrated (and implemented) in catch22 and we currently have new reduced ones in development. The goal is to code them in C so they can be efficiently used in any programming language.
Hi @Poe thanks for the question, and welcome to neurostars!
Yes, hctsa is designed for supervised learning, but you can leverage its ability to extract thousands of features from time series for a wide variety of specific applications.
You could run clustering, or investigate low-dimensional projections of your data, etc. using the data from hctsa.
Some generic versions are done within hctsa (e.g., TS_PlotLowDim, which computes and visualizes a low-dimensional projection, and TS_Cluster, which does hierarchical clustering), but you can implement your own specific pipelines within Matlab to work from the HCTSA .mat files, or export to .csv and work from your own favorite environment (using OutputToCSV).
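As a rough illustration of the export-and-cluster route for unlabelled data, the sketch below runs k-means on a feature matrix in python. It is not hctsa code: the matrix is synthetic, the function names are made up for this example, and the farthest-first initialisation is just one simple, deterministic choice.

```python
import numpy as np


def farthest_first_centroids(X, k):
    """Pick k starting centroids: row 0, then repeatedly the row that is
    farthest from every centroid chosen so far."""
    centroids = [X[0]]
    while len(centroids) < k:
        dist_to_nearest = np.min(
            np.linalg.norm(X[:, None, :] - np.array(centroids)[None, :, :], axis=2),
            axis=1,
        )
        centroids.append(X[np.argmax(dist_to_nearest)])
    return np.array(centroids)


def kmeans(X, k, n_iter=100):
    """Plain Lloyd's k-means; returns (labels, centroids)."""
    centroids = farthest_first_centroids(X, k)
    for _ in range(n_iter):
        # Assign each row to its nearest centroid.
        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Recompute centroids; keep the old one if a cluster ends up empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


# Synthetic stand-in for a (time series x features) matrix with two
# well-separated groups of 20 series each.
rng = np.random.default_rng(1)
features = np.vstack([
    rng.normal(0.0, 0.5, size=(20, 5)),
    rng.normal(10.0, 0.5, size=(20, 5)),
])
# z-score each feature so that no single feature dominates the distances.
features = (features - features.mean(axis=0)) / features.std(axis=0)

labels, centroids = kmeans(features, k=2)
print(labels)
```

The same pattern applies to any other unsupervised method: once the features are in a numeric matrix, hctsa is no longer in the loop.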
Hope this helps—good luck
Ben
Hi @fskinner welcome to neurostars!
Cross-frequency coupling is a very specific analysis that doesn’t really fit with hctsa, although you can be creative with the hctsa feature set (which goes far beyond decompositions into frequency bands), and maybe come up with some different features that capture multivariate dependencies. We have some work on this in development.
Best,
Ben
Thanks for the question @roxana.zeraati and welcome!
Yes—you can do this by labeling the clusters and just running the pipeline as I described (e.g., in the tutorial with EEG data). You could see if the features you know should come up do come up (or add your implementations to the hctsa feature set so that they are guaranteed to come up).
The default analyses (like TS_TopFeatures) will do this, and you can compare where the features you expect to come up sit amongst the other possible features.
You could also do multivariate learning on the feature space (e.g., start with your baseline features and test whether any features, in combination with the baseline features, can significantly improve the classification of the two groups). In the simplest case you could just compare your baseline set of B features with new sets of B+1 features, each of which adds one individual feature. You can then test for a significant (e.g., cross-validated) improvement in performance across these B+1 sets (and assess significance relative to null/random features).
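The B versus B+1 comparison can be sketched roughly as follows. This is an illustration under stated assumptions rather than an hctsa routine: the data are synthetic, the function names are hypothetical, a nearest-centroid classifier with leave-one-out cross-validation stands in for your classifier of choice, and the null distribution comes from appending pure-noise features to the baseline set.

```python
import numpy as np


def loo_accuracy(X, labels):
    """Leave-one-out accuracy of a nearest-centroid classifier."""
    classes = np.unique(labels)
    n = len(labels)
    correct = 0
    for i in range(n):
        train = np.arange(n) != i  # hold out row i
        centroids = np.array(
            [X[train & (labels == c)].mean(axis=0) for c in classes]
        )
        nearest = classes[np.argmin(np.linalg.norm(centroids - X[i], axis=1))]
        correct += int(nearest == labels[i])
    return correct / n


def incremental_gains(X, labels, baseline_idx):
    """Accuracy of the baseline set, and the gain from adding each other feature."""
    base = loo_accuracy(X[:, baseline_idx], labels)
    gains = {}
    for j in range(X.shape[1]):
        if j not in baseline_idx:
            gains[j] = loo_accuracy(X[:, baseline_idx + [j]], labels) - base
    return base, gains


def null_gains(X, labels, baseline_idx, n_null=100, seed=0):
    """Gains obtained by adding a pure-noise feature to the baseline set."""
    rng = np.random.default_rng(seed)
    X_base = X[:, baseline_idx]
    base = loo_accuracy(X_base, labels)
    gains = []
    for _ in range(n_null):
        noise = rng.standard_normal((len(labels), 1))
        gains.append(loo_accuracy(np.hstack([X_base, noise]), labels) - base)
    return np.array(gains)


# Synthetic data: two groups of 20 series and three features.
rng = np.random.default_rng(0)
labels = np.array([0] * 20 + [1] * 20)
X = np.column_stack([
    rng.normal(0.5 * labels, 1.0),        # feature 0: weak baseline feature
    rng.normal(6.0 * labels - 3.0, 0.3),  # feature 1: strongly discriminative
    rng.normal(0.0, 1.0, size=40),        # feature 2: pure noise
])

baseline_idx = [0]
base, gains = incremental_gains(X, labels, baseline_idx)
null = null_gains(X, labels, baseline_idx)

best = max(gains, key=gains.get)
# Permutation-style p-value: how often does a noise feature do at least as well?
p_value = (np.sum(null >= gains[best]) + 1) / (len(null) + 1)
print(base, gains, best, p_value)
```

When screening thousands of candidate features this way, the p-values would also need a correction for multiple comparisons.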
Just an idea—but good luck!
Yes, slides have been posted to the sched event, or are here.
Will send the video recording to CNS today and hopefully it will be up on their YouTube channel soon.
Hi @gmschroe
Thanks for the question and welcome to the neurostars community!
See some suggestions in my response to @Jaber_Al_Nahian below.
A simple analysis would concatenate the feature sets corresponding to each univariate element of your system (or do some dimensionality reduction to capture the unique univariate dynamical signatures), and perhaps add some multivariate features. How best to formulate it depends on your problem, and on what question you’re asking that is related to finding good statistics of univariate dynamics (which is what hctsa is designed to do).
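A minimal python sketch of this concatenation idea is below. The three per-channel features (mean, standard deviation, lag-1 autocorrelation) are stand-ins chosen for illustration; in practice you would substitute hctsa or catch22 features. The function names are hypothetical, and Pearson correlation between channel pairs is just one simple choice of multivariate feature.

```python
import numpy as np


def univariate_features(x):
    """Three illustrative features of a single channel:
    mean, standard deviation, and lag-1 autocorrelation."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    denom = np.sum(centered ** 2)
    lag1 = np.sum(centered[:-1] * centered[1:]) / denom if denom > 0 else 0.0
    return np.array([x.mean(), x.std(), lag1])


def multivariate_feature_vector(recording):
    """Build one feature vector for a recording of shape (n_channels, n_samples).

    Layout: per-channel features for channel 0, channel 1, ..., followed by
    the Pearson correlation of every channel pair (upper triangle, row order).
    """
    recording = np.asarray(recording, dtype=float)
    per_channel = np.concatenate([univariate_features(ch) for ch in recording])
    corr = np.atleast_2d(np.corrcoef(recording))
    pairwise = corr[np.triu_indices(recording.shape[0], k=1)]
    return np.concatenate([per_channel, pairwise])


# Synthetic 4-channel recording; channel 1 is a scaled copy of channel 0,
# so their correlation should be (numerically) 1.
rng = np.random.default_rng(0)
recording = rng.standard_normal((4, 500))
recording[1] = 2.0 * recording[0] + 1.0

vector = multivariate_feature_vector(recording)
print(vector.shape)
```

With F features per channel and C channels, the vector has C*F + C*(C-1)/2 entries, which is why a reduced feature set matters for recordings with many electrodes or voxels.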
Hope this helps, thanks for the interest in our work, and good luck!
Ben
Hi @ben.fulcher! Thanks for the tutorial, and all the discussion (and the music!).
I do understand the difficulties of moving away from Matlab. There’s a whole ecosystem around it now (neuroimaging is also highly Matlab reliant, much more than comp-neuro), and moving individual tools away is hard because of all the dependencies that also need to be moved away. However, I do think it’s worth a start, so I’m very happy to hear that there is some effort to move to non-Matlab platforms.
I do get this response from lots of folks, but it’s only true for universities in developed countries, not for developing countries. Also, individuals not affiliated with universities are totally left out because individual Matlab licenses are quite expensive. So, Matlab prevents tools from being really open (and not just open for academics at well-funded universities).
Great. I’ve already added catch22 to our NeuroFedora queue, so we’ll have that included soon. I’ll also file a ticket upstream to request that it be tested with GNU Octave so that users can use it there without needing Matlab.
If an hctsa user could test it out with GNU Octave too, that would perhaps be a start. (As an example, the SPM toolkit for neuroimaging supports both Matlab and GNU Octave: https://en.wikibooks.org/wiki/SPM/Octave#GNU_Octave)