PyMVPA Hyperalignment Tutorial error thrown

pymvpa
hyperalignment

#1

I am running through the hyperalignment tutorial (http://www.pymvpa.org/examples/hyperalignment.html), but am getting errors while running the Searchlight Hyperalignment code. The error output is pasted below. To clarify, I am using the given tutorial dataset (hyperalignment_tutorial_data.hdf5.gz) and went through the first few parts up until “Comparing the Results”.

In addition, I am getting a similar error with the “Regularized Hyperalignment” code, and a ValueError is thrown by the “Similarity Structures” code; that error output is pasted below as well.

Is there part of the setup I’m missing? Thank you in advance!

Error output from “Searchlight Hyperalignment” and “Regularized Hyperalignment” (https://docs.google.com/document/d/1Z8LKwz4795mNmZTK07GhLfgziB0Qd3OILeheLRzRQt4/edit?usp=sharing). An excerpt is included below:

Performing classification analyses...
  between-subject (searchlight hyperaligned)...
---------------------------------------------------------------------------
DegenerateInputError                      Traceback (most recent call last)
<ipython-input-175-b8ec4e405550> in <module>()
     25     # Searchlight Hyperalignment returns a list of mappers corresponding to
     26     # subjects in the same order as the list of datasets we passed in.
---> 27     slhypmaps = slhyper(ds_train)
        global slhypmaps = undefined
        global slhyper = SearchlightHyperalignment(sparse_radius=3, featsel=0.4, exclude_from_model=[], nblocks=1)

/anaconda2/envs/pymvaenv/lib/python2.7/site-packages/mvpa2/base/learner.pyc in train(self=ZScoreMapper(), ds=Dataset(array([], shape=(0, 24), dtype=float64),...ems=[]), a=DatasetAttributesCollection(items=[])))
    120         if got_ds and (ds.nfeatures == 0 or len(ds) == 0):
    121             raise DegenerateInputError(
--> 122                 "Cannot train learner on degenerate data %s" % ds)
        ds = Dataset(array([], shape=(0, 24), dtype=float64), sa=SampleAttributesCollection(items=[]), fa=FeatureAttributesCollection(items=[]), a=DatasetAttributesCollection(items=[]))
    123         if __debug__:
    124             debug(

DegenerateInputError: Cannot train learner on degenerate data <Dataset: 0x24@float64>

Error output for “Similarity Structures”:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-174-42943870f9c3> in <module>()
      2 anova = OneWayAnova()
      3 fscores = [anova(sd) for sd in ds_all]
----> 4 fscores = np.mean(np.asarray(vstack(fscores)), axis=0)
        global fscores = [Dataset(array([[ inf,  inf,  inf, ...,  inf, -inf,  inf]]), sa=SampleAttributesCollection(items=[]), fa=FeatureAttributesCollection(items=[ArrayCollectable(name='fprob', doc=None, value=array([nan, nan, nan, ..., nan, nan, nan]), length=58861)]), a=DatasetAttributesCollection(items=[])), Dataset(array([[ inf,  inf, -inf, ..., -inf,  inf,  inf]]), sa=SampleAttributesCollection(items=[]), fa=FeatureAttributesCollection(items=[ArrayCollectable(name='fprob', doc=None, value=array([nan, nan, nan, ..., nan, nan, nan]), length=60327)]), a=DatasetAttributesCollection(items=[]))]
        global np.mean = <function mean at 0x10d9b4b90>
        global np.asarray = <function asarray at 0x10d9a3b18>
        global vstack = <function vstack at 0x10e06b7d0>
        global axis = undefined
      5 # apply to full datasets
      6 ds_fs = [sd[:, fselector(fscores)] for sd in ds_all]

/anaconda2/envs/pymvaenv/lib/python2.7/site-packages/mvpa2/base/dataset.pyc in vstack(datasets=[Dataset(array([[ inf,  inf,  inf, ...,  inf, -in...8861)]), a=DatasetAttributesCollection(items=[])), Dataset(array([[ inf,  inf, -inf, ..., -inf,  in...0327)]), a=DatasetAttributesCollection(items=[]))], a=None, fa='drop_nonunique')
    744                              "datasets have varying attributes.")
    745     # will puke if not equal number of features
--> 746     stacked_samp = np.concatenate([ds.samples for ds in datasets], axis=0)
        stacked_samp = undefined
        global np.concatenate = <built-in function concatenate>
        ds.samples = array([[ inf,  inf, -inf, ..., -inf,  inf,  inf]])
        ds = Dataset(array([[ inf,  inf, -inf, ..., -inf,  inf,  inf]]), sa=SampleAttributesCollection(items=[]), fa=FeatureAttributesCollection(items=[ArrayCollectable(name='fprob', doc=None, value=array([nan, nan, nan, ..., nan, nan, nan]), length=60327)]), a=DatasetAttributesCollection(items=[]))
        datasets = [Dataset(array([[ inf,  inf,  inf, ...,  inf, -inf,  inf]]), sa=SampleAttributesCollection(items=[]), fa=FeatureAttributesCollection(items=[ArrayCollectable(name='fprob', doc=None, value=array([nan, nan, nan, ..., nan, nan, nan]), length=58861)]), a=DatasetAttributesCollection(items=[])), Dataset(array([[ inf,  inf, -inf, ..., -inf,  inf,  inf]]), sa=SampleAttributesCollection(items=[]), fa=FeatureAttributesCollection(items=[ArrayCollectable(name='fprob', doc=None, value=array([nan, nan, nan, ..., nan, nan, nan]), length=60327)]), a=DatasetAttributesCollection(items=[]))]
        global axis = undefined
    747 
    748     stacked_sa = {}

ValueError: all the input array dimensions except for the concatenation axis must match exactly

#2

The DegenerateInputError is probably caused by an essentially empty Dataset. In the error message from line 122, the offending object is Dataset(array([], shape=(0, 24), dtype=float64), sa=SampleAttributesCollection(items=[]), fa=FeatureAttributesCollection(items=[]), a=DatasetAttributesCollection(items=[])): it has 24 features but no samples at all (no time points, for typical fMRI data), and that leads to various problems downstream.
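The failure can be reproduced with plain NumPy; the guard below (check_trainable is my own name, not PyMVPA API) just mirrors the check visible at lines 120–122 of learner.py in the traceback:

```python
import numpy as np

# A (0, 24) array like the one in the traceback: 24 features, zero samples.
samples = np.empty((0, 24))

def check_trainable(samples):
    """Hypothetical early guard mirroring mvpa2's check in Learner.train():
    data with zero samples or zero features cannot be trained on."""
    n_samples, n_features = samples.shape
    if n_samples == 0 or n_features == 0:
        raise ValueError("Cannot train learner on degenerate data: "
                         "%d samples x %d features" % (n_samples, n_features))

try:
    check_trainable(samples)
except ValueError as e:
    print(e)  # Cannot train learner on degenerate data: 0 samples x 24 features
```

Checking the shape of your datasets right after loading (and after any masking or splitting step) would catch this long before the ZScoreMapper fails deep inside SearchlightHyperalignment.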

For the 2nd error, I would first check what produced these inf and -inf F-scores; it might be related to the 1st error. Note also that the two datasets in the traceback have different numbers of features (58861 vs. 60327), which is exactly why np.concatenate refuses to stack them.
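As a quick diagnostic (plain NumPy, not PyMVPA API), you can scan the F-scores for non-finite entries before stacking and averaging them:

```python
import numpy as np

# Toy F-score array standing in for one element of fscores;
# the inf/-inf/nan pattern matches what the traceback shows.
fscores = np.array([[1.2, np.inf, 0.5, -np.inf, np.nan]])

bad = ~np.isfinite(fscores)
print("non-finite entries:", bad.sum())            # 3
print("at feature indices:", np.flatnonzero(bad))  # [1 3 4]
```

If most or all scores come back non-finite, the ANOVA itself degenerated, which is consistent with a design that has only a single condition.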


#3

Looking in more detail, I realized that the Searchlight Hyperalignment error comes from my fMRIPrep output (the previous sections of the tutorial refer to the dataset as ‘datasets’, while the searchlight hyperalignment section refers to it as ‘ds_all’). My masked volumetric data is resting-state data, which means I have only one value for ‘target’ and one for ‘chunk’: ‘rest’ and ‘0’, respectively. Is this the correct way to set up the dataset?

If this is true, what should the training and test datasets be, given that the tutorial (http://www.pymvpa.org/examples/hyperalignment.html) performs leave-one-run-out? When we split the dataset into train_ds and test_ds, we run into the error of train_ds being empty, since there is only one run (the resting-state data).
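The empty-train_ds problem can be sketched with plain NumPy (the attribute names below only mimic PyMVPA's targets/chunks convention; this is an illustration, not tutorial code). Leave-one-run-out keeps the samples whose chunk differs from the held-out run, so with a single run nothing is left to train on:

```python
import numpy as np

# Ten resting-state volumes, all from a single run: one target, one chunk.
targets = np.array(['rest'] * 10)
chunks = np.zeros(10, dtype=int)    # every sample belongs to run 0

held_out = 0                        # leave-one-run-out: test on run 0...
train_mask = chunks != held_out     # ...and train on everything else
test_mask = chunks == held_out

print("unique targets:", np.unique(targets))   # ['rest']
print("unique chunks:", np.unique(chunks))     # [0]
print("training samples:", train_mask.sum())   # 0  <- empty train_ds
print("test samples:", test_mask.sum())        # 10
```

That zero-sample training set is precisely the degenerate Dataset(array([], shape=(0, 24), ...)) from the first traceback, so any cross-validation scheme here would need more than one chunk value to work.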