Compute p-value for correlation of 2 3-D Arrays

I have 2 3-D arrays (i.e. NIfTI files):

  1. contains scores for how much stimulus-evoked activity in each voxel is influenced by a pharmacological treatment
  2. contains scores for how much each voxel is populated by a certain receptor class

I would like to compute a p-value for the null hypothesis that the receptor class distribution does not predict the impact of the pharmacological treatment on stimulus-evoked activity.

I can obtain a similarity metric for the two images (e.g. via ANTs). I assume that, for the case at hand, global correlation sampled at 100% of voxels might be appropriate.

That would give me an “absolute” score for how well the values in one array match those in the other. Sadly, this is also influenced by details such as dynamic range. Not least, a simple similarity metric from ANTs comes with no confidence interval.

Ideally I would like to know how well this specific arrangement of the specific values from one array matches the other array. I am thinking this could be done by permuting all the values of one array, computing a similarity score for each permutation, and counting how many perform better vs. worse than the actual arrangement. This could certainly be done, but it’s quite a bit of work, and as it stands it would take forever to execute.

Is there an existing library which can calculate a similarity metric between two 3-D arrays and give a confidence interval?

If you call Y the values of one array and X the values of the other, statsmodels can give you confidence intervals and p-values for the affine relationship
Y = cst + X * beta + error
So, I would start with that.
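
A rough sketch of what that could look like (untested; the random arrays just stand in for your flattened voxel values):

```python
import numpy as np
import statsmodels.api as sm

# Stand-in data: in practice x and y would be the two 3-D arrays
# flattened to 1-D vectors in the same voxel order.
rng = np.random.default_rng(0)
x = rng.normal(size=1000)              # e.g. receptor-density scores
y = 0.3 * x + rng.normal(size=1000)    # e.g. treatment-effect scores

X = sm.add_constant(x)                 # design matrix: intercept + X
fit = sm.OLS(y, X).fit()               # Y = cst + X * beta + error

print(fit.params)                      # [cst, beta]
print(fit.conf_int())                  # 95% confidence intervals
print(fit.pvalues)                     # p-values, incl. H0: beta == 0
```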
Your permutation idea is fine too – it avoids parametric assumptions when getting a p-value. It would actually not take that much time.
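
For illustration, a permutation test along the lines you describe could look roughly like this (here with plain Pearson correlation as the similarity metric, instead of the ANTs one):

```python
import numpy as np

def permutation_pvalue(x, y, n_perm=1000, seed=0):
    """Permutation p-value for the correlation between two flattened arrays.

    Shuffling y breaks the spatial pairing while keeping its value
    distribution (and hence dynamic range) intact.
    """
    rng = np.random.default_rng(seed)
    observed = abs(np.corrcoef(x, y)[0, 1])
    exceed = 0
    for _ in range(n_perm):
        r_perm = np.corrcoef(x, rng.permutation(y))[0, 1]
        if abs(r_perm) >= observed:
            exceed += 1
    # +1 correction so the estimated p-value is never exactly zero
    return (exceed + 1) / (n_perm + 1)
```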

HTH

I tried looking up “affine relationship”, and I’m not sure what that’s supposed to mean. I know affine as a transformation, but not as a similarity metric. Given the formula, however, it looks like another way to get at the global correlation (since in your example the correlation coefficient can be estimated from the error term of the GLM).

The reason why I believe it would take forever is that computing 100% sampled global correlation (GC) with ANTs for one pair of images already takes almost a minute. I guess to get a good estimate I would need at least a few thousand permutations. Even if I parallelize it, that leaves me with ~50h (per image pair). Given 20 subjects that’s already over a month.

Perhaps I could get away with fewer permutations? How can I know how many permutations I need? I assume this is a function of the size of my array?

Indeed, computing correlation is equivalent to testing for affine relationships between images.
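
As a quick illustration of that equivalence (stand-in data; scipy’s pearsonr and the OLS slope test give the same p-value):

```python
import numpy as np
from scipy import stats
import statsmodels.api as sm

rng = np.random.default_rng(1)
x = rng.normal(size=500)
y = 0.2 * x + rng.normal(size=500)

r, p_corr = stats.pearsonr(x, y)            # correlation test
ols = sm.OLS(y, sm.add_constant(x)).fit()   # affine fit: Y = cst + X * beta + error
print(p_corr, ols.pvalues[1])               # identical p-values
```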

Sorry, but I don’t get why you need to use ANTs. With two images, you get their correlation in a fraction of a second.
See e.g. https://github.com/nistats/nistats/pull/133
Maybe I’m missing something.
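
For example, something along these lines should run in well under a second per pair (the filenames are just placeholders, both images are assumed to be on the same grid, and restricting to voxels that are nonzero in both images is only a crude stand-in for a proper brain mask):

```python
import numpy as np
import nibabel as nib

# Placeholder filenames; both images assumed to share the same grid.
treatment = np.asarray(nib.load("treatment_effect.nii.gz").dataobj).ravel()
receptor = np.asarray(nib.load("receptor_density.nii.gz").dataobj).ravel()

# Keep only voxels that are nonzero in both images.
mask = (treatment != 0) & (receptor != 0)
r = np.corrcoef(treatment[mask], receptor[mask])[0, 1]
print("Pearson r:", r)
```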