Hey, I am trying to create results tables for an analysis that I ran using Nilearn’s `non_parametric_inference`

. And I ran into a slight inconvenience.

(If you have not tried working with Nilearn yet… stop with whatever you are doing and try it out right now, it’s awesome).

**TLDR**: I intend to report both uncorrected and corrected results. Since the `non_parametric_inference`

function does not include uncorrected p-value map in the results output, I generated it myself with some custom code. To my surprise, I found differences in cluster sizes when I compare my uncorrected p<0.001 results with the cluster-wise corrected results from `non_parametric_inference`

. I am unsure if I am running into a bug or whether I am making an error myself. So that’s why I am posting here instead of the GitHub issues page.

*How I got here*

My design matrix (`HCvMD_design_matrix`

) looks like this:

I have 2x18 brain maps, and 4 regressors. So, as far as my statistical knowledge goes, the degrees of freedom for this model is 36-4=32.

Next, I ran a non-parametric test, with a cluster threshold of 0.001:

```
HCvMD_out_dict = non_parametric_inference(
ls_imgs,
design_matrix=HCvMD_design_matrix,
mask=second_level_mask,
model_intercept=True,
second_level_contrast="disorder",
two_sided_test=True,
smoothing_fwhm=6,
n_jobs=8,
threshold=0.001
)
```

[insert 4m43.9s waiting music here]

After I grabbed some coffee, I first wanted to see the uncorrected results table. I used the `get_clusters_table`

function. By means of trial and error (childhood habit), I found out that this function only produces sub-clusters for z-maps, and not for 10logP maps. So, I masked the t-value map from `non_parametric_inference`

using cluster-size error control map, converted the map into Z values, and then produced the cluster table:

```
# Create t to z conversion function
def t_to_z(t_scores, deg_of_freedom):
p_values = scipy.stats.t.sf(t_scores, df=deg_of_freedom)
z_values = scipy.stats.norm.isf(p_values)
return z_values
# Set threshold
unc_pval=0.001
dof = len(ls_subjects) - 4 # DOF: group, covar1, covar2, intercept
threshold = stats.t.isf(unc_pval / 2, df=dof) # two-sided
# Create mask for p<0.001 uncorrected threshold
HCvMD_thr_mask_img = math_img(f'(img < {-threshold}) | (img > {threshold})', img=HCvMD_out_dict['t'])
HCvMD_masked_data = masking.apply_mask(HCvMD_out_dict['t'], HCvMD_thr_mask_img)
HCvMD_masked_img = masking.unmask(HCvMD_masked_data, HCvMD_thr_mask_img)
# Convert to z-map
z_img = new_img_like(
HCvMD_masked_img, t_to_z(get_data(HCvMD_masked_img), deg_of_freedom=dof))
# Create cluster table
clusters = get_clusters_table(z_img, stat_threshold=0, cluster_threshold=10, two_sided=True)
clusters = clusters.round({'X': 0, 'Y': 0, 'Z': 0})
clusters.style.format({'X': '{:,.0f}'.format,
'Y': '{:,.0f}'.format,
'Z': '{:,.0f}'.format})
```

This gives me the following table:

Cluster ID | X | Y | Z | Peak Stat | Cluster Size (mm3) | |
---|---|---|---|---|---|---|

0 | 1 | 4 | -86 | 10 | 4.876244 | 5552 |

1 | 1a | -18 | -78 | 8 | 4.592168 | |

2 | 2 | 20 | -76 | 8 | 4.810282 | 1312 |

3 | 3 | 16 | -20 | -4 | 3.854187 | 496 |

4 | 4 | -40 | -74 | -50 | 3.754792 | 152 |

5 | 5 | -8 | -16 | -0 | 3.725331 | 96 |

6 | 1 | 52 | -66 | 14 | -3.508029 | 96 |

Now, in theory, if I find any significant clusters using the cluster-size error control p-map, the cluster sizes of these surviving clusters should be the same, since I used a threshold of 0.001 (two-sided) in both cases. Right?

The resulting cluster-wise error corrected contrast looks like this:

Oh cool! I found something!.

This could be the first cluster from the table above. (Obviously I know MNI coorinates from the top of my head, like everyone else ).

Let’s have a look…

```
# Create mask for p<0.05 cluster-size error corrected
HCvMD_thr_mask_img2 = binarize_img(HCvMD_out_dict['logp_max_size'], threshold=1.3)
HCvMD_masked_data2 = masking.apply_mask(HCvMD_out_dict['t'], HCvMD_thr_mask_img2)
HCvMD_masked_img2 = masking.unmask(HCvMD_masked_data2, HCvMD_thr_mask_img2)
# Convert to z-map
z_img = new_img_like(
HCvMD_masked_img2, t_to_z(get_data(HCvMD_masked_img2), deg_of_freedom=dof))
# Create cluster table
clusters = get_clusters_table(z_img, stat_threshold=0, cluster_threshold=10, two_sided=True)
clusters = clusters.round({'X': 0, 'Y': 0, 'Z': 0})
clusters.style.format({'X': '{:,.0f}'.format,
'Y': '{:,.0f}'.format,
'Z': '{:,.0f}'.format})
```

(yes, I know, I could’ve used more infomative variable names, but this was my quick and dirty attempt…)

This gives me the following table:

Cluster ID | X | Y | Z | Peak Stat | Cluster Size (mm3) | |
---|---|---|---|---|---|---|

0 | 1 | 4 | -86 | 10 | 4.876244 | 5744 |

1 | 1a | -18 | -78 | 8 | 4.592168 |

Ah, that’s the same cluster indeed! However… everything looks the same except for the cluster size.

I figured that this must have something to do with the initial thresholding. So, I started changing the DOF value in my uncorrected result (that childhood habit again…) and found the exact same cluster size result when I changed `dof = len(ls_subjects) - 4`

to `dof = len(ls_subjects) - 2`

.

So, now I am wondering what’s going wrong. Did I make a mistake in my assumptions? Or is there something else going on?

In the `permuted_ols`

source code, I see that covariates are first regressed out of the data before the permutation tests are being run on the residuals (correct me if I’m wrong). And if I run through the code with a debugger, the code uses 34 DOF instead of 32 DOF. In the line `dof = n_samples - (n_regressors + n_covars)`

, `n_regressors`

is set to 1 and `n_covars`

is also set to 1 but I do not understand why.

Do you have any idea @ymzayek?