Thanks, Bertrand!
Yeah, I figured that too, looking at the source code - Bonferroni threshold is calculated as an ‘inverse survival function’ (inf) of alpha/n_voxels. And that returns a z-score.
And apparently is done like that independent on what is the input image stat_img in the threshold_stats_img. So, even if my stat_img is t-value image or F-value image, the returned threshold is still a z-score for alpha/n_voxels. And applying that z-score threshold to F-value image is just wrong.
That’s quite dissapointing. And also in this way thresholding fMRI images is rather useless (correcting for the number of voxels rather than resels).

Are there any other Python based fMRI statistical analysis tools available?

Interesting point.
To me it would not make sense to create Bonferroni control for all possible statistical decision function. It makes more sense to do it for z-values only. What you are saying is that F-to-z and t-2-z converters are lacking. We would indeed add them.
(Currently, Nilearn GLMs directly return z-scale variables ; the point is that bare t/F statistical maps are not interpretable).
Best,
Bertrand

It is just not clear that nilearn.glm.threshold_stats_img(stat_img) implies that the stat_img should always be z-scaled image. It’s missleading. I assumed it can be any stats image.

I have now discovered that Nilearn also has a non_parametric_inference module. That’s at least helpful to overcome the lack of RFT implementation!

It is in the doctring. nilearn.glm.threshold_stats_img - Nilearn
Do you think it should be more explicit ?
Glad that you could indeed use non-parametric inference. This is certainly a good solution —albeit more expensive.
Best,
Bertrand