@ltetrel and @effigies – Thank you both for your responses and advice; they were really helpful for starting to work out a solution.
I have a few follow-up thoughts –
I was reading this thread:
.. which is leaving me confused as to whether using --random-seed with fmriprep actually produces deterministic results. As of the end of that thread in September, it seems like the answer is no, but that it’s obviously an fmriprep issue, and one that may also apply to any other preprocessing pipeline. After reading that thread, it sounds like there is still some confusion and the question may not be resolved among the developers. Maybe this isn’t a question for either of you, but if not, please point me in the right direction so I can find some updated info about that!
Much of the variability is coming from fmriprep, but some is still introduced in the first-level analysis when it is run on the same fmriprep outputs (thanks again @effigies for the great troubleshooting suggestion).
I need to decide whether we want to try to implement a direct-reproduction option in our lab pipeline (i.e. build in a toggle to set the seeds or not), but first I’d really like to understand whether that is going to be effective and whether it makes sense. @effigies your comment about a double-edged sword is a little confusing to me w/r/t this question:
a result that depends on a random seed is problematic.
If the seed is not fixed and I get a single output, isn’t the same thing still true, just in a way that isn’t reproducible? As in, the results still depend on the seed, even if I am not the one setting it. For the results to really not depend on the seed at all, we’d have to do something like what Winkler suggested in the thread above, which is basically to run the process many times and merge the outputs across iterations, so that the results do not depend on any single fixed (or random) seed.
Could you explain the logic of why a process dependent upon a fixed seed is more problematic than results dependent upon a random seed? (I think I might be missing something about how it’s used?)
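To make the question concrete, here is a toy sketch of how I am picturing it. This is not fmriprep code: `noisy_pipeline` is a hypothetical stand-in for any seed-dependent step, and the subsampling, the seed values, and the 200 iterations are all assumptions chosen purely for illustration.

```python
import numpy as np


def noisy_pipeline(data, seed=None):
    """Hypothetical stand-in for a processing step that uses random numbers."""
    rng = np.random.default_rng(seed)  # seed=None draws fresh OS entropy
    # Random subsampling makes the output depend on the RNG state.
    idx = rng.choice(data.size, size=data.size // 10, replace=False)
    return data[idx].mean()


# Toy "dataset" (generated with its own fixed seed so the example is repeatable).
data = np.random.default_rng(0).normal(loc=1.0, scale=2.0, size=10_000)

# Fixed seed: reproducible, but still one arbitrary draw.
fixed_a = noisy_pipeline(data, seed=42)
fixed_b = noisy_pipeline(data, seed=42)

# No seed: also one arbitrary draw, just not a repeatable one.
unseeded = noisy_pipeline(data)

# Many seeds, merged: the result no longer hinges on any single seed.
per_seed = np.array([noisy_pipeline(data, seed=s) for s in range(200)])
merged = per_seed.mean()

print("fixed seed, two runs:", fixed_a, fixed_b)
print("unseeded run:", unseeded)
print("spread across seeds (std):", per_seed.std())
print("merged estimate:", merged, "| full-data mean:", data.mean())
```

In this toy case the fixed-seed and unseeded runs each give one sample from the same spread of possible outputs; only the merged estimate reduces the dependence on the seed itself.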
Thanks again for your help.