Fmriprep work-dir with an incomplete dataset

vlad · September 7, 2018, 5:58pm

Dear experts,

I have a fairly basic question about work-dir and running fmriprep multiple times.

When dealing with an incomplete dataset, say:

50 participants completed task-rest and
35 participants completed task-2back

is it a good approach with fmriprep to:

execute it twice, once for each of the tasks, using --task-id
and (optionally) use --participant_label to split the work between two (or more) machines/terminals…

(and if yes) should one specify:

different work-dir for each of the two executions,
- like -w scratch-rest for the first execution and -w scratch-2back for the second
- I reckon that this would be advised if the second execution can overwrite some files from the first execution
different out-dir for each of the two executions (most likely not as HTML reports can include number of functional runs)

Or is it OK to specify the very same work-dir for all the runs/tasks/participants, as no files will be overwritten?

ChrisGorgolewski · September 7, 2018, 6:42pm

There would be no benefit to running fmriprep twice, once for each task. FMRIPREP handles inconsistencies between participants and processes all available data for each participant.

The only benefit would come from running multiple instances of FMRIPREP on multiple compute nodes/machines for non-overlapping sets of participants. In that case each instance should have a separate working directory, but they should all write to the same output directory (although you can also consolidate outputs when they are done).
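For anyone wanting to script this, here is a minimal Python sketch of the setup described above: participants are split into non-overlapping batches, every batch writes to the same output directory, and every batch gets its own working directory. The paths, the batch count, and the `work-batchNN` naming are hypothetical; only `--participant_label` and `-w` come from this thread. The sketch only builds the command lines and does not launch anything.

```python
from pathlib import Path


def chunk(labels, n_batches):
    """Split participant labels into non-overlapping, roughly equal batches."""
    batches = [labels[i::n_batches] for i in range(n_batches)]
    return [batch for batch in batches if batch]


def build_commands(bids_dir, out_dir, work_root, labels, n_batches):
    """Return one fmriprep argument list per batch.

    All batches share out_dir; each batch gets its own working directory.
    """
    commands = []
    for idx, batch in enumerate(chunk(labels, n_batches)):
        # Hypothetical naming scheme for the per-instance working directory.
        work_dir = Path(work_root) / f"work-batch{idx:02d}"
        commands.append(
            ["fmriprep", str(bids_dir), str(out_dir), "participant",
             "--participant_label", *batch,
             "-w", str(work_dir)]
        )
    return commands


if __name__ == "__main__":
    labels = [f"{i:02d}" for i in range(1, 7)]  # hypothetical participant labels
    for cmd in build_commands("/data/bids", "/data/derivatives", "/scratch", labels, 2):
        print(" ".join(cmd))
```

Each printed line could then be submitted to a different node or terminal, for example through a scheduler or `subprocess.run`.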

vlad · September 7, 2018, 8:20pm

@ChrisGorgolewski thank you for your comprehensive answer.

It is great to know that FMRIPREP will handle the data inconsistencies when it is executed once for all the sessions/tasks/etc.

Just to clear things up for anyone in the future: would FMRIPREP also handle such inconsistencies if executed twice, once for each of the tasks or for some subset of tasks? One situation where I think this would be desirable is when a study contains a number of tasks, say 4-5, but a subset of the results is needed sooner than the rest, e.g., when a conference submission deadline is approaching.

In such a case, after the first execution of FMRIPREP is done, e.g., for task-1 and task-2, and the out-dir and work-dir are complete, what approach would you recommend for handling task-3 and task-4, which haven’t been processed yet?

(1) tell FMRIPREP to use new out-dir-2 and new work-dir-2 for the following preprocessing
(2) tell FMRIPREP to use the old out-dir (same as in the first execution) but use a new work-dir-2
(3) reuse both directories out-dir and work-dir (if it is known that no files will be overwritten, except for the HTML reports etc.).

mtnhuck · September 7, 2018, 8:32pm

To follow up on this: is there a way to apply slice-timing correction to one type of task but not others? For example, we have a short TR for resting-state (1190 ms), so slice-timing correction is unlikely to be useful, while for other tasks we have a longer TR (2.5 s) where it is probably necessary.
Thanks!

effigies · September 7, 2018, 8:38pm

The easy way to do it is to remove the SliceTiming metadata from the tasks that you don’t want to perform STC on.

The slightly more complicated way would be to run fMRIPrep separately for each task (using the -t flag), and use --ignore slicetiming for the task you don’t want to run STC on.
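As a rough illustration of the first option, the sketch below removes the `SliceTiming` key from the BOLD JSON sidecars of one task. It assumes standard BIDS sidecar names containing `task-<label>_` and ending in `bold.json`; the function name is made up for this example. It edits files in place, so it is safer to run on a copy of the dataset, and because it searches recursively it will also touch matching files in any derivatives folder nested inside the dataset.

```python
import json
from pathlib import Path


def strip_slice_timing(bids_dir, task):
    """Remove SliceTiming from the BOLD sidecars of one task; return edited paths."""
    edited = []
    # Assumed naming: e.g. sub-01_task-rest_bold.json or sub-01_task-rest_run-1_bold.json.
    # A top-level task-rest_bold.json would match as well.
    pattern = f"*task-{task}_*bold.json"
    for sidecar in sorted(Path(bids_dir).rglob(pattern)):
        meta = json.loads(sidecar.read_text())
        if meta.pop("SliceTiming", None) is not None:
            sidecar.write_text(json.dumps(meta, indent=2))
            edited.append(sidecar)
    return edited
```

Calling `strip_slice_timing("/data/bids_copy", "rest")` (hypothetical path) would leave the sidecars of every other task untouched.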

ChrisGorgolewski · September 7, 2018, 8:45pm

This really should’ve been a new topic (it would’ve helped people find the answer).

This is not a well-supported use case. (3) would work, but I would rename the HTML reports first to avoid overwriting them.
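A small Python sketch of the renaming step, assuming the reports are files named `sub-*.html` sitting directly in one directory (check where your fMRIPrep version actually writes them). The function name and the tag are hypothetical.

```python
from pathlib import Path


def rename_reports(reports_dir, tag):
    """Rename sub-*.html to sub-*_<tag>.html in reports_dir; return the new paths."""
    renamed = []
    for report in sorted(Path(reports_dir).glob("sub-*.html")):
        if report.stem.endswith(f"_{tag}"):
            continue  # already renamed in an earlier pass
        target = report.with_name(f"{report.stem}_{tag}.html")
        if target.exists():
            raise FileExistsError(target)  # refuse to clobber an existing file
        report.rename(target)
        renamed.append(target)
    return renamed
```

For example, `rename_reports("/data/derivatives/fmriprep", "run1")` (hypothetical path and tag) would be run before starting the second execution.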

mtnhuck · September 7, 2018, 9:06pm

Thanks for the help, all, and sorry for not making it a new topic. I’ll make sure I do that going forward. Currently I am planning on implementing the 2nd method that @effigies mentioned.
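For reference, the 2nd method could be scripted roughly as follows. This is only a sketch that builds one command line per task, using the `-t` and `--ignore slicetiming` options mentioned above. The paths and the `work-task-<label>` naming are hypothetical, and giving each task its own working directory is a cautious choice here, not something this thread prescribes.

```python
def build_task_commands(bids_dir, out_dir, work_root, tasks, skip_stc):
    """Return one fmriprep argument list per task.

    Tasks listed in skip_stc get `--ignore slicetiming`.
    """
    commands = []
    for task in tasks:
        cmd = ["fmriprep", str(bids_dir), str(out_dir), "participant",
               "-t", task,
               # Hypothetical per-task working directory.
               "-w", f"{work_root}/work-task-{task}"]
        if task in skip_stc:
            cmd += ["--ignore", "slicetiming"]
        commands.append(cmd)
    return commands


if __name__ == "__main__":
    for cmd in build_task_commands("/data/bids", "/data/derivatives", "/scratch",
                                   ["rest", "2back"], skip_stc={"rest"}):
        print(" ".join(cmd))
```

If the runs share an output directory, the HTML reports from the earlier run may need renaming first, as noted above.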