Passing subject id to DataSink and using it to substitute file names

JohannesWiesner · February 1, 2021, 11:29am

I downloaded the HCP data set and am trying to learn how to use nipype with it. The first little task I set up for myself was to iterate over each subject’s folder, select the tfMRI_WM_LR.nii.gz file within that folder and copy it to another location. The iteration, file selection and copy process work, but I am struggling with how to dynamically add the subject id to the name of the file (like this: sub-{subject_id}_fMRI_WM_LR.nii.gz).

from nipype.interfaces.io import SelectFiles, DataSink
from nipype import Workflow,Node
from nipype.interfaces.utility import IdentityInterface

# set data directory
data_dir = "/data/"

# set a list of subject directory names
subject_list = ["955465","952863"]

# create a node that iterates over the subjects
infosource = Node(IdentityInterface(fields=["subject_id"]),name="infosource")
infosource.iterables = [("subject_id",subject_list)]

# create a node that selects the functional nifti files based on a template
templates = {"func":"{subject_id}/MNINonLinear/Results/tfMRI_WM_LR/tfMRI_WM_LR.nii.gz"}
selectfiles = Node(SelectFiles(templates,base_directory=data_dir),name="selectfiles")

# create a destination folder to put the images
datasink = Node(DataSink(base_directory="/output"),name="datasink")

# define substitutions for the names of the output files
datasink.inputs.substitutions = [('tfMRI_WM_LR.nii.gz','foo.nii.gz')]

# define workflow
wf = Workflow(name="choosing_subjects")

# pass the subject id from infosource to selectfiles in order to select the directories of interest
wf.connect(infosource, "subject_id",selectfiles,"subject_id")

# pass the subject id from infosource to datasink's 'container' input so that for each subject
# an individual output directory named after the subject id is created
wf.connect(infosource, "subject_id",datasink,"container")

# pass the selected files from selectfiles to datasink, which will place them in a subdirectory
# called 'func'
wf.connect(selectfiles,"func",datasink,"func")

# run workflow
wf.run()

How do I pass the subject_id from infosource to datasink so that datasink can use it to modify the filename in datasink.inputs.substitutions = [('tfMRI_WM_LR.nii.gz','foo.nii.gz')]?

JohannesWiesner · February 9, 2021, 2:33pm

Answering my own question here: Renaming a file using input parameters (such as the subject id) is possible via a Rename node (nipype.interfaces.utility.Rename). You can place this node between SelectFiles and DataSink to dynamically change the names of your files.
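
For anyone landing here later, this is a minimal sketch of how the Rename node could be wired into the workflow from the question. It is not tested against the HCP data. The node name "rename", the helper build_workflow and the constant FORMAT_STRING are names made up for illustration; the paths and subject ids are the ones from the original post. Rename creates one input per %(field)s placeholder in its format string, so subject_id can be connected to it like any other input, and keep_ext=True re-appends the original extension (.nii.gz).

```python
# Format string for the Rename node (illustrative constant, not part of nipype).
# Rename fills %(subject_id)s with Python %-style formatting.
FORMAT_STRING = "sub-%(subject_id)s_fMRI_WM_LR"


def build_workflow(data_dir="/data/", out_dir="/output",
                   subjects=("955465", "952863")):
    """Sketch: select one file per subject, rename it, then sink it."""
    # Imported inside the function so the format string above can be
    # tried out even where nipype is not installed.
    from nipype import Node, Workflow
    from nipype.interfaces.io import DataSink, SelectFiles
    from nipype.interfaces.utility import IdentityInterface, Rename

    # iterate over the subjects, as in the original script
    infosource = Node(IdentityInterface(fields=["subject_id"]), name="infosource")
    infosource.iterables = [("subject_id", list(subjects))]

    templates = {"func": "{subject_id}/MNINonLinear/Results/tfMRI_WM_LR/tfMRI_WM_LR.nii.gz"}
    selectfiles = Node(SelectFiles(templates, base_directory=data_dir), name="selectfiles")

    # Rename gets a 'subject_id' input because the format string mentions it;
    # keep_ext=True keeps the extension of the incoming file.
    rename = Node(Rename(format_string=FORMAT_STRING, keep_ext=True), name="rename")

    datasink = Node(DataSink(base_directory=out_dir), name="datasink")

    wf = Workflow(name="choosing_subjects")
    wf.connect(infosource, "subject_id", selectfiles, "subject_id")
    wf.connect(infosource, "subject_id", rename, "subject_id")
    wf.connect(selectfiles, "func", rename, "in_file")
    wf.connect(infosource, "subject_id", datasink, "container")
    # sink the renamed copy instead of the file selected by SelectFiles
    wf.connect(rename, "out_file", datasink, "func")
    return wf


# File name stem the format string yields for one subject
# (Rename appends the extension afterwards when keep_ext=True).
print(FORMAT_STRING % {"subject_id": "955465"})
```

Calling build_workflow().run() should then run it the same way as wf.run() in the question. The substitutions entry on DataSink is no longer needed for the renaming itself, since the file already arrives at DataSink with its new name.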