How to use DataSink to get a BIDS-style output folder structure?

Dear experts,

I am trying to use Nipype to analyze my data from a three-day paradigm. The raw data was organized in BIDS:

├── README
├── sub-001
│ ├── ses-d1
│ │ ├── anat
│ │ ├── dwi
│ │ ├── fmap
│ │ ├── func
│ │ └── sub-001_ses-d1_scans.tsv
│ ├── ses-d2
│ │ ├── fmap
│ │ ├── func
│ │ └── sub-001_ses-d2_scans.tsv
│ └── ses-d3
│   ├── func
│   └── sub-001_ses-d3_scans.tsv

After preprocessing, I would like DataSink to organize the output in the same fashion, as in the tree below, but I ran into an error.

├── sub-001
│ ├── ses-d1
│ │ ├── swrsub-001_ses-d1_task-exp_bold_roi.nii
│ │ ├── rsub-001_ses-d1_task-exp_bold_roi.nii
│ │ ├── swrsub-001_ses-d1_task-rest-1_bold_roi.nii
│ │ ├── rsub-001_ses-d1_task-rest-1_bold_roi.nii
│ │ ├── swrsub-001_ses-d1_task-rest-2_bold_roi.nii
│ │ └── rsub-001_ses-d1_task-rest-2_bold_roi.nii
│ ├── ses-d2
│ │ ├── swrsub-001_ses-d2_task-exp_bold_roi.nii
│ │ ├── rsub-001_ses-d2_task-exp_bold_roi.nii
│ │ ├── swrsub-001_ses-d2_task-rest-1_bold_roi.nii
│ │ ├── rsub-001_ses-d2_task-rest-1_bold_roi.nii
│ │ ├── swrsub-001_ses-d2_task-rest-2_bold_roi.nii
│ │ └── rsub-001_ses-d2_task-rest-2_bold_roi.nii
│ └── ses-d3
│   ├── swrsub-001_ses-d3_task-exp_bold_roi.nii
│   ├── rsub-001_ses-d3_task-exp_bold_roi.nii
│   ├── swrsub-001_ses-d3_task-rest-1_bold_roi.nii
│   └── rsub-001_ses-d3_task-rest-1_bold_roi.nii

I read the example in the tutorial and tried to apply the substitutions argument in my workflow (see https://gitlab.com/hcp4715/repdopa_nipy/-/blob/master/Repdopa_proc_reproduce_matlab.ipynb for the notebook).

Here is the input specification:

#ses_list = ['d1', 'd2', 'd3']
task_list=['exp', 'rest-1', 'rest-2', 'rest-3', 'rest-4']
ses_list = ['d2']
subject_list = ['001','002']
sf.iterables = [('subject_id', subject_list),
                ('ses_id', ses_list),
                ('task_id', task_list)]

Here is the DataSink setup:

output_folder = 'derivatives'
datasink = Node(DataSink(base_directory=join(output_dir),
                         container='datasink_preproc'),  # the name of the sub-folder of base_directory
               name = 'datasink')

substitutions = []
subjFolders = [('_ses_id_%s_subject_id_%s_task_id_%s' % (ses_id, sub, task_id), 'sub-%s/ses-%s/' % (sub, ses_id))
               for ses_id in ses_list 
               for sub in subject_list
               for task_id in task_list]

substitutions.extend(subjFolders)
datasink.inputs.substitutions = substitutions

spm_preproc.connect([(realign, datasink, [('realignment_parameters', '.@par')]),
                     (smooth, datasink, [('smoothed_files', '.@func')]), 
                     ]) 

However, the code always reports an error:

200402-16:47:11,105 nipype.workflow WARNING:
	 [Node] Error on "spm_preproc.datasink" (/media/hcp4715/Data/Data/RepDopa/BIDS/reprod_spm/derivatives/spm_preproc/_ses_id_d2_subject_id_002_task_id_rest-1/datasink)
200402-16:47:11,109 nipype.workflow INFO:
	 [Job 58] Cached (spm_preproc.gunzip_anat).
200402-16:47:13,18 nipype.workflow ERROR:
	 Node datasink.a9 failed to run on host hcp4715-Precision-5510.
200402-16:47:13,19 nipype.workflow ERROR:
	 Saving crash info to /media/hcp4715/Data/Data/RepDopa/BIDS/crash-20200402-164713-neuro-datasink.a9-9bd81876-38d8-409a-b416-3711a8e44f7f.pklz
Traceback (most recent call last):
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/plugins/multiproc.py", line 69, in run_node
    result['result'] = node.run(updatehash=updatehash)
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 472, in run
    result = self._run_interface(execute=True)
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 563, in _run_interface
    return self._run_command(execute)
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 643, in _run_command
    result = self._interface.run(cwd=outdir)
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/interfaces/base/core.py", line 378, in run
    outputs = self.aggregate_outputs(runtime)
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/interfaces/base/core.py", line 456, in aggregate_outputs
    predicted_outputs = self._list_outputs()
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/interfaces/io.py", line 719, in _list_outputs
    if d[0] == '@':
IndexError: string index out of range

200402-16:47:13,23 nipype.workflow ERROR:
	 Node datasink.a8 failed to run on host hcp4715-Precision-5510.
200402-16:47:13,24 nipype.workflow ERROR:
	 Saving crash info to /media/hcp4715/Data/Data/RepDopa/BIDS/crash-20200402-164713-neuro-datasink.a8-0530d22c-004c-4627-ab2f-eac07ec8c858.pklz
Traceback (most recent call last):
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/plugins/multiproc.py", line 69, in run_node
    result['result'] = node.run(updatehash=updatehash)
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 472, in run
    result = self._run_interface(execute=True)
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 563, in _run_interface
    return self._run_command(execute)
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/pipeline/engine/nodes.py", line 643, in _run_command
    result = self._interface.run(cwd=outdir)
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/interfaces/base/core.py", line 378, in run
    outputs = self.aggregate_outputs(runtime)
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/interfaces/base/core.py", line 456, in aggregate_outputs
    predicted_outputs = self._list_outputs()
  File "/opt/miniconda-latest/envs/neuro/lib/python3.6/site-packages/nipype/interfaces/io.py", line 719, in _list_outputs
    if d[0] == '@':
IndexError: string index out of range

The report from the datasink folder:

Node: datasink (io)
===================


 Hierarchy : spm_preproc.datasink
 Exec ID : datasink.a0


Original Inputs
---------------


* _outputs : {'.@par': '/media/hcp4715/Data/Data/RepDopa/BIDS/reprod_spm/derivatives/spm_preproc/_ses_id_d2_subject_id_001_task_id_exp/realign/rp_sub-001_ses-d2_task-exp_bold_roi.txt', '.@func': '/media/hcp4715/Data/Data/RepDopa/BIDS/reprod_spm/derivatives/spm_preproc/_ses_id_d2_subject_id_001_task_id_exp/smooth/swrsub-001_ses-d2_task-exp_bold_roi.nii'}
* base_directory : /media/hcp4715/Data/Data/RepDopa/BIDS/reprod_spm
* bucket : <undefined>
* container : datasink_preproc
* creds_path : <undefined>
* encrypt_bucket_keys : <undefined>
* local_copy : <undefined>
* parameterization : True
* regexp_substitutions : <undefined>
* remove_dest_dir : False
* strip_dir : <undefined>
* substitutions : [('_ses_id_d2_subject_id_001_task_id_exp', 'sub-001/ses-d2/'), ('_ses_id_d2_subject_id_001_task_id_rest-1', 'sub-001/ses-d2/'), ('_ses_id_d2_subject_id_001_task_id_rest-2', 'sub-001/ses-d2/'), ('_ses_id_d2_subject_id_001_task_id_rest-3', 'sub-001/ses-d2/'), ('_ses_id_d2_subject_id_001_task_id_rest-4', 'sub-001/ses-d2/'), ('_ses_id_d2_subject_id_002_task_id_exp', 'sub-002/ses-d2/'), ('_ses_id_d2_subject_id_002_task_id_rest-1', 'sub-002/ses-d2/'), ('_ses_id_d2_subject_id_002_task_id_rest-2', 'sub-002/ses-d2/'), ('_ses_id_d2_subject_id_002_task_id_rest-3', 'sub-002/ses-d2/'), ('_ses_id_d2_subject_id_002_task_id_rest-4', 'sub-002/ses-d2/')]

Previously, I asked for help in another post, "Nipype: SPM realignment multiple sessions". I hope this new post describes the issue more clearly.

Just after posting this issue, I planned to carry on without the substitutions.
But I found that even after commenting out the whole substitution part, the code still did not work.
So I re-read the error message and spotted that the @ was still there: the @ is not part of the substitutions but of the DataSink connection itself.

So the problem is here:

spm_preproc.connect([(realign, datasink, [('realignment_parameters', '.@par')]),
                     (smooth, datasink, [('smoothed_files', '.@func')]),
                     ])

I don't know the exact rules behind this function, but it seems that there must be some string before the ".@", and that was the problem.
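
For anyone who hits the same traceback: the failing line `if d[0] == '@':` suggests that DataSink splits each connection key on "." and inspects the first character of every piece. The sketch below is my simplified reconstruction of that behaviour in plain Python, not the actual Nipype code, and `folder_parts` is a hypothetical helper name. It shows why a key starting with a dot fails while `preproc.@par` does not.

```python
def folder_parts(key):
    """Return the sub-folder names implied by a DataSink-style key.

    Simplified reconstruction (an assumption based on the traceback):
    the key is split on ".", pieces starting with "@" are skipped,
    and every other piece becomes one folder level.
    """
    parts = []
    for d in key.split("."):
        if d[0] == "@":  # raises IndexError when d is the empty string
            continue
        parts.append(d)
    return parts


print(folder_parts("preproc.@par"))  # one folder level named "preproc"
print(folder_parts("@par"))          # no extra folder level

try:
    folder_parts(".@par")  # the leading dot yields an empty first piece
except IndexError as err:
    print("IndexError:", err)
```

If this reconstruction is right, a key without the leading dot (for example `@par`) should also avoid the error and place the files directly in the container folder, but I have only tested the `preproc.@par` form.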

Now I have changed the DataSink part:

output_folder = 'derivatives'
datasink = Node(DataSink(base_directory=join(output_dir),
                         container='datasink'),  # the name of the sub-folder of base_directory
               name = 'datasink')

# Use the following DataSink output substitutions
substitutions = []
# _ses_id_d2_subject_id_001_task_id_exp --> sub-001/ses-d2/
subjFolders = [('_ses_id_%s_subject_id_%s_task_id_%s' % (ses_id, sub, task_id), 'sub-%s/ses-%s/' % (sub, ses_id))
               for ses_id in ses_list 
               for sub in subject_list
               for task_id in task_list]

substitutions.extend(subjFolders)

datasink.inputs.substitutions = substitutions

spm_preproc.connect([(realign, datasink, [('realignment_parameters', 'preproc.@par')]),
                     (smooth, datasink, [('smoothed_files', 'preproc.@func')]), 
                     ]) 

Now it works just as I expected!

├── sub-001
│ └── ses-d2
│   ├── swrsub-001_ses-d2_task-exp_bold_roi.nii
│   ├── swrsub-001_ses-d2_task-rest-1_bold_roi.nii
│   ├── swrsub-001_ses-d2_task-rest-2_bold_roi.nii
│   ├── swrsub-001_ses-d2_task-rest-3_bold_roi.nii
│   └── swrsub-001_ses-d2_task-rest-4_bold_roi.nii
└── sub-002
  └── ses-d2
    ├── swrsub-002_ses-d2_task-exp_bold_roi.nii
    ├── swrsub-002_ses-d2_task-rest-1_bold_roi.nii
    ├── swrsub-002_ses-d2_task-rest-2_bold_roi.nii
    ├── swrsub-002_ses-d2_task-rest-3_bold_roi.nii
    └── swrsub-002_ses-d2_task-rest-4_bold_roi.nii
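
Writing one substitution pair per subject, session, and task gets long quickly. As far as I understand, the plain `substitutions` are applied as simple string replacements on the output path, and the `regexp_substitutions` input listed in the report above takes (pattern, replacement) pairs instead. The sketch below only uses Python's `re` module to show what a single pattern would do to one folder name. The pattern, the example path, and the helper name `to_bids_path` are my own assumptions, and I have not run this through DataSink.

```python
import re

# Hypothetical pattern: captures the session and subject labels and drops
# the task part, since the task is already encoded in the file name.
PATTERN = r"_ses_id_([A-Za-z0-9]+)_subject_id_([A-Za-z0-9]+)_task_id_[A-Za-z0-9-]+"
REPLACEMENT = r"sub-\2/ses-\1"


def to_bids_path(path):
    """Rewrite a parameterized folder name into sub-<label>/ses-<label>."""
    return re.sub(PATTERN, REPLACEMENT, path)


# Illustrative path only; the real layout depends on the workflow.
example = (
    "preproc/_ses_id_d2_subject_id_001_task_id_rest-1/"
    "swrsub-001_ses-d2_task-rest-1_bold_roi.nii"
)
print(to_bids_path(example))
```

If the pattern behaves the same way inside DataSink, something like `datasink.inputs.regexp_substitutions = [(PATTERN, REPLACEMENT)]` could replace the whole list comprehension, but please treat that as untested.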
