Sad times with Nipype caching

Greetings neurocitizens!

I’m encountering what I believe to be a bug in Deconvolve, an AFNI interface in Nipype. I’m trying to get impulse response function (IRF) data by passing `-iresp 1 sub-{self.subject_id}_IRF-all` to Deconvolve through its args input. I’m running Deconvolve with a Nipype memory cache.

Here’s the bug: When I run Deconvolve with a memory cache, I can’t find any IRF files in its output folder. But when I run Deconvolve on its own, with the Deconvolve.run() method, the IRF files appear. I’m puzzled that the interface is clearly able to produce IRF files but fails to do so when I run it with a cache.

My current Deconvolve method, which uses Deconvolve.run() (attached to a Python class that performs a 1st-level analysis):

def Deconvolve(self):
    """
    Runs the 1st-level regression on the smoothed functional image.

    Wraps AFNI's 3dDeconvolve.
    
    AFNI command info: https://afni.nimh.nih.gov/pub/dist/doc/htmldoc/programs/3dDeconvolve_sphx.html#ahelp-3ddeconvolve
    Nipype interface info: https://nipype.readthedocs.io/en/latest/api/generated/nipype.interfaces.afni.model.html#Deconvolve


    Returns
    -------
    InterfaceResult
        Stores information about the outputs of Deconvolve.

    """

    # Prepare regressor text files to scan into the interface.
    self._break_tsv(self.paths["events_tsv"], self.dirs["subject_info"])
    self._break_tsv(self.paths["regressors_tsv"], self.dirs["regressors"])
    
    # Total number of regressors to include in the analysis
    amount_of_regressors = 1 + len(self.regressor_names)

    # Create the argument string to pass to the interface, collapsing all whitespace.
    arg_string = ' '.join(f"""
        -input {self.results["SUSAN"].outputs.smoothed_file}
        -GOFORIT 4
        -polort A
        -num_stimts {amount_of_regressors}
        -stim_times 1 {self.dirs["subject_info"]/'onset'}.txt 'CSPLINzero(0,18,10)'
        -stim_label 1 all
        -iresp 1 sub-{self.subject_id}_IRF-all
        -fout
    """.replace("\n", " ").split())

    # Add individual stim files to the string.
    for i, regressor_name in enumerate(self.regressor_names):
        stim_number = i + 2
        stim_file_info = f"-stim_file {stim_number} {self.dirs['regressors']/regressor_name}.txt -stim_base {stim_number}"
        stim_label_info = f"-stim_label {stim_number} {regressor_name}"
        arg_string += f" {stim_file_info} {stim_label_info}"

    # Create output dir for Deconvolve stuff.
    deconvolve_dir = self.dirs["output"]/"nipype-interfaces-afni-model-Deconvolve"
    deconvolve_dir.mkdir(exist_ok=True)

    # Run the Deconvolve interface.
    return Deconvolve().run(
        cwd=str(deconvolve_dir),
        args=arg_string
    )

My old Deconvolve method, which uses caching:

def Deconvolve(self):
    """
    Runs the 1st-level regression on the smoothed functional image.

    Wraps AFNI's 3dDeconvolve.
    
    AFNI command info: https://afni.nimh.nih.gov/pub/dist/doc/htmldoc/programs/3dDeconvolve_sphx.html#ahelp-3ddeconvolve
    Nipype interface info: https://nipype.readthedocs.io/en/latest/api/generated/nipype.interfaces.afni.model.html#Deconvolve


    Returns
    -------
    InterfaceResult
        Stores information about the outputs of Deconvolve.

    """

    # Prepare regressor text files to scan into the interface.
    self._break_tsv(self.paths["events_tsv"], self.dirs["subject_info"])
    self._break_tsv(self.paths["regressors_tsv"], self.dirs["regressors"])
    
    # Total amount of regressors to include in the analysis
    amount_of_regressors = 1 + len(self.regressor_names)

    # Create string to pass to interface. Remove all unnecessary whitespace by default.
    arg_string = ' '.join(f"""
        -input {self.results["SUSAN"].outputs.smoothed_file}
        -GOFORIT 4
        -polort A
        -num_stimts {amount_of_regressors}
        -stim_times 1 {self.dirs["subject_info"]/'onset'}.txt 'CSPLINzero(0,18,10)'
        -stim_label 1 all
        -iresp 1 sub-{self.subject_id}_IRF-all
        -fout
    """.replace("\n", " ").split())

    # Add individual stim files to the string.
    for i, regressor_name in enumerate(self.regressor_names):
        stim_number = i + 2
        stim_file_info = f"-stim_file {stim_number} {self.dirs['regressors']/regressor_name}.txt -stim_base {stim_number}"
        stim_label_info = f"-stim_label {stim_number} {regressor_name}"
        arg_string += f" {stim_file_info} {stim_label_info}"

    # Create output dir for Deconvolve stuff.
    deconvolve_dir = self.dirs["output"]/"nipype-interfaces-afni-model-Deconvolve"
    deconvolve_dir.mkdir(exist_ok=True)

    # Run the Deconvolve interface.
    return self.memory.cache(Deconvolve)(
        args=arg_string
    )

I really like caching because I struggle to piece together bona fide Nipype workflows. What might I be doing wrong?

Unrelated bonus question: I totally overuse the “args” input that all Nipype interfaces have. It’s super convenient, because it lets me copy-paste my lab’s AFNI workflows straight into Python! Is there a downside to overusing “args”?

this may be a bug with the interface or with the use of args; it would require a bit of debugging.

the intent of Node in nipype, which is used for caching, is to create an isolated space in which to save the results of an interface. this requires that a process running in a given directory deposits its outputs inside that cache/working directory. many tools generate output not in the directory they were run in, but in the location of the input. it is possible that, because of the way the args input is being constructed, the interface is not returning the appropriate outputs. for debugging, could you please try setting the actual named inputs of the interface (instead of args) and see if that works with caching?
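for example, a cached call with named inputs might look something like the sketch below. this is untested: the trait names (in_files, stim_times, stim_label, goforit, fout, num_stimts) come from nipype’s AFNI model interface and should be checked against your installed version, and the file paths and cache location are hypothetical.

```python
from nipype.caching import Memory
from nipype.interfaces.afni import Deconvolve

mem = Memory(base_dir='/tmp/nipype_cache')  # hypothetical cache location

result = mem.cache(Deconvolve)(
    in_files=['sub-01_smoothed.nii.gz'],                     # hypothetical path
    goforit=4,
    num_stimts=1,
    stim_times=[(1, 'onset.txt', "'CSPLINzero(0,18,10)'")],  # hypothetical path
    stim_label=[(1, 'all')],
    fout=True,
    # options without a named input in your nipype version can stay in args:
    args='-polort A -iresp 1 sub-01_IRF-all',
)
```

with file-typed inputs like in_files and stim_times, nipype can hash the file contents, so the cache invalidates correctly when the data change.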

Is there a downside to overusing “args”?

this depends on what is in args: there are significant downsides if the string contains input files and you are using caching.

using a string doesn’t allow appropriate hashing of things like files. nipype has no idea whether an input file has changed or not, because it only hashes the value of the string. if your prior output changes but your filenames do not, using strings + memory cache will never re-trigger the node. so it’s a significant downside if you assume the memory cache will rerun when inputs change.

as a simple example, in the following scenario, if input_file changes, the output of your memory-cached interface will not change between subsequent executions, because the hash of the string args remains the same.

from nipype.caching import Memory

mem = Memory(base_dir='.')
mem.cache(SomeInterface)(args='-i /path/to/a/input_file')
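this failure mode can be demonstrated with plain python, no nipype required: hashing the argument string misses content changes that hashing the file itself would catch.

```python
import hashlib
import os
import tempfile

def hash_string(s):
    """Hash the literal argument string, as a string-keyed cache would."""
    return hashlib.md5(s.encode()).hexdigest()

def hash_file_contents(path):
    """Hash the file's bytes, as a file-aware cache would."""
    with open(path, 'rb') as f:
        return hashlib.md5(f.read()).hexdigest()

# First "run": write the input file and hash both ways.
path = os.path.join(tempfile.mkdtemp(), 'input_file')
with open(path, 'w') as f:
    f.write('run 1 data')
args = f'-i {path}'
string_hash_before = hash_string(args)
content_hash_before = hash_file_contents(path)

# Second "run": same filename, new contents.
with open(path, 'w') as f:
    f.write('run 2 data')
string_hash_after = hash_string(args)
content_hash_after = hash_file_contents(path)

# The string hash is unchanged, so a string-keyed cache would (wrongly) hit;
# the content hash differs, so a file-aware cache would rerun the node.
print(string_hash_before == string_hash_after)    # True
print(content_hash_before == content_hash_after)  # False
```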

on the other hand, if files are not part of args, then there are no direct downsides. however, you will lose the ability to programmatically query input values: for example, the value of the polort input to 3dDeconvolve.

from nipype.caching import Memory

mem = Memory(base_dir='.')
mem.cache(Deconvolve)(args='... -polort 42 ...')
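the difference is easy to see outside nipype (a plain-python sketch, not the nipype API): a structured input can be queried directly, while a flat args string has to be tokenized and searched.

```python
import shlex

# Named inputs: the value is directly queryable.
inputs = {'polort': 42, 'fout': True}
print(inputs['polort'])  # 42

# Flat args string: the value must be re-parsed out of the string.
args = '-input func.nii -polort 42 -fout'
tokens = shlex.split(args)
polort = tokens[tokens.index('-polort') + 1]
print(polort)  # the recovered value is a string, not an int
```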

I struggle to piece together bonafide Nipype workflows.

we would love your feedback on pydra, our new workflow engine, to make sure that we have made it easier, not harder, to piece together workflows :slight_smile: ! dataflows do make computation more structured and reproducible.

Thanks for trying to help! I hate to tell you this, but my PI and I decided today to give up on Nipype. From now on, we’re going to use fMRIPrep for preprocessing and AFNI for post-preprocessing. Nipype seems really cool, but I feel frustrated when I try to actually use it to build workflows. Also, the AFNI interfaces we need in Nipype are either missing options we need (Nipype’s Deconvolve doesn’t allow me to use “-iresp” as a parameter! I have no choice but to use “args” :frowning: ) or interfaces are missing altogether. (Nipype doesn’t have interfaces for 3dttest or 3dMEMA at all!)

That being said, I’m peeking at the Pydra GitHub now and it looks pretty nifty. I’ll have to read more about it :slightly_smiling_face:

no worries. i think creating workflows in nipype requires comfort with, and perhaps more importantly expertise in, a few things. if you are using fMRIPrep you are using Nipype and, more importantly, relying on a well-crafted and tested workflow. perhaps your nipype knowledge will be helpful if you have to debug it at any point.

in general though, nipype is not exhaustive and we rely on the community to either bring up missing interfaces/fields and/or to add/improve them.