I created a simple nipype pipeline for testing, and it worked when running locally, both without a plugin and with MultiProc. However, when I submitted the job to SLURM, the pipeline failed at the very first node. The crash file has the following error message:
The plugin I used is SLURMGraph, and I encountered the same problem with the SLURM plugin. I suspect that it has something to do with the pickle file. Unfortunately, I am not familiar with the way pickle works and wasn’t able to find out what the problem is.
Can someone suggest what I can try to debug this issue? I would greatly appreciate any suggestions.
I am using nipype 1.0.3 with Python 3.6.4.
Thank you for your help!
I would check the following things:
- Cleaning the working directory
- Making sure that the environment (nipype version, PYTHONPATH etc.) is the same for each compute node (BTW Singularity is a great tool to manage environments).
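One way to compare environments is to run the same small script on the login node and on a compute node (for example through `srun`) and diff the output. This is only a sketch: the function name `describe_environment` and the `module:` key prefix are made up for illustration, and it reports where a module would be imported from without importing it.

```python
import importlib.util
import os
import platform
import sys


def describe_environment(modules=("nipype",)):
    """Collect the details that most often differ between nodes."""
    info = {
        "host": platform.node(),
        "python": platform.python_version(),
        "executable": sys.executable,
        "PYTHONPATH": os.environ.get("PYTHONPATH", ""),
    }
    for name in modules:
        # find_spec returns None when the module cannot be found on sys.path
        spec = importlib.util.find_spec(name)
        info[f"module:{name}"] = spec.origin if spec else None
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

If a module shows up as `None` on the compute node only, the job is not seeing the same search path as your interactive shell.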
Many thanks for your answers!
I checked again, and I now suspect that it has something to do with the way I input the file. I’ll work on it and get back to you.
Thanks a lot!
I cleaned up the working directory and made sure the environment is the same. I then tested on the gzip command line wrapper given on the tutorial website:
I was able to compress the input file locally. However, when I checked the generated _node.pklz file manually, I got the same error message that I got when running on SLURM:
Any clues why this is happening?
I think the issue is that you use custom-made interfaces, but those interfaces are defined inline instead of being imported from a custom Python package that is on the PYTHONPATH.
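To illustrate why this matters: pickle does not store the class itself, only the module and class name, so whatever loads the file has to be able to import that module. The sketch below uses plain `pickle` with a throwaway module; the module name `hypothetical_custom_interfaces` and the class `GzipLike` are invented for the demo and are not part of nipype.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

MODULE_NAME = "hypothetical_custom_interfaces"  # made-up name for the demo


def demo():
    """Pickle an object while its module is importable, then load it when it is not."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, MODULE_NAME + ".py").write_text(
            "class GzipLike:\n"
            "    def __init__(self, in_file):\n"
            "        self.in_file = in_file\n"
        )
        sys.path.insert(0, tmp)
        importlib.invalidate_caches()
        try:
            module = importlib.import_module(MODULE_NAME)
            payload = pickle.dumps(module.GzipLike("data.txt"))
        finally:
            # Simulate a process that does not have the module on its path
            sys.path.remove(tmp)
            sys.modules.pop(MODULE_NAME, None)
    try:
        pickle.loads(payload)
    except ModuleNotFoundError as err:
        return payload, str(err)
    return payload, None


if __name__ == "__main__":
    payload, error = demo()
    print(MODULE_NAME.encode() in payload)  # the pickle only references the module by name
    print(error)
```

The same object that pickled fine in one process cannot be loaded in another unless that process can import the defining module.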
Thank you so much for the answer!
I imported the custom interface and included its directory in the PYTHONPATH. I was able to locally run it and got the output file. However, I got this ModuleNotFound error:
I am kind of confused about why this is happening. If the custom interface module is included in the search path, how come the module is not found?
Thank you again!
Hard to say. My best guess is that there is some nipype version mismatch between what created the pklz and what is using nipypecli. Could also be a bug - I never used nipypecli show. Is everything else working?
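To rule out nipypecli itself, you could try loading the file directly in the same environment the job runs in. This sketch assumes a .pklz file is a gzip-compressed pickle, which you should verify for your nipype version; `load_pklz` and `write_example_pklz` are hypothetical helpers, not nipype functions.

```python
import gzip
import pickle
import tempfile
from pathlib import Path


def load_pklz(path):
    """Load a gzip-compressed pickle (assumed format of the .pklz file)."""
    with gzip.open(path, "rb") as handle:
        return pickle.load(handle)


def write_example_pklz(path, obj):
    """Write a stand-in file so the loader can be tried without a real node."""
    with gzip.open(path, "wb") as handle:
        pickle.dump(obj, handle)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        example = Path(tmp, "_node.pklz")
        write_example_pklz(example, {"in_file": "data.txt"})
        print(load_pklz(example))
```

If loading your real `_node.pklz` this way raises the same ModuleNotFoundError, the problem is the import path of the loading process rather than nipypecli.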
Thank you for taking time to answer my question!
Everything else is working. When I turned on debug mode, it didn’t report any errors either. The only problem is that the submitted node fails to execute on SLURM because of the ModuleNotFoundError.