Hi, I want to run Nipype in a distributed fashion on our cluster using ipyparallel.
I can run the processing in parallel on a single machine using MultiProc, ipcluster, or ipcontroller + ipengines. However, whenever I try to run the engines in a distributed manner on different machines, with ipcontroller on a host machine and ipengines on the other machines, it fails. The engines register successfully with the controller, and my DataGrabber node starts on all machines. But the downstream workflow nodes cannot find the temporary files that the previous nodes produced: they seem to end up in a /tmp directory, i.e. locally on one of the machines.

I don't understand why I cannot change the directories (I tried setting base_dir), nor why the successive steps are not executed on the same machine. That is, first_step[file1] -> second_step[file1] -> last_step[file1] should all run on the same machine, right? Then they would all have access to the same /tmp directory.
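For reference, this is roughly how I set up and run the workflow (a sketch only; the paths, node names, and DataGrabber fields below are simplified placeholders, not my real configuration, and it obviously needs nipype and a running ipcontroller/ipengines to execute):

```python
import nipype.pipeline.engine as pe
from nipype.interfaces.io import DataGrabber

wf = pe.Workflow(name='preprocessing')
# I tried pointing base_dir somewhere other than the default /tmp location:
wf.base_dir = '/home/user/nipype_work'  # hypothetical path

dg = pe.Node(DataGrabber(infields=['subject_id'], outfields=['func']),
             name='DataGrabber')
# ... further processing nodes are connected to dg here ...

# ipcontroller runs on the head node; ipengines run on the worker machines
wf.run(plugin='IPython')
```

Even with base_dir set like this, the intermediate results still appear under /tmp on individual machines.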
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmpsv83oyqc/preprocessing/_subject_id_ADNI_136_S_0426/DataGrabber/result_DataGrabber.pklz'
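If I understand the failure correctly, the working directory is being created as a fresh machine-local temp dir (the /tmp/tmpsv83oyqc pattern looks like tempfile.mkdtemp output), so an engine on a different machine cannot see it. A simplified, stdlib-only illustration of the path from the traceback (the subject ID is taken from the error; the mkdtemp assumption is mine):

```python
import os
import tempfile

# Simplified illustration: a working directory created with mkdtemp is
# local to whichever machine creates it, so a node running on another
# host cannot find the result file under it.
base_dir = tempfile.mkdtemp()  # e.g. '/tmp/tmpsv83oyqc' -- machine-local
result_file = os.path.join(
    base_dir,
    'preprocessing',
    '_subject_id_ADNI_136_S_0426',
    'DataGrabber',
    'result_DataGrabber.pklz',
)
# Only base_dir itself exists; the result file written on another machine
# is simply not there from this machine's point of view.
print(os.path.exists(result_file))
```

So my guess is that every machine needs to see the same base_dir (e.g. on a shared filesystem), but as noted above, setting base_dir did not seem to change where the files land.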
Any help would be appreciated!