@rcorredorj - I do not have much experience with Airflow or, for that matter, with many of the other workflow systems that exist.
Whenever we have looked at some of these in the past, a few things have stood out. First, the current nipype semantics of MapNode are not something we observe in many of these systems. Second, the ability to execute workflows on multiple types of systems (local computer, HPC clusters, AWS) seems to vary significantly across them. Third, many of these systems are built around the execution of jobs that are typically very short, whereas in scientific computing, jobs can take anywhere from seconds to days. Fourth, very few of them pay attention to the granularity of provenance that is necessary for scientific record keeping.
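To make the first point concrete, here is a minimal plain-Python sketch of MapNode-style semantics (this is not the actual nipype API; `map_node` and `smooth` are hypothetical illustrations): a node normally runs once per set of inputs, but fields designated as "iterfields" cause the node to run once per element of those fields, producing a list of outputs while the remaining inputs stay fixed.

```python
def map_node(func, iterfields, **inputs):
    """Run `func` once per element of the iterfield inputs (zipped together),
    holding all other inputs fixed. Returns the list of per-element results."""
    iter_values = [inputs[f] for f in iterfields]
    if len({len(v) for v in iter_values}) > 1:
        raise ValueError("all iterfields must have the same length")
    fixed = {k: v for k, v in inputs.items() if k not in iterfields}
    results = []
    for elems in zip(*iter_values):
        call_inputs = dict(fixed, **dict(zip(iterfields, elems)))
        results.append(func(**call_inputs))
    return results

# A stand-in for an interface that processes one image at a time:
def smooth(image, fwhm):
    return f"smoothed({image}, fwhm={fwhm})"

# Runs `smooth` once per image, keeping `fwhm` fixed across runs:
outputs = map_node(smooth, ["image"], image=["a.nii", "b.nii"], fwhm=6)
```

Many general-purpose workflow systems have no direct analogue of this map-over-one-field-then-gather pattern as a first-class node type.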
We are currently finishing a rewrite of the Nipype workflow engine as a new project called pydra, which should help consolidate the semantics of the API further and make it more usable for generic workflows. We dug into a few systems to see if things had changed, and we found that the complexity of the semantics of nested for loops and conditionals is still the kind of thing that most workflow systems handle differently.
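As a rough illustration of what "nested for loop" semantics means here (a hypothetical sketch, not pydra's actual API), there are at least two distinct ways a system can expand a task over two list-valued inputs: an "outer" split equivalent to nested for loops over every combination, and a "zip" split pairing the inputs elementwise. Systems differ in which of these they support and how they express them.

```python
from itertools import product

def outer_split(func, a_values, b_values):
    # nested-for-loop semantics: one run per combination of a and b
    return [func(a, b) for a, b in product(a_values, b_values)]

def zip_split(func, a_values, b_values):
    # paired semantics: the i-th element of a with the i-th element of b
    return [func(a, b) for a, b in zip(a_values, b_values)]

add = lambda a, b: a + b
outer_split(add, [1, 2], [10, 20])  # 4 runs: [11, 21, 12, 22]
zip_split(add, [1, 2], [10, 20])    # 2 runs: [11, 22]
```

Once splits like these can nest inside each other and interact with conditionals, the combinatorics of how results are grouped back together is exactly where systems diverge.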