@rcorredorj - I do not have much experience with Airflow or, for that matter, with many of the other workflow systems that exist.
Whenever we have looked at some of these in the past, a few things have stood out. First, the current nipype semantics of MapNode are not something we observe in many of these systems. Second, the ability to execute workflows on multiple types of systems (local computer, HPC clusters, AWS) seems to vary significantly across them. Third, many of these systems are built around the execution of jobs that are typically very short, whereas in scientific computing, jobs can take anywhere from seconds to days. Fourth, very few of them pay attention to the granularity of provenance that is necessary for scientific record keeping.
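To make the first point concrete, here is a minimal plain-Python sketch of MapNode-style semantics (this is not the actual nipype API; `map_node` and `smooth` are hypothetical illustrations): a node normally runs once per set of inputs, but fields designated as "iterfields" cause the node to run once per element of those fields, producing a list of outputs while the remaining inputs stay fixed.

```python
def map_node(func, iterfields, **inputs):
    """Run `func` once per element of the iterfield inputs (zipped together),
    holding all other inputs fixed. Returns the list of per-element results."""
    iter_values = [inputs[f] for f in iterfields]
    if len({len(v) for v in iter_values}) > 1:
        raise ValueError("all iterfields must have the same length")
    fixed = {k: v for k, v in inputs.items() if k not in iterfields}
    results = []
    for elems in zip(*iter_values):
        call_inputs = dict(fixed, **dict(zip(iterfields, elems)))
        results.append(func(**call_inputs))
    return results

# A stand-in for an interface that processes one image at a time:
def smooth(image, fwhm):
    return f"smoothed({image}, fwhm={fwhm})"

# Runs `smooth` once per image, keeping `fwhm` fixed across runs:
outputs = map_node(smooth, ["image"], image=["a.nii", "b.nii"], fwhm=6)
```

Many general-purpose workflow systems have no direct analogue of this map-over-one-field-then-gather pattern as a first-class node type.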
We are currently finishing a rewrite of the Nipype workflow engine as a new project called pydra, which should help consolidate the semantics of the API further and make it more usable for generic workflows. We dug into a few systems to see if things had changed, and we found that the complexity of the semantics of nested for loops and conditionals is still the kind of thing that most workflow systems handle differently.
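As a rough illustration of what "nested for loop" semantics means here (a hypothetical sketch, not pydra's actual API), there are at least two distinct ways a system can expand a task over two list-valued inputs: an "outer" split equivalent to nested for loops over every combination, and a "zip" split pairing the inputs elementwise. Systems differ in which of these they support and how they express them.

```python
from itertools import product

def outer_split(func, a_values, b_values):
    # nested-for-loop semantics: one run per combination of a and b
    return [func(a, b) for a, b in product(a_values, b_values)]

def zip_split(func, a_values, b_values):
    # paired semantics: the i-th element of a with the i-th element of b
    return [func(a, b) for a, b in zip(a_values, b_values)]

add = lambda a, b: a + b
outer_split(add, [1, 2], [10, 20])  # 4 runs: [11, 21, 12, 22]
zip_split(add, [1, 2], [10, 20])    # 2 runs: [11, 22]
```

Once splits like these can nest inside each other and interact with conditionals, the combinatorics of how results are grouped back together is exactly where systems diverge.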