I’ve been experimenting with feeding fMRI methods sections to current LLMs (Claude mainly) and asking them to generate the corresponding preprocessing/analysis scripts.
The results are surprisingly coherent—one test produced a 700-line Nipype workflow that correctly handled motion correction, slice timing, coregistration, CompCor, the works. Not perfect, but maybe 85-90% of the way there.
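For concreteness, here is a heavily stripped-down sketch of the kind of structure it generated (this is my own minimal recreation with assumed interface choices and parameter values, not the model's output):

```python
# Stripped-down sketch of an LLM-drafted preprocessing workflow in Nipype:
# motion correction, slice-timing correction, coregistration, and CompCor.
# Interface choices and parameters here are illustrative assumptions.
from nipype import Node, Workflow
from nipype.interfaces import fsl
from nipype.algorithms.confounds import ACompCor

mc = Node(fsl.MCFLIRT(save_plots=True, mean_vol=True), name="motion_correct")
st = Node(fsl.SliceTimer(time_repetition=2.0), name="slice_timing")
coreg = Node(fsl.FLIRT(dof=6), name="coregister")  # mean functional -> T1; reference set elsewhere
compcor = Node(ACompCor(num_components=5), name="compcor")

wf = Workflow(name="methods_section_preproc", base_dir="work")
wf.connect([
    (mc, st, [("out_file", "in_file")]),
    (mc, coreg, [("mean_img", "in_file")]),
    (st, compcor, [("slice_time_corrected_file", "realigned_file")]),
])
# In the full workflow, the raw BOLD, T1 reference, and tissue masks for
# CompCor would come from a BIDS data grabber / datasource node.
```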
Curious if anyone else has tried this, or if there’s interest in building something more robust around this idea—some kind of “co-pilot” for neuroimaging analysis.
Happy to share what I’ve tried so far if there’s interest.
thanks @fortheatlantic for this post and for trying out workflow generation. i know @jbpoline's group has been doing some of this extraction as well, and more broadly the efforts around niwrap, styx, pydra, and nipoppy all point towards some “co-pilot for neuroimaging” directions.
with many established pipelines in play, it would be interesting to consider where a co-pilot would be most useful. as an initial target, would this be well-suited for:
the initial preprocessing (given that many simply use fmriprep or hcp pipelines)
post-preprocessing (cohort selection, downstream analyses)
special cases (NHP, multiscale data)
new algorithm development for tackling specific problems (deep sampling)
and how would someone using this validate the recommendations and have guardrails for parameter settings (at any stage)? one key advantage of a co-pilot is that provenance tracking becomes more automated, something that has been non-trivial to implement in practice. given the technology available (standards, tools, and llms/agents), all of the above are feasible targets; they just need some care to ensure that these costly systems don’t run in circles and are efficient and accurate in getting to an appropriate path.
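purely as an illustration, automated provenance could start with the co-pilot emitting a small record alongside every step it proposes (field names below are made up, not a spec; something like W3C PROV or BIDS derivatives metadata would be the natural target):

```python
# purely illustrative provenance record; field names are made up, not a spec
import json
from datetime import datetime, timezone

record = {
    "step": "first_level_glm",
    "intent": "model task effects per subject as described in the methods section",
    "software": {"name": "nilearn", "version": "0.10.4"},
    "parameters": {"hrf_model": "spm", "high_pass": 0.008},
    "inputs": ["sub-01_task-stopsignal_desc-preproc_bold.nii.gz"],
    "suggested_by": "co-pilot",
    "approved_by": "user",
    "timestamp": datetime.now(timezone.utc).isoformat(),
}
print(json.dumps(record, indent=2))
```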
Thanks for responding, @satra! I wasn’t aware of nipoppy and the related efforts. Just checked them out, and I can see how they’re converging on similar goals from different angles. To your question about where a co-pilot would be most useful: my instinct is post-preprocessing. fMRIPrep solved a lot of preprocessing headaches, but everything downstream is still scattered. Being able to describe analyses in natural language would remove a huge barrier to entry.
On validation and guardrails, I was thinking about (rough sketch after the list):
Some kind of ‘explain back’ mechanism: every code block comes with a plain-English annotation linking it to the method/intent
Pseudocode first: the model proposes a plan before writing any syntax, and the user confirms it
Noting differences from known pipelines: flagging deviations from standard approaches
Checkpoint visualizations: auto-generated QC plots at key stages (though maybe less relevant for post-preprocessing analysis)
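Something like the following is roughly what I have in mind for the first two ideas (everything here is a hypothetical placeholder, not an existing tool):

```python
# Hypothetical sketch of the "pseudocode first" + "explain back" ideas;
# the names and structure are placeholders, not an existing tool.
from dataclasses import dataclass, field

@dataclass
class AnalysisStep:
    intent: str                      # plain-English description of the step
    pseudocode: str                  # model-proposed plan, shown before any real code
    code: str = ""                   # generated syntax, filled in only after approval
    deviations: list = field(default_factory=list)  # flagged departures from standard pipelines

def confirm_plan(steps):
    """Show intent + pseudocode (and any flagged deviations) and ask for approval."""
    for i, step in enumerate(steps, 1):
        print(f"{i}. {step.intent}\n   plan: {step.pseudocode}")
        for d in step.deviations:
            print(f"   note: deviates from the standard approach: {d}")
    return input("Generate the actual code? [y/N] ").strip().lower() == "y"

plan = [
    AnalysisStep(
        intent="Select adults with usable resting-state runs",
        pseudocode="filter participants.tsv by age >= 18 and mean FD < 0.5 mm",
    ),
]
if confirm_plan(plan):
    pass  # hand the approved plan to the code-generation step
```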
Your point about provenance tracking is interesting. I hadn’t thought about it that way, but a co-pilot that documents what was done and why could help with reproducibility in a way that’s hard to do manually. Anyway, I’m faculty in a psych dept and definitely don’t have the expertise to build something like this, but I’d be happy to chat with those who do! Would love to hear more about what you and others have been exploring.