Sharing data with multiple tasks

dvsmith · December 21, 2019, 11:01pm

Hi – I have a neuroimaging dataset with three distinct tasks. Each task is tied to its own research question and pre-registration (on AsPredicted), so I suspect each one will end up in its own paper. I am hoping to submit the first paper in early 2020, but the other papers will lag behind. So I am wondering how to approach data sharing on OpenNeuro for this project. I don’t want the data from the other tasks to be public until the papers associated with them are at least submitted. Ideally, I’d also like each paper to include a link to the data and code for the analyses. I’d like to set things up so that someone could easily install the dataset (via datalad) and reproduce all of the analyses with a few lines of code.

Should I simply upload everything and keep the dataset private for some period of time? Looks like the max is 36 months, which should be more than enough time. Or, should I split the dataset up into three tasks and have three separate uploads and accession IDs on OpenNeuro? Seems like the latter approach could be pretty clean, but I worry it could make it challenging for other groups to use the three tasks (or datasets) together to address other questions.

Thanks!
David

tsalo · December 24, 2019, 4:02pm

I don’t think it would be good to separate the dataset into three, as processing might require shared data (e.g., structurals). A good solution might be to take advantage of the dataset versioning: upload the core dataset (structurals, dataset-level metadata files) plus whichever task has been published as the initial release, and then add the other tasks to the same dataset as the publications for those tasks come out. Each version would have its own link (e.g., https://openneuro.org/datasets/ds002345/versions/1.0.0 and https://openneuro.org/datasets/ds002345/versions/1.0.1), but it would all be in one dataset.
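
If it helps, here is a rough sketch of how you might keep track of which version link goes in which paper. It only builds URLs following the pattern in the two example links above; the helper name `version_url`, the `release_plan` mapping, and the paper labels are all made up for illustration, and the accession check (`ds` followed by digits) is my assumption about the ID format rather than anything OpenNeuro guarantees.

```python
import re

# Base of the per-version links, taken from the example URLs above.
BASE_URL = "https://openneuro.org/datasets"


def version_url(accession: str, version: str) -> str:
    """Build a link to one version of a dataset (hypothetical helper)."""
    # Assumption: accession IDs look like "ds" followed by digits.
    if not re.fullmatch(r"ds\d+", accession):
        raise ValueError(f"unexpected accession ID: {accession!r}")
    # Assumption: versions are written as major.minor.patch.
    if not re.fullmatch(r"\d+\.\d+\.\d+", version):
        raise ValueError(f"unexpected version string: {version!r}")
    return f"{BASE_URL}/{accession}/versions/{version}"


# Hypothetical plan: each paper cites the version in which its task first appears.
release_plan = {
    "paper 1 (core data + first task)": "1.0.0",
    "paper 2 (second task added)": "1.0.1",
}

if __name__ == "__main__":
    for paper, version in release_plan.items():
        print(f"{paper}: {version_url('ds002345', version)}")
```

Each paper can then cite the exact version it was written against, while later versions of the same dataset still contain everything from the earlier ones.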

dvsmith · December 25, 2019, 4:02am

Thanks! That’s a great idea!