Hi – I’m trying to incorporate datalad into my general lab procedures for data management. So far, it’s been really nice for data sharing across labs and institutions, but I’ve run into a couple of issues (perhaps because I’m missing something):
- When I use datalad publish, my files do not go to the Google Drive remote I set up and linked when I created the sibling on GitHub.
- Once I accumulate a lot of files (e.g., after running FMRIPREP on a dataset with 50 participants), the datalad save command becomes very slow. It doesn’t seem to be copying anything to the remote, but it is creating the symlinks to the files in the annex. Even after 12 hours, it has not finished.
Here are some of my commands:

    # create dataset
    mkdir srndna-public-test
    cd srndna-public-test
    datalad create --annex-version 7 --text-no-annex --description "SRNDNA public test data on Smith Lab Linux" --shared-access group

    # create remote on Google Drive via Rclone
    git annex initremote gdrive type=external externaltype=rclone target=dvs-temple prefix=srndna-public-test/annex chunk=50MiB encryption=none rclone_layout=lower

    # create sibling and save setup
    datalad create-sibling-github srndna-public-test -s DVS-Lab-GitHub --github-organization DVS-Lab --publish-depends gdrive
    datalad save . -m "initial save" --version-tag "initialsetup"

    # convert to bids and save. this works ok.
    datalad run -m "heudiconv, defacing, and mriqc" "bash run_prepdata.sh"

    # run FMRIPREP and FSL and then try to save again
    datalad save -m "add preprocessing and level 1 stats"

That last save runs forever, and nothing ever appears to go to Google Drive. Sorry if I’m missing anything and thanks for such a great tool!