Thanks @yarikoptic !
We’ve now set up a simple script that builds a CSV from OSF metadata and then calls datalad to update the dataset (see addurls_from_csv here). My workflow is:
(1) Clone the dataset that we want to update from GitHub. (At this point, it’s in a few cases necessary to re-run
(2) Remove any obsolete files using
(3) Import the utility functions (same link as above, can’t post multiple links) and run
(4) Push the changes back to GitHub.
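Roughly, step (3) amounts to something like the sketch below. This is not the actual addurls_from_csv helper: the function names, the two-column CSV layout, and the URL are placeholders made up for illustration. It assumes datalad's Python API is installed and that addurls is given the CSV plus format strings naming the URL and filename columns.

```python
import csv
import io


def build_url_csv(records):
    """Serialize (filename, url) records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["filename", "url"], lineterminator="\n"
    )
    writer.writeheader()
    for record in records:
        writer.writerow({"filename": record["filename"], "url": record["url"]})
    return buffer.getvalue()


def add_urls_to_dataset(csv_path, dataset_path):
    """Register each CSV row as an annexed file via datalad addurls."""
    # Imported lazily so the CSV step above works without datalad installed.
    import datalad.api as dl

    dl.addurls(
        urlfile=csv_path,
        urlformat="{url}",            # column holding the download URL
        filenameformat="{filename}",  # column holding the target file name
        dataset=dataset_path,
    )


# Hypothetical record; the URL is a placeholder, not a real OSF link.
records = [
    {
        "filename": "tpl-NKI_res-01_label-brainNoCerebellum_probseg.nii.gz",
        "url": "https://example.org/download/placeholder",
    }
]
csv_text = build_url_csv(records)
```

In this sketch the CSV text would be written to a file inside the clone, and add_urls_to_dataset would then be called with that file's path and the dataset path before pushing.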
At this point, the local repo works fine if, for instance, I call datalad get. However, if I make a fresh clone of the remote on GitHub and then call the same datalad get command, I instead receive the error:
>>> datalad get tpl-NKI_res-01_label-brainNoCerebellum_probseg.nii.gz
[WARNING] Running get resulted in stderr output: git-annex: get: 1 failed
[ERROR ] not available; No other repository is known to contain the file.; (Note that these git remotes have annex-ignore set: origin) [get(/Users/rastko/Downloads/tpl-NKI/tpl-NKI/tpl-NKI_res-01_label-brainNoCerebellum_probseg.nii.gz)]
get(error): /Users/rastko/Downloads/tpl-NKI/tpl-NKI/tpl-NKI_res-01_label-brainNoCerebellum_probseg.nii.gz (file) [not available; No other repository is known to contain the file.; (Note that these git remotes have annex-ignore set: origin)]
I was hoping you might be able to advise on where our workflow goes wrong – perhaps I’m losing the sibling somehow? I would be happy to provide any additional information that would help (e.g., steps to reproduce the error). Here’s an example of what the resulting dataset looks like on GitHub.
Thanks in advance for any help!