The strange thing is that the exact same command runs perfectly fine when not wrapped in a datalad run command, and this also applies to fmriprep-docker, docker run, and even simple bash commands like mkdir (see error messages in my earlier message).
The only difference with your situation is that the KUL_dcm2bids.sh script is located outside of the superdataset (or any subdatasets) in my case, but is in my path, so that should not be the problem I guess?
Or do scripts need to be in the dataset itself for them to be run with datalad run?
Interestingly, when trying to run the entire datalad run command again, I now get an error similar to the one I get when I wrap simple bash commands, or fmriprep or mriqc commands, in datalad run:
/bin/sh: 1: KUL_dcm2bids.sh -d sourcedata/sub-KUL005 -p KUL005 -c study_config/sequences.txt -v: not found
But it also looks like datalad tries to pass that entire invocation as a command name, which is odd. That is why I want to see the entire invocation and the version of datalad.
PATH is /home/luna.kuleuven.be/u0027997/.local/bin:/home/luna.kuleuven.be/u0027997/gitkraken:/opt/weasis/bin:/usr/local/freesurfer/bin:/usr/local/freesurfer/fsfast/bin:/usr/local/freesurfer/tktools:/usr/local/fsl/bin:/usr/local/freesurfer/mni/bin:/opt/KUL_apps/KUL_FWT:/opt/KUL_apps/KUL_FWT:/opt/KUL_apps/KUL_VBG:/opt/KUL_apps/KUL_NeuroImaging_Tools:/opt/KUL_apps/dcm2niix/build/bin:/opt/KUL_apps/ANTs_installed/bin/:/opt/mrtrix3/bin:/opt/anaconda3/bin:/opt/anaconda3/condabin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games:/usr/local/games:/snap/bin:/opt/puppetlabs/bin
The script is in a cloned github repo with path /opt/KUL_apps/KUL_NeuroImaging_Tools
Here is the entire invocation and output
u0027997@gbw-s-labgas01:/data/proj_discoverie$ datalad run -m "run KUL_dcm2bids.sh on sub-KUL005" --input "sourcedata/sub-KUL005/" --output "BIDS/sub-KUL005/" "KUL_dcm2bids.sh -d sourcedata/sub-KUL005 -p KUL005 -c study_config/sequences.txt -v" --dry-run
[INFO ] Making sure inputs are available (this may take some time)
[INFO ] == Command start (output follows) =====
/bin/sh: 1: KUL_dcm2bids.sh -d sourcedata/sub-KUL005 -p KUL005 -c study_config/sequences.txt -v: not found
[INFO ] == Command exit (modification check follows) =====
[INFO ] The command had a non-zero exit code. If this is expected, you can save the changes with 'datalad save -d . -r -F .git/COMMIT_EDITMSG'
CommandError: ''KUL_dcm2bids.sh -d sourcedata/sub-KUL005 -p KUL005 -c study_config/sequences.txt -v' --dry-run' failed with exitcode 127 under /data/proj_discoverie
Having that --dry-run at the end makes datalad assume that the long quoted string is the command and that --dry-run is its argument. Move --dry-run to, e.g., right after run.
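To illustrate why the shell reports the whole string as "not found": a quoted string is a single word, so when another argument follows it, the shell looks for a command whose name is the entire string. Here is a minimal sketch with plain sh, using echo as a stand-in for KUL_dcm2bids.sh. No datalad is involved, so this only mimics what presumably happens once the command line is handed to /bin/sh.

```shell
# Stand-in demo: "echo hello world" plays the role of the quoted KUL_dcm2bids.sh call.

# Case 1: the quoted string is ONE word followed by an extra argument,
# so sh searches PATH for a command literally named "echo hello world".
status_quoted=0
sh -c "'echo hello world' --dry-run" 2>/dev/null || status_quoted=$?

# Case 2: the same text unquoted is split into a command name plus arguments.
status_split=0
output_split=$(sh -c "echo hello world") || status_split=$?

echo "one quoted word + argument: exit code $status_quoted"   # 127 = command not found
echo "split into words: exit code $status_split, output: $output_split"
```

For the datalad call itself, this means --dry-run has to come before the quoted command, not after it. As far as I know, recent DataLad versions expect a value after --dry-run (basic or command), so check datalad run --help for your version.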
It depends on how you installed it. If with pip: pip install --upgrade datalad. If with conda: just wait a bit (the update is ongoing), or conda uninstall datalad && pip install datalad.
Upgrading datalad to 0.15.0 using pip worked fine, enabling the dry-run option.
When I run without dry-run now, I get an error about an outdated git-annex version
u0027997@gbw-s-labgas01:/data/proj_discoverie$ datalad run -m "run KUL_dcm2bids.sh on sub-KUL005" -i "sourcedata/sub-KUL005" -o "BIDS/*" "KUL_dcm2bids.sh sourcedata/sub-KUL005 -p KUL005 -c study_config/sequences.txt -v"
[INFO ] Making sure inputs are available (this may take some time)
[ERROR ] OutdatedExternalDependency(No working git-annex installation of version >= 8.20200309. Visit http://handbook.datalad.org/r.html?install for instructions on how to install DataLad and git-annex… You have version 8.20200226) (OutdatedExternalDependency)
How shall I upgrade this?
Neither pip, sudo apt-get update / upgrade, nor git annex upgrade in the repo seems to work. I also adapted git config based on upgrades.
Reading package lists… Done
Building dependency tree
Reading state information… Done
git-annex is already the newest version (8.20200226-1).
git-annex set to manually installed.
The following packages were automatically installed and are no longer required:
  libllvm11 libpython2-stdlib linux-headers-5.4.0-80 linux-headers-5.4.0-80-generic linux-image-5.4.0-80-generic linux-modules-5.4.0-80-generic linux-modules-extra-5.4.0-80-generic python2 python2-minimal shim*
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 2 not upgraded.
The output reads “git-annex set to manually installed”; I am not sure whether this is a problem for datalad or any other applications using git-annex?
As far as I understand it, the problem seems to be that the version of git-annex that was installed on my Ubuntu 20.04.3 system along with datalad (probably just using sudo apt-get install datalad as per these instructions from the handbook, rather than using pip) is not the most recent version, and that apt-get fails to update it to a more recent version?
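For reference, the version comparison behind that error can be sketched as below. The two version strings are copied from the error message. The variable names are only illustrative, and sort -V assumes GNU coreutils (which Ubuntu ships). On a real system the installed version could be read with git annex version --raw instead of being hard-coded.

```shell
# Versions taken from the OutdatedExternalDependency message above.
required="8.20200309"
installed="8.20200226"   # on a real system: installed=$(git annex version --raw)

# sort -V orders version strings numerically; the first line is the lower one.
lowest=$(printf '%s\n%s\n' "$required" "$installed" | sort -V | head -n 1)

# The installed version is new enough only if the required one sorts first (or they are equal).
if [ "$lowest" = "$required" ]; then
    verdict="new enough"
else
    verdict="too old"
fi

echo "git-annex $installed vs required $required: $verdict"
```

This only shows why the apt-provided 8.20200226 is rejected; it does not upgrade anything.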
Uninstalling datalad and git-annex and then installing using the datalad installer (using -m pip for datalad, and -m datalad/git-annex:tested) did the trick, resulting in up-to-date versions of both datalad and git-annex.
A new problem arose, however, when wrapping fmriprep-docker into a datalad run command.
fmriprep runs fine, but the datalad save at the end bumps into permission denied errors when trying to add the large output files (.nii.gz) to the git-annex repo. Permissions for the respective dirs look fine, and the outputdir derivatives/fmriprep was correctly specified in the -o option “derivatives/fmriprep/*”.
I had this problem with my previous installation as well, when running fmriprep-docker as a standalone command followed by datalad save. Running datalad save with sudo solved the issue then, but sudo datalad no longer works, probably because of the pip installation.
Which installation methods would you recommend to get both the most recent versions of datalad and git-annex, while at the same time being able to solve the above issue using sudo? Or do you have another workaround for this issue?
You are on Linux, so why not use a singularity container, even if built from docker? (IIRC, with datalad-container, if you containers-add docker://... it will import it into singularity.) Also note that we have already prepared singularity containers for all BIDS apps within https://github.com/ReproNim/containers/ which I have mentioned before.
PS: never use sudo unless you really need to perform some admin tasks! Now you need to do something like sudo chown -R $USER.$USER DATASET_PATH to bring it to a hopefully sane state in terms of permissions.
I am trying to set up a workflow for this dataset, so simply started over again.
For now, I want to get things up and running without containerizing, i.e. with datalad run rather than containers-run. The problem below does not depend on this, because I get the same problem when running fmriprep-docker as a standalone command (which works fine) and then trying to datalad save.
Things work fine for mriqc, but for fmriprep, I keep on running into the same issue: fmriprep runs fine, but things go awry when datalad tries to save - see full error below.
I tried several options including creating my fmriprep output directory as a subdataset (and even datalad get after that) before running fmriprep (from the superdataset), not specifying it before and letting fmriprep create it, but in every scenario, I keep on getting the error that the large files cannot be added (to git-annex).
Any idea what the problem may be and how I could solve it?
Thanks a lot again!
Lukas
Here is my command (scenario where I did not create the fmriprep output dir as a subdataset before)
so – check owner/permissions on those files and directories they are under?
I bet that since you are using Docker (although you said you are not containerizing, you are using fmriprep-docker, which I guess uses docker), they might be owned by root, hence those errors. That is why I suggested using singularity instead, so that it runs in your user namespace. I am not familiar enough with fmriprep-docker to advise on whether it does proper UID/GID mapping etc. But again: check permissions first!
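A quick way to do that ownership check, as a minimal sketch: list everything under a directory that is not owned by the current user. The temporary directory and file name here are made up so that the snippet runs anywhere. In the real dataset, target_dir would be the fmriprep output directory (e.g. derivatives/fmriprep).

```shell
# Hypothetical stand-in for the output directory; replace with the real path.
target_dir=$(mktemp -d)
touch "$target_dir/example_bold.nii.gz"

# List every file or directory NOT owned by the current user (numeric UID).
# Anything printed here (e.g. root-owned Docker output) can block datalad save.
not_mine=$(find "$target_dir" ! -user "$(id -u)")

if [ -z "$not_mine" ]; then
    ownership="ok"
    echo "everything under $target_dir is owned by UID $(id -u)"
else
    ownership="foreign"
    echo "not owned by UID $(id -u):"
    echo "$not_mine"
fi

rm -rf "$target_dir"
```

If this prints root-owned paths for the real output directory, that would match the permission denied errors seen during datalad save.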
The problem indeed seems to be that fmriprep-docker (and to a lesser extent mriqc run in Docker) by default creates an output directory with root ownership, which prevents a successful datalad save of the annexed files due to permissions.
Last update: after some tweaking of the singularity installation, everything works fine.
You may want to add to the handbook that wrapping fmriprep-docker in a datalad run command is not a viable option due to permission issues on multi-user systems?!