Datalad: Downloading symlinks data

Hi,

I am trying to download a repository using the datalad clone command. It downloaded the dataset as symlinks, so I used datalad get to fetch the actual file content. Although these commands run without any problems, I still cannot open the files. They still show up as symlinks:

sub-S001_T1w_defaced.nii.gz -> ../../../../.git/annex/objects/06/xQ/MD5E-s10835011--34bcbedf42ad11a3f98d25aec0a0a530.nii.gz/MD5E-s10835011--34bcbedf42ad11a3f98d25aec0a0a530.nii.gz

How can I get the actual file? Any help will be appreciated.

Thank you.
Ruba


Hi @Rubaida001,

What are you trying to open the files with, and how are you trying to open the file? How much storage do you see a downloaded .nii.gz taking on your filesystem?

Best,
Steven

Hi Steven,

Many thanks for the quick response.

As those are .gz files, I simply tried to unzip one. For this file (sub-S001_T1w_defaced.nii.gz) I was expecting a NIfTI file with a similar name; instead it creates MD5E-s10835011--34bcbedf42ad11a3f98d25aec0a0a530.nii.

The size of the downloaded file is 148 bytes, whereas the actual size is 10 MB.

Ruba

Hi @Rubaida001,

Can you share your datalad get command and any terminal outputs when running it? To clarify, what you should be doing is something like

cd $DATALAD_DATASET_ROOT
### You can also cd further to the folder you want to get data from, if there's a specific one
datalad get $FILENAME 
### If you are not in the folder where the file is, you need to provide a path to that file
### Wildcard patterns e.g., sub-01*.nii.gz, are acceptable

This suggests that the file did not download; 148 bytes is the size of the symlink itself (the length of the path it points to), not of the image data.
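Since the link itself stays the same size whether or not the content is there, it is more telling to check whether the link target exists. A minimal sketch using plain shell tests rather than anything DataLad-specific: `-L` is true for any symlink, while `-e` follows the link and is false when the target is missing. The helper name `annex_link_status` and the demo paths are made up for illustration.

```shell
# Hypothetical helper (not part of datalad or git-annex): classify a path as
# "missing" (symlink whose target is absent), "present" (symlink whose target
# exists), or "regular" (anything that is not a symlink).
annex_link_status() {
  if [ -L "$1" ] && [ ! -e "$1" ]; then
    echo "missing"
  elif [ -L "$1" ]; then
    echo "present"
  else
    echo "regular"
  fi
}

# Throwaway demo that mimics the annex layout with made-up names.
demo=$(mktemp -d)
mkdir -p "$demo/objects"
ln -s objects/KEY.nii.gz "$demo/sub-01_T1w.nii.gz"
annex_link_status "$demo/sub-01_T1w.nii.gz"    # prints "missing"

# Simulate the content arriving, as a successful `datalad get` would do.
printf 'fake image bytes' > "$demo/objects/KEY.nii.gz"
annex_link_status "$demo/sub-01_T1w.nii.gz"    # prints "present"
```

On a real dataset, `du -shL $FILENAME` reports the size of the file behind the link rather than of the link, which answers the storage question directly.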

Best,
Steven

Hi,

This is the command I have used :
datalad get data/PSYD_0102_S001_B/sub-S001/anat/sub-S001_T1w_defaced.nii.gz
There is no output in the terminal.

I also ran datalad get . -r to get the content of all files. The terminal output looks like this:

Ruba

Hi @Rubaida001,

That looks good. Keep in mind the data in the repo are still symlinks, but now should be linked to a bigger file. What were you trying to open the nifti with, and does it work now?

Best,
Steven

Hi Steven,

I can see only symlinks on my desktop. The size is also the same, 148 bytes. Here is a screenshot:

Best,
Ruba

Hi @Rubaida001,

Is this dataset available somewhere online such that I can try to reproduce this?

Best,
Steven

Hi Steven,

Here is the link to the dataset. As it is a private repo, you will not have access. Is there any way to add you as a collaborator?

Thank you.
Ruba

Hello,

I tried to clone the repo again using the following commands:

datalad clone https://gin.g-node.org/psytest/TSYD_0101.git
cd TSYD_0101
datalad get . -r

I got this error:

Best,
Ruba

There’s a weird GIN/git-annex quirk where using the .git suffix disables the annex remote. You should be able to git annex enableremote origin, but if that doesn’t work, re-clone without the .git.
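When git-annex gives up on a remote it records that in the repository's git config as `remote.<name>.annex-ignore`. A small sketch for inspecting that flag; the helper name `annex_ignored` is made up, and the demo runs on a scratch repository rather than the real dataset.

```shell
# Hypothetical helper: print "yes" if the repository at $1 has annex-ignore
# set to true for the given remote (default: origin), "no" otherwise.
annex_ignored() {
  repo="$1"; remote="${2:-origin}"
  if [ "$(git -C "$repo" config --get "remote.$remote.annex-ignore")" = "true" ]; then
    echo "yes"
  else
    echo "no"
  fi
}

# Demo on a scratch repository (made-up setup, no network needed).
scratch=$(mktemp -d)
git init -q "$scratch"
annex_ignored "$scratch"                                    # prints "no"
git -C "$scratch" config remote.origin.annex-ignore true
annex_ignored "$scratch"                                    # prints "yes"
```

If the flag is set in a real clone, `git config --unset remote.origin.annex-ignore` clears it. As far as I understand, git-annex will then probe the remote again, and will set the flag again if the probe still fails.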

Hi,

I tried to run enableremote command but it failed with this message:

 enableremote (normal) origin 
  Remote origin not usable by git-annex; setting annex-ignore
  https://gin.g-node.org/psytest/TSYD_0101/config download failed: Not Found
failed
enableremote: 1 failed

I also cloned the repo without .git (datalad clone https://gin.g-node.org/psytest/TSYD_0101) but still cannot download the actual file content. The directory has only the symlink files.

Best,
Ruba

The directory will always have symlinks. The linked files are where the data are contained. If you, for example, freeview one of the files, it should follow the symlink and load the data.

If it’s very important not to have symlinks, you can unlock the files:

datalad unlock .

Note that this will ~double the amount of space the dataset consumes.
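Because tools that open the file follow the link, decompressing to an explicit output name avoids ending up with a file named after the annex key. A minimal sketch with made-up demo paths, assuming the content has already been fetched and that `gzip` is installed:

```shell
# Throwaway demo: a fake "annexed" object plus a symlink with a readable name.
demo=$(mktemp -d)
mkdir -p "$demo/objects"
printf 'fake image bytes' | gzip > "$demo/objects/KEY.nii.gz"
ln -s objects/KEY.nii.gz "$demo/sub-01_T1w.nii.gz"

# -c writes the decompressed stream to stdout, so we pick the output name
# ourselves; the symlink and its target are left untouched.
gunzip -c "$demo/sub-01_T1w.nii.gz" > "$demo/sub-01_T1w.nii"

cat "$demo/sub-01_T1w.nii"    # prints "fake image bytes"
```

For the file from this thread, that would be `gunzip -c sub-S001_T1w_defaced.nii.gz > sub-S001_T1w_defaced.nii`, although most neuroimaging tools read .nii.gz directly and need no unzipping.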

Hi,

I am trying to download the data from the repo. I have tried both the datalad clone and datalad install commands, followed by datalad get . -r to get the actual file content. However, I keep getting the error message get (error) [Not available; .git remotes have annex-ignore set:origin. To check that I am creating the repository correctly, here are the commands I used:

    datalad create -c text2git TSYD_0101
    cd TSYD_0101
    datalad save -m "add files in repo"
    datalad siblings add -d . \
      --name gin \
      --url git@gin.g-node.org:/psytest/TSYD_0101.git
    datalad push --to gin

Any help will be appreciated.

Thank you.
Ruba

Seems fine. What do you get with

datalad clone https://gin.g-node.org/psytest/TSYD_0101
cd TSYD_0101
datalad get .
datalad unlock .

I got the following error after running datalad get .:

Could you show the clone command as well?

Here is the output of clone command:

Oh, it might be an inability to access data from private repos over HTTP; see "could there be 401 and not 404" (Issue #111, G-Node/gogs on GitHub). You may have to use SSH remotes in order to use git-annex on a private repository.
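A rough sketch of what switching to SSH could look like. The helper `gin_https_to_ssh` is made up; it rewrites the HTTPS URL into the same SSH form that was used for datalad siblings add earlier in the thread. That this form also works for cloning is an assumption, not something tested here.

```shell
# Hypothetical helper: rewrite a GIN HTTPS URL into the SSH form
# git@gin.g-node.org:/<owner>/<repo>.git
gin_https_to_ssh() {
  path="${1#https://gin.g-node.org/}"   # strip scheme and host
  path="${path%.git}"                   # normalise: drop .git if present
  printf 'git@gin.g-node.org:/%s.git\n' "$path"
}

gin_https_to_ssh https://gin.g-node.org/psytest/TSYD_0101
# prints: git@gin.g-node.org:/psytest/TSYD_0101.git

# With an SSH key registered on GIN, the clone would then look like
# (assumption, needs datalad and network access):
#   datalad clone "$(gin_https_to_ssh https://gin.g-node.org/psytest/TSYD_0101)"
#   cd TSYD_0101
#   datalad get .
```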
