Summary of what happened:
I’ve been trying to set up a dataset that primarily lives on a web server but needs to be clonable by other people. The annexed files are visible and downloadable from the server’s website. The files I’m concerned about here are in a subdataset.
I used datalad addurls to add the URL of each file on the server to the corresponding file in the annex. When I run git annex whereis filename, it reports two copies: one in the server’s local copy of the dataset, and one on the web, with the correct URL. If I open that URL in a browser, it downloads my file.
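For reference, here is a minimal sketch of what such an addurls step can look like. It is not the exact input I used: the table name urls.csv, the column names filename and url, and the choice of two example models are assumptions for illustration. Only the base URL and file names come from the output shown in this post. datalad addurls takes a table of URLs, a URL format string, and a filename format string.

```shell
# Base URL as it appears in the whereis output in this post.
base_url="https://3dviewer.sites.carleton.edu/carcas/carcas-models"

# Build a two-column table (filename,url) for datalad addurls to read.
# urls.csv and the column names are illustrative choices.
# Spaces are percent-encoded in the URL; the filename keeps them.
{
  printf 'filename,url\n'
  for f in "models/Alpaca 3rd Carpal L.glb" "models/goat_mm.glb"; do
    encoded=$(printf '%s' "$f" | sed 's/ /%20/g')
    printf '%s,%s/%s\n' "$f" "$base_url" "$encoded"
  done
} > urls.csv

# Register each URL with its file. This only runs when datalad is
# installed and the current directory is a repository (assumed here
# to be the carcas-models subdataset).
if command -v datalad >/dev/null 2>&1 && [ -d .git ]; then
  datalad addurls urls.csv '{url}' '{filename}'
fi
```

The URL recorded this way is stored on the git-annex branch of the repository where the command is run, so it shows up in that repository's git annex whereis output.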
The dataset lives on GitHub, but the annex does not. When I clone the superdataset onto my personal computer, I get messages like:
[INFO ] Unable to parse git config from origin
[INFO ] Remote origin does not have git-annex installed; setting annex-ignore
| This could be a problem with the git-annex installation on the remote. Please make sure that git-annex-shell is available in PATH when you ssh into the remote. Once you have fixed the git-annex installation, run: git annex enableremote origin
install(ok): /home/erin/Documents/DHA/carcas (dataset)
Then when I run datalad get carcas-models/ (where carcas-models is the name of the subdataset that has my large files in the annex), I get this error message:
[INFO ] Unable to parse git config from origin
[INFO ] Remote origin does not have git-annex installed; setting annex-ignore
[INFO ] This could be a problem with the git-annex installation on the remote. Please make sure that git-annex-shell is available in PATH when you ssh into the remote. Once you have fixed the git-annex installation, run: git annex enableremote origin
[INFO ] access to 1 dataset sibling serverweb not auto-enabled, enable with:
| datalad siblings -d "/home/erin/Documents/DHA/carcas/carcas-models" enable -s serverweb
install(ok): /home/erin/Documents/DHA/carcas/carcas-models (dataset) [Installed subdataset in order to get /home/erin/Documents/DHA/carcas/carcas-models]
get(error): carcas-models/models/Alpaca 3rd Carpal L.glb (file) [no known url
no known url
no known url]
get(error): carcas-models/models/Alpaca 4th Carpal L.glb (file) [no known url
no known url
no known url]
get(error): carcas-models/models/Alpaca Cranium.glb (file) [no known url
no known url
no known url]
get(error): carcas-models/models/Alpaca Mandible.glb (file) [no known url
no known url
no known url]
get(error): carcas-models/models/goat_mm.glb (file) [no known url
no known url
no known url]
action summary:
get (error: 5)
install (ok: 1)
I’m stuck on how to debug, because when I run git annex whereis models/Alpaca\ 3rd\ Carpal\ L.glb, everything looks correct:
whereis models/Alpaca 3rd Carpal L.glb (2 copies)
00000000-0000-0000-0000-000000000001 -- web
095e299d-037e-4172-87e0-bbd7183a6613 -- CARCAS models on the 3dviewers server [here]
web: https://3dviewer.sites.carleton.edu/carcas/carcas-models/models/Alpaca%203rd%20Carpal%20L.glb
ok
Why can’t datalad get find the models? How do I set things up properly so that people with clones from GitHub can download the models using datalad get, pulling from the URL?
Screenshots / relevant information:
My operating system is Fedora 39.
I’m using Python 3.11.8.
My DataLad version is 0.19.6.