Hi Datalad Team,
I stupidly ran git annex uninit in my superdataset rather than in the subdataset that I wanted to datalad remove (permission issues in the git (annex) repo were preventing me from successfully executing it), followed by manual removal of .git/annex and .git.
I then ran git init and git annex init in the superdataset, reinitializing it as a git and git-annex repo (and DataLad dataset), but now my original subdatasets are plain directories in the superdataset rather than registered subdatasets. History will obviously be lost in the superdataset (which is not a disaster), but it seems intact in the directories (hence, they are still DataLad datasets, just not known to the superdataset). Losing it would not be a huge disaster either, if needed.
No remotes configured yet.
I was checking the datalad create docs, but it was slightly unclear to me what would happen to my existing directories (which are already DataLad datasets on their own) if I ran the command, given the statement that “This command only creates a new dataset, it does not add existing content to it, even if the target directory already contains additional files or directories”.
Could you please advise me on the easiest way to turn these directories into subdatasets again (.gitmodules still exists)?
Stupid mistake, thanks a lot in advance for helping me clean up the mess!
Best wishes,
Lukas
Hi All,
Just a gentle reminder about this post: would someone be able to advise me on how to fix this?
Thanks,
Lukas
Sorry, missed original question.
And I never used uninit… Judging from the docs, it should not have affected submodules etc… But also: do you have a remote clone of this superdataset? We will need to recover the git-annex branch.
ATM I am away from the laptop, but I would also check whether git reflog has an idea of where the git-annex branch was right before, so we could recover it (unlikely that git gc has pruned it yet). And you should be able to git reset --hard your main branch and tree to the prior state.
Before doing anything though, if feasible, I would make a backup of that entire folder (filesystem snapshot or plain tarball) so I could experiment and get back if needed.
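For the plain-tarball option, a minimal sketch could look like the following. The helper name snapshot_dir is made up for illustration and is not part of git, git-annex, or DataLad; it assumes the destination archive lives outside the directory being archived and that there is enough free space for it.

```shell
# Hypothetical helper (illustrative name, not a DataLad command):
# archive a directory into a gzip-compressed tarball and check it is readable.
snapshot_dir() {
    local src="$1" dest="$2"
    # Change into the parent directory so the archive stores relative paths.
    tar -czf "$dest" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    # Listing the archive reads it end to end; this should fail if the
    # archive is truncated or otherwise unreadable.
    tar -tzf "$dest" > /dev/null || return 1
    echo "backup written to $dest"
}

# Example with the paths from this thread (adjust to your own layout):
# snapshot_dir /data/proj_moodbugs/proj_moodbugs_wp2 \
#              /data/proj_moodbugs/proj_moodbugs_wp2-backup.tgz
```

Since .git directories are included, the tarball preserves whatever history the nested datasets still have, so any later experiment can be undone by unpacking it elsewhere.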
Hi Yarik,
Thanks a ton!
Unfortunately, there is no remote clone on GIN yet.
Please find the output of git log, git reflog, git branch (listing the git-annex branch), and git status (executed in the superdataset) below. Nothing very helpful atm, but the fact that git branch still lists the git-annex branch seems promising to me? Any advice on how to proceed?
Maybe this is way too simplistic, but what would a datalad save (-r) from the superdataset do?
Thanks a ton!
Best wishes,
Lukas
u0027997@gbw-s-labgas01:/data/proj_moodbugs/proj_moodbugs_wp2$ git log
fatal: your current branch 'master' does not have any commits yet
u0027997@gbw-s-labgas01:/data/proj_moodbugs/proj_moodbugs_wp2$ git reflog
fatal: your current branch 'master' does not have any commits yet
u0027997@gbw-s-labgas01:/data/proj_moodbugs/proj_moodbugs_wp2$ git branch
git-annex
u0027997@gbw-s-labgas01:/data/proj_moodbugs/proj_moodbugs_wp2$ git status
On branch master
No commits yet
Untracked files:
(use "git add <file>..." to include in what will be committed)
.bidsignore
.datalad/
.gitattributes
.gitignore
.gitmodules
BIDS/
code/
derivatives/
mriqc/
pipeline/
sourcedata/
study_config/
nothing added to commit but untracked files present (use "git add" to track)
u0027997@gbw-s-labgas01:/data/proj_moodbugs/proj_moodbugs_wp2$
Dear @yarikoptic, just a gentle reminder of the above - many thanks in advance!
yikes… that shouldn’t be the result of git annex uninit alone… and even when it is followed up with git init and git annex init, I do not see the history being wiped out locally ATM in my quick attempt to reproduce:
❯ datalad clone ///
install(ok): /tmp/datasets.datalad.org (dataset)
datalad clone /// 14.40s user 1.05s system 51% cpu 30.072 total
❯ cd datasets.datalad.org
abide/ crcns/ hbnssi/ nidm/ shub/
abide2/ dandi/ hcp-openaccess/ ohbm/ simon/
adhd200/ datalad/ indi/ openfmri/ studyforrest/
allen-brain-observatory/ datapackage.json kaggle/ openneuro/ templateflow/
centerforopenneuroscience/ dbic/ labs/ openneuro-derivatives/ workshops/
cifar/ devel/ naturalistic-data/ physionet/
conp-dataset/ dgenomes/ neuralensemble/ psychoinformatics-de/
corr/ dicoms/ neurovault/ repronim/
❯ git reflog
44444a6 (HEAD -> master, origin/synced/master, origin/master, origin/HEAD) HEAD@{0}: clone: from https://datasets.datalad.org/.git
❯ git status
On branch master
Your branch is up to date with 'origin/master'.
nothing to commit, working tree clean
❯ git annex uninit
uninit check ok
uninit objects ok
uninit finish ok
❯ git reflog
44444a6 (HEAD -> master, origin/synced/master, origin/master, origin/HEAD) HEAD@{0}: clone: from https://datasets.datalad.org/.git
❯ git br -a
* master
remotes/origin/HEAD -> origin/master
remotes/origin/git-annex
remotes/origin/master
remotes/origin/synced/master
❯ ls -l .git/refs/heads
total 4
-rw-rw-r-- 1 yoh yoh 41 Sep 15 09:25 master
❯ git init
Reinitialized existing Git repository in /tmp/datasets.datalad.org/.git/
❯ git reflog
44444a6 (HEAD -> master, origin/synced/master, origin/master, origin/HEAD) HEAD@{0}: clone: from https://datasets.datalad.org/.git
❯ git annex init
init ok
(recording state in git...)
❯ git reflog
44444a6 (HEAD -> master, origin/synced/master, origin/master, origin/HEAD) HEAD@{0}: clone: from https://datasets.datalad.org/.git
❯ git log
commit 44444a6e88e610cd4643e77f9349d1a9d62c5686 (HEAD -> master, origin/synced/master, origin/master, origin/HEAD)
Author: Yaroslav Halchenko <debian@onerussian.com>
Date: Wed Sep 3 14:24:27 2025 -0400
Recrawl and update openneuro
commit f34dabae3274cad3782599c68fd12cd0fdfa8fb6
Author: Yaroslav Halchenko <debian@onerussian.com>
Date: Sat Aug 30 03:45:16 2025 -0400
Updated to 10.20250828 from 10.20250721
...
Space permitting, I would have created a tarball of the entire thing in its current state just in case (tar -czvf /data/proj_moodbugs/proj_moodbugs_wp2-backup.tgz /data/proj_moodbugs/proj_moodbugs_wp2 or alike) and then proceeded with datalad save if you think that all the underlying datasets are ok. Or try it with just one first: datalad save -d . pathtosubds. If all looks good, then indeed just do datalad save -m "resaving all datasets after some disastrous killing via uninit/init etc" -d . -r or alike.
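One way to check that the underlying datasets look ok before saving is to compare what .gitmodules still lists against what is on disk. The sketch below is only an illustration: check_submodule_paths is a made-up helper name, and it assumes that each subdataset path in .gitmodules should contain its own .git entry with at least one commit. It uses plain git only and does not modify anything.

```shell
# Hypothetical helper (illustrative, not a DataLad command): for every path
# registered in .gitmodules, report whether the directory still looks like a
# repository of its own with at least one commit.
check_submodule_paths() {
    local super="${1:-.}" key path entries status=0
    # Output lines look like: submodule.<name>.path <path>
    entries=$(git config -f "$super/.gitmodules" --get-regexp '^submodule\..*\.path$')
    while read -r key path; do
        [ -n "$path" ] || continue
        if [ -e "$super/$path/.git" ] &&
           git -C "$super/$path" rev-parse --verify --quiet HEAD > /dev/null; then
            echo "ok       $path"
        else
            echo "MISSING  $path"
            status=1
        fi
    done <<EOF
$entries
EOF
    return "$status"
}

# Example: run from the superdataset root before 'datalad save -d . -r'
# check_submodule_paths .
```

Any path reported as MISSING is worth inspecting by hand before running the recursive save.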
Also, you might check whether any other branch somehow survived somewhere: ls -l .git/refs/heads
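A small caveat on that check: git can also store branch refs in .git/packed-refs, in which case they do not show up as files under .git/refs/heads. As a hedged alternative, git for-each-ref lists branches regardless of how they are stored. The wrapper name list_local_branches below is made up for illustration.

```shell
# Hypothetical helper (illustrative name): list every local branch git still
# knows about, with its abbreviated commit id and the subject of its tip commit.
# Works for loose refs under .git/refs/heads and for entries in .git/packed-refs.
list_local_branches() {
    git -C "${1:-.}" for-each-ref \
        --format='%(refname:short) %(objectname:short) %(subject)' refs/heads
}

# Example: run inside the superdataset
# list_local_branches .
```

If a branch such as git-annex shows up here, git log on that branch name shows what history it still carries.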
Looking into the bright future: I use, and recommend that others use, filesystems with snapshot support. In my case it is BTRFS, and I use btrbk to establish daily backup snapshots (and then even transfer them to a remote backup). This way user mistakes can be mitigated easily, since you can get the prior state back from e.g. a day-old snapshot.
Thanks a lot @yarikoptic! The simple datalad save -r approach worked perfectly fine!