Working with datasets with multiple branches

I am interested in understanding if/how DataLad supports datasets with multiple branches. Since datasets are git repositories, I guess it would be possible to use the git branch command, but I don’t know whether this would somehow interfere with DataLad commands or whether DataLad provides special tools for working with multiple branches.

I have read through the Basics part of the handbook and looked for documentation on this, but I couldn’t find any, so please let me know if I missed something.

bad news: DataLad doesn’t care much about branches ATM

good news: because of that, there should be no problem using git branch (and git checkout) directly; it should work just fine (you’d better not check out the git-annex branch, though ;)). The only “pain” would come whenever you need to manipulate super-datasets (git repositories with git submodules, i.e. subdatasets): the git submodule mechanism AFAIK still lacks many useful semantics. E.g. git checkout --recurse-submodules would not (I assume, didn’t check) switch subdatasets to the “desired” branch, since a “branch” in git is just a pointer, and my magic might be your foo ;)
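To illustrate the plain-git part of this, here is a minimal sketch that creates a branch, commits on it, and switches back. It uses only git in a throwaway repository as a stand-in for a dataset; no DataLad commands and no annexed files are involved, and the names `analysis` and `notes.txt` and the temporary path are made up for the example.

```shell
set -eu

# Throwaway repository standing in for a dataset (hypothetical location).
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "you@example.com"
git config user.name "Example"

# One commit on the initial branch.
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Initial commit"

# Remember the starting branch; its name depends on the local git config.
start_branch=$(git rev-parse --abbrev-ref HEAD)

# Create a new branch and switch to it with plain git.
git branch analysis
git checkout -q analysis
echo "v2" > notes.txt
git commit -q -am "Change on analysis branch"
current_branch=$(git rev-parse --abbrev-ref HEAD)

# Switch back; the working tree follows the branch.
git checkout -q "$start_branch"
content_on_start=$(cat notes.txt)
```

After the last checkout, `notes.txt` holds the content committed on the starting branch again. Whether this carries over unchanged to a dataset with subdatasets is exactly the open question raised above.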

ideas/WiP:

edit: all of the above assumes you are on a regular POSIX-compliant filesystem and thus do not need to use adjusted branches for git-annex.
