What is the best way to get data stored in the incoming branch into the incoming-native branch?

I created a dataset using the datalad ukb extension and downloaded some UK Biobank data using the datalad ukb-update command. My data is stored in the incoming branch as ZIP files. I want to have it in the incoming-native branch, which holds the extracted files. What is the best way to do that? Should I do something like merging the two branches?

This should be happening automatically. The incoming branch should have the raw downloads (ZIP and other formats provided by UKB). The `incoming-native` branch has the extracts, see http://docs.datalad.org/projects/ukbiobank/concepts.html#branches

Can you possibly post the exact invocation of ukb-init and ukb-update that you have used? I suspect that something might have gone wrong, if you don’t see any extracts.

I should have explained the situation more accurately. I used ukb-init with the individual ID and field IDs as advised in the documentation, and then used ukb-update with a surrogate ukbfetch to move the already downloaded data from another place into the created dataset folder. All of these operations were done with an older version of datalad-ukb, which resulted in the data being in the incoming branch. When I rerun those commands with the updated version for a new individual, the data is extracted and in the incoming-native branch. So the main question is: how can I get the already moved data, which is in the incoming branch, into incoming-native?

Ah, thanks for the clarification. I think the system is not meant to be able to perform what you desire without further action.

What ukb-update does is rewrite the managed branches completely with every “new” download. So if you only feed it a “partial” update, the result will not include any of the old downloads.

This is done because UKB does not provide any indication of a version. They may change any data record content, data record availability, and even participant availability at any point in time and without notice. Therefore we use a fresh download of all data records for a participant at the time of an update, to have a consistent record matching the time of the update.

So if you really do know that nothing has changed, or you choose not to care about this, you have to take the original ZIPs out of the previous state of the incoming branch and reinject them together with the other, new downloads. Symlinks to the annex should be good enough (but I have not tried that recently).
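As a rough illustration of the "reinject together" step, here is a minimal sketch in Python. It is not part of datalad-ukbiobank, and I have not run it against a real dataset. It assumes you have a checkout of the previous state of the incoming branch with the ZIP content present locally (e.g. after `datalad get`), plus a folder with the new downloads. The function name `stage_downloads` and the three directory arguments are hypothetical. It only builds a staging folder of symlinks from which a surrogate ukbfetch could then serve the files.

```python
from pathlib import Path


def stage_downloads(old_dir, new_dir, staging_dir):
    """Symlink every ZIP from old_dir and new_dir into staging_dir.

    Returns a mapping of file name -> resolved source path. On a name
    clash the file from new_dir wins, since it is the fresher download.
    All three directories are hypothetical locations chosen by the user.
    """
    staging = Path(staging_dir)
    staging.mkdir(parents=True, exist_ok=True)
    staged = {}
    # Process the old files first so that new files can override them.
    for source_dir in (Path(old_dir), Path(new_dir)):
        for src in sorted(source_dir.glob("*.zip")):
            # resolve() follows annex symlinks to the actual content file.
            target = src.resolve()
            if not target.is_file():
                # A dangling link means the content is not available locally.
                raise FileNotFoundError(f"content not available for {src}")
            link = staging / src.name
            if link.is_symlink() or link.exists():
                link.unlink()
            link.symlink_to(target)
            staged[src.name] = target
    return staged
```

The staging folder then holds one link per ZIP, old and new combined, so a single update run would see the complete set rather than a partial one.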

If there is demand for this to become a regular update feature, please file an issue in the datalad/datalad-ukbiobank issue tracker on GitHub.

Thanks for the explanation. Now the behavior of ukb-update is clear to me. I will follow the procedure you suggested.

I tried your suggestion and, again, the data ended up in the incoming branch with nothing in incoming-native. Here are the steps I took:

  1. Copied the data file to a folder outside the repo (simply with cp).
  2. Changed the ukbfetch code to get the file from that folder.
  3. Ran datalad ukb-update on that dataset.

When I try ukb-update on a dataset that is newly created and initialized with ukb-init, it works fine and I end up with the data extracted in the incoming-native branch! Can it be related to the active branch? For the already moved data, the active branch is incoming.