Cannot find latest version when updating #datalad datasets

Hi,

this might be related to this thread, but I have a dataset on OpenNeuro, and the URL currently points to the latest version 1.0.4.

When installing it with datalad, it only fetches version 1.0.3 (not 1.0.4):

datalad install https://github.com/OpenNeuroDatasets/ds001919.git
cd ds001919/
datalad get .
cat CHANGES

Output:

1.0.0 2019-05-14
- Initial release

1.0.1 2019-05-21
- Updated: sub-mniS04 (not the correct subject)

1.0.2 2019-05-22
- Updated: sub-mniS04 (fixed error introduced in 1.0.1) 
- Updated: sub-tehranS04 (not the correct subject)

1.0.3 2019-07-08
  - Added: sub-pavia01, cmrra, cmrrb, ubc, tokyoSkyra(sub-07), tokyo750w(sub-07), tokyoIngenia(sub-07)
  - Fixed participants.tsv for tokyoSkyra

Whereas the latest CHANGES contains this.

Thank you for your help,
Julien
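
The check described above (reading the last release listed in CHANGES and comparing it with the version shown on OpenNeuro) can be scripted. The following is only a minimal sketch: the function names `latest_changes_version` and `changes_is_at` are hypothetical helpers, and it assumes release headers start at the beginning of a line in the form "1.0.3 2019-07-08", as in the output above.

```shell
#!/bin/sh
# Sketch: report the newest release listed in a CHANGES file.
# Assumption: release headers start at column 0 and look like "1.0.3 2019-07-08".

latest_changes_version() {
    # $1 = path to a CHANGES file
    # Keep only the leading version numbers, sort them numerically
    # field by field, and print the highest one.
    grep -Eo '^[0-9]+\.[0-9]+\.[0-9]+' "$1" |
        sort -t. -k1,1n -k2,2n -k3,3n |
        tail -n 1
}

# Hypothetical helper: succeed only if the local copy lists the expected release.
changes_is_at() {
    # $1 = path to CHANGES, $2 = expected version, e.g. 1.0.4
    [ "$(latest_changes_version "$1")" = "$2" ]
}
```

After a download, something like `changes_is_at ds001919/CHANGES 1.0.4 || echo "local copy is stale"` would flag the situation reported here. Note that this only inspects the CHANGES file; it does not verify that the data files themselves match the snapshot.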

Hi @jcohenadad

Thank you for your message! I have raised an issue with our developers. By chance, have you tried the AWS S3 download method as well?

Thank you,
Franklin

have you tried the AWS S3 download method as well?

I have not

Thank you for the quick response!

Hi,

I would like to follow up on the issue reported by @jcohenadad. The issue seems to have been addressed in GH-issue 1200.

I ran the following commands:

datalad install https://github.com/OpenNeuroDatasets/ds001919.git
cd ds001919/
datalad get .

The result is:

get (error: 35, impossible: 1, notneeded: 1, ok: 1626)

You can find the complete output of the terminal here.

The command seems to grab the latest snapshot, but it cannot retrieve the latest .nii.gz and .json files.

Thank you very much for your help,
Alexandru
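
For anyone scripting this download, the summary line printed by `datalad get` can be checked automatically instead of read by eye. The following is a rough sketch that assumes the summary has exactly the format shown above; `summary_count` is a hypothetical helper, not a datalad command.

```shell
#!/bin/sh
# Sketch: pull one status count out of a datalad summary line such as
#   get (error: 35, impossible: 1, notneeded: 1, ok: 1626)
# summary_count is a hypothetical helper, not part of datalad.

summary_count() {
    # $1 = status name (error, impossible, notneeded, ok)
    # $2 = the summary line
    # Prints the number following "<status>: ", or nothing if the status is absent.
    printf '%s\n' "$2" | sed -n "s/.*[ (]$1: \([0-9][0-9]*\).*/\1/p"
}
```

A wrapper script could then treat a non-empty, non-zero `error` count as a failed download rather than silently continuing with an incomplete dataset.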

Hi @alexfoias

Thank you for raising this! Looking at your output, it appears to be associated with this issue. We are working on enhancing our GitHub remote audits to ensure that the latest version can be downloaded using Datalad (I have added this dataset to the list for confirming that the GitHub remotes hold the most up-to-date snapshot). We also support downloading from S3:

aws s3 sync --no-sign-request s3://openneuro.org/ds001919 ds001919-download/

Thank you,
Franklin

Hi,

I’ve tried with aws and the downloaded dataset is not the latest.

Experiment done on 2019-09-19 at 14:59:24 from OSX 10.14.6:

aws s3 sync --no-sign-request s3://openneuro.org/ds001919 ds001919-aws
cd ds001919-aws
more CHANGES

Problem: this is only version 1.0.3

1.0.0 2019-05-14
- Initial release

1.0.1 2019-05-21
- Updated: sub-mniS04 (not the correct subject)

1.0.2 2019-05-22
- Updated: sub-mniS04 (fixed error introduced in 1.0.1)
- Updated: sub-tehranS04 (not the correct subject)

1.0.3 2019-07-08
- Added: sub-pavia01, cmrra, cmrrb, ubc, tokyoSkyra(sub-07), tokyo750w(sub-07), tokyoIngenia(sub-07)
- Fixed participants.tsv for tokyoSkyra

Whereas on https://openneuro.org/datasets/ds001919/versions/1.0.4, the same file gives:

1.0.0 2019-05-14
- Initial release

1.0.1 2019-05-21
- Updated: sub-mniS04 (not the correct subject)

1.0.2 2019-05-22
- Updated: sub-mniS04 (fixed error introduced in 1.0.1) 
- Updated: sub-tehranS04 (not the correct subject)

1.0.3 2019-07-08
- Added: sub-pavia01, cmrra, cmrrb, ubc, tokyoSkyra(sub-07), tokyo750w(sub-07), tokyoIngenia(sub-07)
- Fixed participants.tsv for tokyoSkyra

1.0.4 2019-07-17
  - - Added: sub-pavia02, sub-pavia03, sub-pavia04, sub-pavia05, sub-pavia06

Also, is there any update on this issue regarding datalad download?

Many thanks,
Julien

Hi @jcohenadad

Thank you for your message and the additional information! I apologize for the inconvenience. This issue may be linked to the pending validation (we can retrigger the validation on your dataset). We have been working on enhancing our validation process for larger datasets.

Regarding the datalad download issue: enhancing the GitHub auditing functionality is on our log, to make sure that datalad can function properly with our system.

Thank you,
Franklin

Hi @franklin,

We have managed to create more snapshots of https://openneuro.org/datasets/ds001919, but AWS still seems to download snapshot 1.0.5. In addition, the GitHub repository seems to be stuck at 1.0.5.

Could you please help us with this issue?

Thanks,
Alexandru

Hi @alexfoias

Thank you for raising this. We are tracking other instances of this occurring. Our backend storage configuration is in the later stages of migration and settling in. Once our new configuration is completed we’ll be able to remedy this. The snapshot available does correspond with the paper submission.

I apologize for these inconveniences.

Thank you,
Franklin

Hi @franklin,

We have also tried the openneuro cli to grab the latest snapshot but it failed.
Here is my terminal:

(base) alfoi_admin@rosenberg:~$ openneuro download --snapshot 1.0.8 ds001919 ds001919-download/
/usr/lib/node_modules/openneuro-cli/src/config.js:1
TypeError [ERR_INVALID_ARG_TYPE]: The "path" argument must be of type string or an instance of Buffer or URL. Received null
    at Object.openSync (fs.js:450:10)
    at Proxy.readFileSync (fs.js:360:35)
    at readConfig (/usr/lib/node_modules/openneuro-cli/src/config.js:14:13)
    at getUrl (/usr/lib/node_modules/openneuro-cli/src/config.js:45:29)
    at getDownloadMetadata (/usr/lib/node_modules/openneuro-cli/src/download.js:41:22)
    at getDownload (/usr/lib/node_modules/openneuro-cli/src/download.js:56:3)
    at Command.download (/usr/lib/node_modules/openneuro-cli/src/actions.js:211:5)
    at Proxy.listener (/usr/lib/node_modules/openneuro-cli/node_modules/commander/index.js:315:8)
    at Proxy.emit (events.js:315:20)
    at Proxy.EventEmitter.emit (domain.js:482:12) {
  code: 'ERR_INVALID_ARG_TYPE'
}
(base) alfoi_admin@rosenberg:~$ openneuro -V
3.15.0-alpha.1

I even updated the cli to the latest version, but it seems to have a similar behaviour:

(base) alfoi_admin@rosenberg:~$ openneuro download --snapshot 1.0.8 ds001919 ds001919-download/
/usr/lib/node_modules/openneuro-cli/src/config.js:1
TypeError [ERR_INVALID_ARG_TYPE]: The "path" argument must be of type string or an instance of Buffer or URL. Received null
    at Object.openSync (fs.js:450:10)
    at Proxy.readFileSync (fs.js:360:35)
    at readConfig (/usr/lib/node_modules/openneuro-cli/src/config.js:14:13)
    at getUrl (/usr/lib/node_modules/openneuro-cli/src/config.js:45:29)
    at getDownloadMetadata (/usr/lib/node_modules/openneuro-cli/src/download.js:41:22)
    at getDownload (/usr/lib/node_modules/openneuro-cli/src/download.js:56:3)
    at Command.download (/usr/lib/node_modules/openneuro-cli/src/actions.js:211:5)
    at Proxy.listener (/usr/lib/node_modules/openneuro-cli/node_modules/commander/index.js:315:8)
    at Proxy.emit (events.js:315:20)
    at Proxy.EventEmitter.emit (domain.js:482:12) {
  code: 'ERR_INVALID_ARG_TYPE'
}
(base) alfoi_admin@rosenberg:~$ openneuro -V
3.15.1

Is there a way of grabbing the latest snapshot using command line? We’re currently stuck in processing the data on our remote servers due to this issue.

The possible solution suggested here doesn’t work because the GitHub repository hasn’t been updated in 28 days.

Your help will be greatly appreciated.

Regards,
Alexandru

As a reference: This dataset can now be found at ds002902