download-url from a password-protected site

I’m trying to use datalad download-url with authentication and cannot get it to work. I do not know the details of the authentication on the site, but my guess is that it is simple HTTP basic auth, because the following works:

$ wget --http-user=<user> --http-password=<pass> https://n5eil01u.ecs.nsidc.org/ICEBRIDGE/IDBMG4.004/1993.01.01/BedMachineGreenland-2021-04-20.nc.xml

with results:

--2021-07-29 11:55:23--  https://n5eil01u.ecs.nsidc.org/ICEBRIDGE/IDBMG4.004/1993.01.01/BedMachineGreenland-2021-04-20.nc.xml
Resolving n5eil01u.ecs.nsidc.org (n5eil01u.ecs.nsidc.org)... 128.138.97.102
Connecting to n5eil01u.ecs.nsidc.org (n5eil01u.ecs.nsidc.org)|128.138.97.102|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://urs.earthdata.nasa.gov/oauth/authorize?app_type=401&client_id=_JLuwMHxb2xX6NwYTb4dRA&response_type=code&redirect_uri=https%3A%2F%2Fn5eil01u.ecs.nsidc.org%2FOPS%2Fredirect&state=aHR0cDovL241ZWlsMDF1LmVjcy5uc2lkYy5vcmcvSUNFQlJJREdFL0lEQk1HNC4wMDQvMTk5My4wMS4wMS9CZWRNYWNoaW5lR3JlZW5sYW5kLTIwMjEtMDQtMjAubmMueG1s [following]
--2021-07-29 11:55:24--  https://urs.earthdata.nasa.gov/oauth/authorize?app_type=401&client_id=_JLuwMHxb2xX6NwYTb4dRA&response_type=code&redirect_uri=https%3A%2F%2Fn5eil01u.ecs.nsidc.org%2FOPS%2Fredirect&state=aHR0cDovL241ZWlsMDF1LmVjcy5uc2lkYy5vcmcvSUNFQlJJREdFL0lEQk1HNC4wMDQvMTk5My4wMS4wMS9CZWRNYWNoaW5lR3JlZW5sYW5kLTIwMjEtMDQtMjAubmMueG1s
Resolving urs.earthdata.nasa.gov (urs.earthdata.nasa.gov)... 2001:4d0:241a:4081::89, 198.118.243.33
Connecting to urs.earthdata.nasa.gov (urs.earthdata.nasa.gov)|2001:4d0:241a:4081::89|:443... connected.
HTTP request sent, awaiting response... 401 Unauthorized
Authentication selected: Basic realm="Please enter your Earthdata Login credentials. If you do not have a Earthdata Login, create one at https://urs.earthdata.nasa.gov//users/new"
Reusing existing connection to [urs.earthdata.nasa.gov]:443.
HTTP request sent, awaiting response... 302 Found
Location: https://n5eil01u.ecs.nsidc.org/OPS/redirect?code=6a0268b78283b8033755fa83b158b939dac3bd2a92e697622b7a13e296f89cde&state=aHR0cDovL241ZWlsMDF1LmVjcy5uc2lkYy5vcmcvSUNFQlJJREdFL0lEQk1HNC4wMDQvMTk5My4wMS4wMS9CZWRNYWNoaW5lR3JlZW5sYW5kLTIwMjEtMDQtMjAubmMueG1s [following]
--2021-07-29 11:55:24--  https://n5eil01u.ecs.nsidc.org/OPS/redirect?code=6a0268b78283b8033755fa83b158b939dac3bd2a92e697622b7a13e296f89cde&state=aHR0cDovL241ZWlsMDF1LmVjcy5uc2lkYy5vcmcvSUNFQlJJREdFL0lEQk1HNC4wMDQvMTk5My4wMS4wMS9CZWRNYWNoaW5lR3JlZW5sYW5kLTIwMjEtMDQtMjAubmMueG1s
Connecting to n5eil01u.ecs.nsidc.org (n5eil01u.ecs.nsidc.org)|128.138.97.102|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://n5eil01u.ecs.nsidc.org/ICEBRIDGE/IDBMG4.004/1993.01.01/BedMachineGreenland-2021-04-20.nc.xml [following]
--2021-07-29 11:55:25--  https://n5eil01u.ecs.nsidc.org/ICEBRIDGE/IDBMG4.004/1993.01.01/BedMachineGreenland-2021-04-20.nc.xml
Connecting to n5eil01u.ecs.nsidc.org (n5eil01u.ecs.nsidc.org)|128.138.97.102|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 4061 (4.0K) [application/xml]
Saving to: ‘BedMachineGreenland-2021-04-20.nc.xml’

BedMachineGreenland-2021-04-20.nc.x 100%[================================================================>]   3.97K  --.-KB/s    in 0.04s   

2021-07-29 11:55:25 (110 KB/s) - ‘BedMachineGreenland-2021-04-20.nc.xml’ saved [4061/4061]
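
Reading the log above, the data host redirects to urs.earthdata.nasa.gov, which answers 401 with a Basic challenge, and once credentials are sent it redirects back to the data host with a code. Below is a minimal sketch of that same flow using only the Python standard library. It has not been run against the server; the helper name build_earthdata_opener is hypothetical, and the cookie jar is an assumption (the data host probably sets a session cookie after the /OPS/redirect step).

```python
import urllib.request
from http.cookiejar import CookieJar

# Login host taken from the "Location:" lines in the wget log.
URS_HOST = "https://urs.earthdata.nasa.gov"


def build_earthdata_opener(user, password):
    """Hypothetical helper: build an opener that answers Basic auth
    challenges only for URS_HOST and keeps cookies across redirects.

    Returns (opener, password_manager).
    """
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # realm=None means "use these credentials for any realm on this host".
    password_mgr.add_password(None, URS_HOST, user, password)

    opener = urllib.request.build_opener(
        # Retries a request with an Authorization header after a 401.
        urllib.request.HTTPBasicAuthHandler(password_mgr),
        # Assumption: a session cookie is needed once we are redirected back.
        urllib.request.HTTPCookieProcessor(CookieJar()),
    )
    # Redirects (the 302 responses in the log) are followed by the
    # default HTTPRedirectHandler that build_opener installs.
    return opener, password_mgr
```

With real credentials, opener.open() on the .nc.xml URL from the wget command would be the call to try; whether it ends in the same 200 OK as wget is something I have not verified.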

When I try to set this up in DataLad, it does not work. I’ve tried (among other things) the following provider configuration file. I know the url_re should be tighter, but for now I am keeping it global in case authentication happens at some third-party site with a different URL.

# Provider configuration file created to initially access
# https://n5eil01u.ecs.nsidc.org/ICEBRIDGE/IDBMG4.004/1993.01.01/BedMachineGreenland-2021-04-20.nc.xml

[provider:NSIDC]
url_re = .*
authentication_type = http_basic_auth
# Note that you might need to specify additional fields specific to the
# authenticator.  For now "look into the docs/source" of <class 'datalad.downloaders.http.HTTPBasicAuthAuthenticator'>
# http_basic_auth_
credential = NSIDC

[credential:NSIDC]
# If known, specify URL or email to how/where to request credentials
# url = ???
type = user_password
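
For reference, a tighter url_re could be checked offline before it goes into the config. The sketch below assumes that DataLad applies url_re with re.match-style semantics (anchored at the start of the URL), which I have not confirmed. The pattern is a hypothetical one that covers only the two hosts seen in the wget log.

```python
import re

# Hypothetical tighter pattern: the data host and the login host from the log.
URL_RE = r"https://(n5eil01u\.ecs\.nsidc\.org|urs\.earthdata\.nasa\.gov)/.*"


def provider_matches(url):
    """Return True if the URL would be claimed by the pattern above.

    Assumes re.match semantics (match anchored at the start of the string).
    """
    return re.match(URL_RE, url) is not None


data_url = ("https://n5eil01u.ecs.nsidc.org/ICEBRIDGE/IDBMG4.004/"
            "1993.01.01/BedMachineGreenland-2021-04-20.nc.xml")
login_url = "https://urs.earthdata.nasa.gov/oauth/authorize?app_type=401"

print(provider_matches(data_url))
print(provider_matches(login_url))
print(provider_matches("https://example.com/file.nc"))
```

If the redirect to the login host turns out to be handled outside the provider lookup, the second alternative in the pattern would be unnecessary.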

Is there some other way to set up http-auth, or can someone tell from the wget output that http_basic_auth is not the correct method?

Thanks,

-k.

eh, that is why wget (or was it curl or both) got into space and DataLad “not yet”.

Hi Yarik,

Wow - thank you for digging into this so thoroughly. I’ll follow up on the GitHub ticket, datalad/datalad issue #5846: “We are happily downloading authentication page even though 401 is returned”.