I’m trying to query the MRIQC API (e.g., https://mriqc.nimh.nih.gov/api/v1/T1w) and I’m getting an “HTTP Error 502: Bad Gateway”. I am able to connect to the server itself (https://mriqc.nimh.nih.gov).
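For context, a minimal sketch of the kind of request that fails, using plain urllib (nothing fancier than a bare GET):

```
import urllib.request

# A bare GET against the T1w endpoint; this currently raises
# urllib.error.HTTPError: HTTP Error 502: Bad Gateway.
with urllib.request.urlopen('https://mriqc.nimh.nih.gov/api/v1/T1w') as response:
    print(response.status)
```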
Does anybody know if the API server is down?
Thanks.
Looks like there is an issue with its database connection. Will let you know when it’s back online.
Should be up now, let me know if you see any issues with it.
It works now.
Thanks!
Hi,
I am also encountering HTTP Error 502 (Bad Gateway) and HTTP Error 504 (Gateway Time-out) errors when trying to query the API. I’ve tried several times over the past few weeks, and the errors don’t consistently occur on the same pages. I’m wondering whether these issues could also be due to the server being down or unstable.
Does anyone have recommendations on handling these errors effectively or on improving the connection to the server? Any advice would be greatly appreciated.
I am using the function below, adapted from here.
```
import json
import urllib.error
import urllib.request

import pandas as pd


def get_iqms_despite_errors(modality, versions=None, software='mriqc', page_limit=None):
    """
    Grab all IQMs for the given modality and the list of versions.
    """
    print(f"Running query for {modality}")
    url_root = 'https://mriqc.nimh.nih.gov/api/v1/{modality}?{query}'
    dfs = []
    if versions is None:
        versions = ['*']
    for version in versions:
        page = 1  # Reset pagination for each version
        while True:
            if page_limit is not None and page > page_limit:
                print("Reached specified page limit (page_limit)")
                break
            query = []
            if software is not None:
                query.append('"provenance.software":"%s"' % software)
            if version != '*':
                query.append('"provenance.version":"%s"' % version)
            page_url = url_root.format(
                modality=modality, query='where={%s}&page=%d' % (','.join(query), page)
            )
            print(f"Fetching {page_url}")
            try:
                with urllib.request.urlopen(page_url) as url:
                    data = json.loads(url.read().decode())
                dfs.append(pd.json_normalize(data['_items']))
                # Stop when there is no 'next' link (i.e., the last page)
                if 'next' not in data['_links']:
                    break
                page += 1  # Continue to the next page
            except urllib.error.HTTPError as e:
                print(f"HTTP error on page {page}: {e}")
                # Skip this page and move on; note that without a page_limit,
                # persistent errors will keep the loop running.
                page += 1
            except Exception as e:
                print(f"Other error on page {page}: {e}")
                page += 1  # Skip this page and move on
    if not dfs:
        return pd.DataFrame()  # Nothing was fetched successfully
    # Compose a pandas dataframe
    return pd.concat(dfs, ignore_index=True)
```
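For reference, a hypothetical invocation, with page_limit kept small while debugging and an illustrative version string:

```
df = get_iqms_despite_errors('T1w', versions=['22.0.6'], page_limit=10)
print(df.shape)
```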
The server has been hit pretty hard by a few clients crawling it. I updated the Python server’s worker model to be more asynchronous, to hopefully survive timeouts from certain requests better.
Putting a timeout/sleep between urllib.request.urlopen calls may help reduce the number of errors you’re seeing. I haven’t implemented any server-side throttling so far, so hugging the server to death isn’t too hard.
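On the client side, that could look something like this (a rough sketch; the retry count, delay, and backoff factor are arbitrary illustrative values):

```
import time
import urllib.error
import urllib.request


def fetch_with_retry(page_url, retries=3, delay=1.0):
    """Fetch a URL, retrying on 502/504 with an increasing sleep between tries."""
    for attempt in range(retries):
        try:
            with urllib.request.urlopen(page_url) as response:
                return response.read().decode()
        except urllib.error.HTTPError as e:
            # Re-raise anything that isn't a gateway error, or the final failure
            if e.code not in (502, 504) or attempt == retries - 1:
                raise
            time.sleep(delay * (2 ** attempt))  # Back off: 1 s, 2 s, 4 s, ...
```

Sleeping for a second or so between successive pages, even on success, would also lighten the load.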
Let me know if you see any reduction in 502s or 504s; I’m curious to see whether my configuration changes made any difference.