I’m trying to create a searchable spreadsheet/CSV with the metadata of almost all available datasets, and I’m wondering whether anything similar has been done before.
For example, I’d like a CSV with one entry for most available datasets. Each entry would include, wherever applicable: modality, dataset size, organism, n_subjects, ages, condition(s), duration, brain region, license, authors, paper, and maybe a few other things. This would allow searching across these datasets without checking many different repositories, and it would differ from KnowledgeSpace by showing the metadata in a condensed view that can be aggregated into pivot tables, etc. There are some metadata CSVs around, e.g. from OpenNeuro and Brain Library, but crucially they don’t include dataset size. So it takes quite a while just to check whether a single type of data exists, much less how much data exists for each of several different specifications.
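To make the idea concrete, here is a minimal sketch of the kind of table and aggregation I have in mind. The column names are just my guess at a schema based on the fields above, the rows are made-up placeholders (not real datasets), and the helper names `write_catalog` and `total_size_by` are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical column names, loosely based on the fields listed above.
FIELDS = ["dataset", "repository", "modality", "size_gb",
          "organism", "n_subjects", "license"]

# Made-up placeholder rows for illustration only; not real datasets.
ROWS = [
    {"dataset": "example-a", "repository": "repo-1", "modality": "fMRI",
     "size_gb": 120.0, "organism": "human", "n_subjects": 30, "license": "CC0"},
    {"dataset": "example-b", "repository": "repo-1", "modality": "fMRI",
     "size_gb": 80.5, "organism": "human", "n_subjects": 12, "license": "CC-BY"},
    {"dataset": "example-c", "repository": "repo-2", "modality": "EM",
     "size_gb": 2000.0, "organism": "mouse", "n_subjects": 1, "license": "CC-BY"},
]


def write_catalog(rows):
    """Serialize the metadata rows to CSV text, one line per dataset."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def total_size_by(csv_text, key):
    """Sum dataset size (GB) per value of `key`: a one-column pivot table."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row[key]] += float(row["size_gb"])
    return dict(totals)


catalog_csv = write_catalog(ROWS)
size_by_modality = total_size_by(catalog_csv, "modality")
print(size_by_modality)
```

With dataset size as a column, a question like "how much data exists per modality or per organism" becomes a single aggregation rather than a repository-by-repository search.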
Does anyone know of other attempts at this task, or of similar tools or collections that have already been put together?
Thanks,
Connor