Mentors: Tom Gillespie <tgbugs@gmail.com> and Jeff Grethe <jeffrey.grethe@gmail.com>
Skill level: Intermediate or greater
Required skills: Python
Time commitment: Full time (350 hours)
About: A number of data standards are in active use in the neuroscience community. A long-standing goal is to enable conversion between these standard formats, both to increase the visibility of datasets across platforms and to make it possible to leverage existing tooling that expects a different format. Examples of these standards include the Brain Imaging Data Structure (BIDS), the SPARC Dataset Structure (SDS), openMINDS, and DANDI.
Aims: In this project you will learn about tools for specifying data standards (such as LinkML) and use them to create mappings between SDS and BIDS. At the same time, you will learn to build data pipelines in Python that convert from one standard to another using the mappings specified in LinkML. As the project progresses, additional standards can be added to the converter and more complete mappings from one data standard to another can be pursued.
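As a rough illustration of the kind of mapping-driven conversion described in the aims, the sketch below applies a small field mapping to a single metadata record. It is a minimal sketch under stated assumptions, not the project's design: the field names, the `sub-` prefix rule, the `FIELD_MAP` table, and the `convert_record` helper are all hypothetical and are not taken from the SDS or BIDS specifications. In the project itself the mapping would be derived from the LinkML schemas rather than written by hand.

```python
from typing import Any, Callable

# Hypothetical mapping: source field -> (target field, value transform).
# In the actual project this table would be generated from LinkML schemas;
# the names and rules below are illustrative assumptions only.
FIELD_MAP: dict[str, tuple[str, Callable[[Any], Any]]] = {
    "subject_id": ("participant_id", lambda v: f"sub-{v}"),
    "sex": ("sex", lambda v: str(v).lower()),
    "age": ("age", lambda v: v),
}


def convert_record(
    record: dict[str, Any],
    field_map: dict[str, tuple[str, Callable[[Any], Any]]] = FIELD_MAP,
) -> tuple[dict[str, Any], list[str]]:
    """Convert one record and report the source fields that had no mapping."""
    converted: dict[str, Any] = {}
    unmapped: list[str] = []
    for key, value in record.items():
        if key in field_map:
            target, transform = field_map[key]
            converted[target] = transform(value)
        else:
            # Keep track of fields the mapping does not cover yet.
            unmapped.append(key)
    return converted, unmapped


# Hypothetical source record with one field the mapping does not cover.
source_record = {"subject_id": "001", "sex": "Male", "age": "12 weeks", "pool_id": "p1"}
target_record, skipped = convert_record(source_record)
print(target_record)
print(skipped)
```

Returning the unmapped fields alongside the converted record makes gaps in the mapping visible, which is useful when a mapping between two standards is extended incrementally.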
Websites:
- Information about the SPARC Dataset Structure (SDS): https://www.incf.org/sparc-data-structure
- Information about the Brain Imaging Data Structure (BIDS): https://www.incf.org/sbp/brain-imaging-data-structure-bids
- Initial LinkML model for the SPARC Dataset Structure: resources/linkml directory (master branch) of the SciCrunch/sparc-curation repository on GitHub
- YAML schema specification for the Brain Imaging Data Structure: src/schema directory (master branch) of the bids-standard/bids-specification repository on GitHub
Tech keywords: Python, YAML, LinkML, data standards, BIDS, SPARC, SDS, JSON Schema