GSoC 2025 Project #31 UCSD Projects :: Development of data standards interchange via LinkML (350h)

Mentors: Tom Gillespie <tgbugs@gmail.com> and Jeff Grethe <jeffrey.grethe@gmail.com>

Skill level: Intermediate or greater

Required skills: Python

Time commitment: Full time (350 hours)

Forum for discussion

About: A number of data standards are actively in use in the neuroscience community. A long-standing goal is to enable conversion between these standard formats, both to increase the visibility of datasets across platforms and to make it possible to leverage existing tooling that expects a different format. Examples of these standards include Brain Imaging Data Structure (BIDS), SPARC Dataset Structure (SDS), openMINDS, and DANDI.

Aims: In this project you will learn about tools for specifying data standards (such as LinkML) and use them to create mappings between SDS and BIDS. At the same time, you will learn to build data pipelines in Python that convert from one standard to another using the mappings specified in LinkML. As the project progresses, additional standards can be added to the converter and more complete mappings between standards can be pursued.
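
As a rough illustration of the mapping-driven conversion described in the aims, the sketch below renames fields of an SDS-style subject record into a BIDS-style participants row. It is only a minimal sketch under stated assumptions: the mapping table is hard-coded here, whereas in the project it would presumably be derived from LinkML schemas; the field names, `SDS_TO_BIDS`, `to_bids_participant_id`, and `convert_subject` are hypothetical and should be checked against the actual SDS and BIDS specifications.

```python
# Hypothetical field mapping from an SDS-style subject record to a
# BIDS-style participants row. All names are illustrative assumptions;
# a real converter would load the mapping from LinkML schemas instead.
SDS_TO_BIDS = {
    "subject id": "participant_id",
    "species": "species",
    "sex": "sex",
    "age": "age",
}


def to_bids_participant_id(subject_id: str) -> str:
    """Build a 'sub-<label>' identifier, keeping only alphanumeric characters."""
    label = "".join(ch for ch in subject_id if ch.isalnum())
    return f"sub-{label}"


def convert_subject(sds_record: dict) -> dict:
    """Rename mapped fields and drop any field the mapping does not cover."""
    bids_row = {}
    for sds_key, bids_key in SDS_TO_BIDS.items():
        if sds_key in sds_record:
            bids_row[bids_key] = sds_record[sds_key]
    if "participant_id" in bids_row:
        bids_row["participant_id"] = to_bids_participant_id(
            str(bids_row["participant_id"])
        )
    return bids_row


# Made-up example record; "pool id" has no entry in the mapping and is dropped.
example = {
    "subject id": "rat-01",
    "species": "Rattus norvegicus",
    "sex": "female",
    "pool id": "p1",
}
print(convert_subject(example))
```

Keeping the mapping as data rather than code is the point of the sketch: supporting another standard would then mean adding another mapping table, not rewriting the conversion logic.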

Websites:

Tech keywords: Python, YAML, LinkML, data standards, BIDS, SPARC, SDS, JSON Schema

Hi @arnab1896,

My name is Tushar Jamdade, and I am a second-year student at Vishwakarma Institute of Information Technology (VIIT), Pune. I have experience with Python, TypeScript, and JavaScript, as well as hands-on experience with machine learning using Python libraries such as scikit-learn and TensorFlow. I have built several ML projects in Jupyter Notebook and Google Colab.

Beyond ML, I am proficient in web development and well-versed in frameworks such as React.js, Next.js, Express.js, and Node.js. I have also worked with databases such as Firebase and MongoDB, as well as SQL databases.

I am particularly interested in working on the Development of Data Standards Interchange via LinkML project, as I have been studying Brain Imaging Data Structure (BIDS), SPARC Dataset Structure (SDS), openMINDS, and DANDI and have gained a good understanding of their structures.

I have a question regarding the project:

Are there any other data standards, apart from BIDS, SDS, openMINDS, and DANDI, that we should also consider working on?

Thank you,
Tushar Jamdade