GSoC 2025 Project #39 EBrains :: Build an automated data registration workflow for submitting BIDS-compliant datasets to the EBRAINS Knowledge Graph (350h)

Mentors: Cyril Pernet cyril.pernet@nru.dk; Lyuba Zehl <Lyuba.Zehl@ebrains.eu>; Oliver Schmid; Sophia Pieschnik; Peyman Najafi

Skill level: junior+, mid

Required skills: Python and git required. Familiarity with data formats like BIDS, JSON, YAML, and CSV. Basic understanding of neuroimaging modalities (e.g., MRI, EEG, MEG).

Time commitment: Large (350h)

Forum for discussion

About: The project aims to automate the submission of BIDS datasets to the EBRAINS Knowledge Graph using the bids2openminds converter. The common data submission workflow of the EBRAINS research infrastructure currently assumes an individual (human) data provider; an automated data submission workflow from another data registry to EBRAINS is not yet established. If such a data registry enforces BIDS as its data model, an automated workflow could be established using the bids2openminds converter. This enhancement for the EBRAINS-RI would greatly facilitate the integration of existing, validated BIDS datasets into the EBRAINS Knowledge Graph.

The data registration workflow will be built on two established standards: BIDS and openMINDS. BIDS is the community standard to organize, name, and share data. openMINDS is the EBRAINS metadata schema underlying the Knowledge Graph. Recently, openMINDS released a prototype of a bids2openminds converter that should be used as the base for this project. Although the workflow should also allow isolated data registration by individual researchers, the aim is to enable automated registration of BIDS-compliant data collections from established health data platforms in the EBRAINS Knowledge Graph, to increase their visibility in the research community. As a test bed, a data registration workflow between publicneuro.eu and EBRAINS should be established.

The contributor will work in close collaboration with developers from publicneuro.eu, the EBRAINS Knowledge Graph, the EBRAINS Curation, and openMINDS.

Aims:

  • Minimal set of deliverables:
    • Establish an authentication procedure for the Knowledge Graph that allows the data submitter to be recognized
    • Establish automated ingestion of data from publicneuro.eu into the EBRAINS KG using the bids2openminds converter (prototype)
    • Establish automated ingestion of data from publicneuro.eu into the EBRAINS KG using the bids2openminds converter (pilot implementation and testing)
    • Documentation
    • Tutorial demonstration
  • Additional ‘if time allows’ deliverables (optional)
    • Batch mode, to submit many datasets at once, rather than serially
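As a rough illustration of the optional batch mode, here is a minimal sketch of how several datasets could be processed in one run. It is not part of the project description: `register_batch`, `is_bids_root`, and the `convert` and `submit` callables are hypothetical names. In a real workflow `convert` would wrap the bids2openminds converter and `submit` would wrap an authenticated upload to the EBRAINS Knowledge Graph; both are left as injected placeholders here. The only BIDS fact relied on is that a dataset has a `dataset_description.json` file at its root.

```python
from pathlib import Path
from typing import Any, Callable, Dict, Iterable


def is_bids_root(path: Path) -> bool:
    """Cheap sanity check: BIDS requires dataset_description.json at the root."""
    return (path / "dataset_description.json").is_file()


def register_batch(
    dataset_dirs: Iterable[Path],
    convert: Callable[[Path], Any],   # hypothetical: wraps the bids2openminds converter
    submit: Callable[[Any], None],    # hypothetical: wraps the Knowledge Graph upload
) -> Dict[str, str]:
    """Convert and submit each dataset; one failure does not stop the batch."""
    results: Dict[str, str] = {}
    for dataset_dir in dataset_dirs:
        dataset_dir = Path(dataset_dir)
        if not is_bids_root(dataset_dir):
            results[dataset_dir.name] = "skipped: no dataset_description.json"
            continue
        try:
            metadata = convert(dataset_dir)
            submit(metadata)
            results[dataset_dir.name] = "registered"
        except Exception as exc:  # record the error and move on to the next dataset
            results[dataset_dir.name] = f"failed: {exc}"
    return results
```

Passing the converter and the submission step in as callables keeps the loop testable without network access and independent of how authentication is eventually implemented.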

Website:

Tech keywords: Python, BIDS, big data, knowledge graphs

Dear Mentors,

My name is Sodir, a bachelor's student in Engineering Technology - Electronics ICT at KU Leuven, preparing for my master's. This project seemed interesting to me; however, I do not have any experience working with neuroimaging modalities or the BIDS data format. I do believe in my ability to learn more about this format and implement the necessary deliverables, but I was wondering whether this would be a limiting factor.

Kind regards,
Sodir

Hey Mentors,

I am Akshat, a master's student in Computer Science at the University of Pittsburgh, USA. This project caught my attention because the labs where I am currently working, SNS Lab and Partha Lab, require a similar pipeline to submit BIDS datasets to neuro knowledge databases. I have experience handling BIDS datasets and converting raw BrainVision EEG and MRI data into BIDS format, and I believe this could be mutually beneficial for all parties. Although I am not yet familiar with openMINDS and the EBRAINS Knowledge Graph, I believe my computer science background would suffice to establish a pipeline and tackle that issue.

Kindly let me know your thoughts on this. I look forward to collaborating and learning more about this. Sharing my resume for your reference.
Email : akshatdhamale@gmail.com

Best Regards,
Akshat Dhamale

Dear Sodir, dear Akshat, we are glad to hear you are interested in our project. The project indeed focuses on establishing an automated registration pipeline from the data model (BIDS) of one system (e.g., publicneuro) to the metadata model (openMINDS) of the EBRAINS graph database (Knowledge Graph), likely using https://openid.net/. Therefore, only basic knowledge of the content of the data (neuroimaging) is desired. Both BIDS and openMINDS have documentation available that helps build the base knowledge required for this project. We would suggest going through it prior to the project. What matters most are good programming skills, collaboration skills, and a willingness to learn, understand, and then adopt existing technologies for the project.

Thank you for your interest in the project
Cyril, on behalf of the team

Hi!

I hope you’re doing well. I was going through the 2025 idea list (GSoC 2025 Project Descriptions - Google Docs), and I noticed that there are 33 projects listed in the document. However, I saw that on Neurostars there are also some additional projects (e.g., #34, #35, #36, #37…).

I was wondering whether these new projects are officially part of INCF’s GSoC 2025 ideas. Can we also apply for them, or should we only choose from the original 33?

Thank you in advance! :blush:

Hi @Cyril_Pernet , Lyuba, Oliver, Sophia, and Peyman,

I’m Zahir (@Zahhiiir on GitHub), excited to contribute to the GSoC 2025 INCF project on automating BIDS dataset submission to EBRAINS. I’ve forked and cloned bids-standard/bids-website (my fork: Zahhiiir/bids-website) to get familiar with BIDS, but I’m hitting a snag. When I run npm install, I get an ENOENT error saying package.json is missing, even though package-lock.json is present. The repo seems MkDocs-based (with mkdocs.yml), so I’m unsure if npm is needed or if I should focus on Python setup instead.

Could you clarify the correct setup for bids-website? Also, for the bids2openminds converter, is there a repo I can start exploring? I’d love to test it with a sample like ds005. My goal is to dive into the automation workflow soon!

Thanks,
Zahhiiir

Dear Cyril,

My apologies for the late response. I thought Neurostars would notify me when a thread I’m in gets a response, but it did not. I will look into these data models and how to go about creating these pipelines. I will briefly cover this in my proposal, if that is fine.

With kind regards,
Sodir

Dear @Renqing_Cuomao

Yes, you can also apply for project ideas #34 - #39. These ideas were added slightly later, but are still eligible for GSoC 2025.

Good luck with your application!

thank you so much Katya! :slight_smile:

Dear @sydon1

of course it would be fully fine if you include this in your proposal.

Best, Lyuba

Dear @Zahir

please find here some more useful links (incl. the link to the bids2openminds repo):

For the issue you ran into with BIDS, please raise an issue on BIDS.
They provide very good support.
(@Cyril_Pernet unless you know the error and would be able to provide a hint)

Best, Lyuba

Thank you so much @LyubaZ. Also, I would like to submit my proposal. Is it okay if I share it here, or should I email it to you personally? Please let me know which way is best.

@Zahir I think it is best if you send an email to me and @Cyril_Pernet.

Best, Lyuba

Dear @Zahir,

Please complete the INCF application template and submit your proposal through the Google Summer of Code website.

Link to the INCF application template:

Please scroll down until you see “Register to become a contributor”:

Good luck with your application!