GSoC 2024 Project Idea 4.1 Graph Neural Networks (OpenWorm Foundation) (350 h)

Graph neural networks (GNNs) are a potentially powerful method for discovering connectivity in geometrically complex datasets. The DevoWorm group has developed an open-source GNN framework for embryogenetic data called DevoGraph. Developmental GNNs (D-GNNs) allow us to characterize a growing network that undergoes shape transformations along with increases in size. During GSoC 2022, we developed a roadmap for progress in this area, but were not able to develop full integration with our Deep Learning-based pre-trained model (DevoLearn). Ultimately, we aim to tie our D-GNN work into the group’s work on embryo networks, developmental connectomes, and embryo differentiation.

During the project period, you will be involved in three activities: 1) refining a means to segment raw data and incorporate it into the DevoGraph pipeline, 2) refining our method for deriving graph embeddings, using techniques from topological data analysis and complex network theory, and 3) more tightly integrating DevoGraph as a network structure discovery module of DevoLearn. Achieving 1) will require refactoring CNN models and understanding biological training datasets. Activities 2) and 3) require the ability to work with mathematical models and associated algorithms. Knowledge of graph and/or network theory is helpful, but not required.
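As a rough illustration of activity 2), the sketch below builds a graph from cell positions and derives a simple spectral embedding. It is a minimal sketch under stated assumptions, not DevoGraph's actual pipeline: it assumes segmentation has already produced one centroid per cell, uses random coordinates as stand-in data, and the function names `knn_graph` and `laplacian_embedding` are hypothetical. Laplacian eigenmaps is used here only as one standard embedding choice.

```python
import numpy as np


def knn_graph(centroids, k=3):
    """Symmetric k-nearest-neighbour adjacency matrix from (n, d) centroids."""
    # Pairwise Euclidean distances between all centroids.
    diffs = centroids[:, None, :] - centroids[None, :, :]
    dist = np.linalg.norm(diffs, axis=-1)
    # Exclude self-matches by making the diagonal infinitely far away.
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :k]

    n = len(centroids)
    adj = np.zeros((n, n), dtype=float)
    rows = np.repeat(np.arange(n), k)
    adj[rows, neighbours.ravel()] = 1.0
    # Keep an edge if either endpoint selected the other.
    return np.maximum(adj, adj.T)


def laplacian_embedding(adj, dim=2):
    """Embed nodes using low-frequency eigenvectors of the graph Laplacian."""
    degree = adj.sum(axis=1)
    laplacian = np.diag(degree) - adj
    # eigh returns eigenvalues in ascending order for symmetric matrices.
    _, eigvecs = np.linalg.eigh(laplacian)
    # Skip the first eigenvector, which is constant when the graph is
    # connected (an assumption of this sketch).
    return eigvecs[:, 1:dim + 1]


# Hypothetical stand-in data: 12 cell centroids in 3D.
rng = np.random.default_rng(0)
centroids = rng.random((12, 3))

adj = knn_graph(centroids, k=3)
emb = laplacian_embedding(adj, dim=2)
print(adj.shape, emb.shape)
```

In practice the centroids would come from segmented microscopy frames, and a learned GNN embedding could replace the spectral step; the point here is only the shape of the data flow from positions to graph to embedding.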

What can I do before GSoC?

You can ask one of the mentors to direct you to the data source so that you can start working with it. Please feel free to join the OpenWorm Slack or attend our meetings to raise questions or discuss your approach to the problem.

OpenWorm Foundation: https://openworm.org/
DevoWorm website: https://devoworm.weebly.com/
DevoGraph (GitHub): DevoLearn/DevoGraph
DevoWorm AI: DevoWorm.AI

Skill level: Advanced

Required skills: All of our existing models are built for PyTorch, so experience with Python and PyTorch/TensorFlow workflows is preferred. The ability to work with datasets, such as segmenting video and generating graph visualizations, is essential. Experience with building web interfaces and UI design, along with basic knowledge of biology, open-source practices, and applied mathematical tools, will also be useful.

Time commitment: Full-time (350 hours)

Lead mentor: Bradly Alicea (bradly.alicea@outlook.com)

Project website: https://devoworm.weebly.com/

Backup mentors: Jiahang Li (lspongebobjh@gmail.com)

Tech keywords: GNNs, Computational Biology, Graph Theory, PyTorch


I am Nanthakumar, currently pursuing computer science at NIT Trichy. I am well experienced with Python, LLMs, and TensorFlow. I previously built a classification model with TensorFlow: https://github.com/alien-cyber/Butterfly-classification/blob/main/models/butterfly_classification_Googlecolab.ipynb
I am familiar with PyTorch too. Looking forward to working on this project; until then, I will look at the codebase.

Hi, I am Jyothi Swaroop. I am interested in this project. I have already made some contributions to DevoLearn and DevoWorm, and also to last year's GSoC project, Cell SAM, where I added multi-GPU support to the SAM model. I am eagerly looking forward to working on this project this year.

Hi, I am HarshitSharmaV from the Vellore Institute of Technology. I have built a project that started from a graph neural network and simplified it to fit our requirements. I am going to apply to this project under the name “HarshitSharmaVITVellore”. Paper followed: https://www.sciencedirect.com/science/article/pii/S0097849322000206 Notebook: Buoyant-Ascension/Another_copy_of_usingData.ipynb at main, in the harshitsharma-dev/Buoyant-Ascension repository on GitHub.

I am Chirag Sindhwani, a second-year Electrical Engineering student at IIT (BHU), Varanasi. I am passionate about machine learning and deep learning for advancing scientific research. Currently, I am working on enhancing NequIP models (E(3) Graph Neural Networks) by integrating physical parameters into atomistic simulations. I am collaborating with a professor from Texas Tech University and will soon publish a paper on this work.

Referred by the INCF team, I am eager to start working on Developmental GNNs and contribute to your research. Could you guide me on how to begin?
Best regards,
Chirag Sindhwani

Hello everyone,

My name is Shaik Abdus Sattar, and I am a Computer Science undergraduate deeply focused on Machine Learning, Deep Learning, and mathematically grounded modeling. I am very interested in contributing to the Graph Neural Networks (OpenWorm Foundation) project, particularly the integration of DevoGraph with DevoLearn and the refinement of developmental graph embeddings.

My recent work has been centered around applied ML research in scientific domains, especially healthcare and structured data modeling. I have:

  • Built and trained CNN and hybrid deep learning models for real-world scientific datasets
  • Designed custom loss functions and structured evaluation pipelines
  • Implemented optimization algorithms (gradient descent variants, heuristic search, parallelized methods) from scratch
  • Worked extensively with PyTorch-based workflows
  • Developed end-to-end ML pipelines from preprocessing to model validation

This project strongly aligns with my interests in:

  1. Refactoring CNN-based segmentation pipelines – I have experience handling structured and high-dimensional data and would be excited to explore segmentation workflows for embryogenetic datasets.
  2. Graph embeddings & mathematical modeling – I am particularly interested in combining GNNs with topological data analysis and complex network theory for embedding developmental graphs.
  3. Architecture integration – The integration of DevoGraph as a network discovery module within DevoLearn is especially compelling to me from a systems-design and representation learning perspective.

I am comfortable with:

  • Python and PyTorch
  • GNN fundamentals (message passing, spectral methods, attention-based GNNs)
  • Algorithmic implementation and mathematical modeling
  • Structured experimentation and reproducible research workflows
  • Open-source collaboration

Before GSoC, I would be very happy to:

  • Review the DevoGraph repository and understand its current embedding pipeline
  • Explore the embryogenetic dataset structure and prototype segmentation refinements
  • Experiment with alternative graph embedding strategies (e.g., Laplacian-based embeddings, contrastive graph learning, persistent homology-inspired features)
  • Draft a structured integration plan between DevoGraph and DevoLearn

I am highly motivated to contribute at a full-time (350h) level and align my proposal closely with current roadmap priorities.

You can find more about my work here:
GitHub: Seventie (Shaik Abdus Sattar)
LinkedIn: https://linkedin.com/in/seventie

I would greatly appreciate any guidance on where to begin in the repository and which datasets would be most appropriate to start experimenting with. Looking forward to contributing and discussing ideas with the community.

Thank you!