GSoC 2020 project idea 14: Pre-trained models for Developmental Neuroscience

This project will center on building a pre-trained model for shapes and processes related to Developmental Biology and Neurobiology, extracted from image data. Our organization’s Machine Learning interest group (DevoWormML) has published a blog post [1] on the advantages of and need for pre-trained models in this area. In short, biological development is marked by characteristic shapes, movements, changes in shape, and temporal processes that define important features. Pre-trained models are already used in NLP and Deep Learning, for example for sequence modeling in language processing (GPT-2) and semantic segmentation of complex images (DeepLabv3). Models specialized for biology, however, do not exist. A suitable pre-trained model would greatly reduce the need for input data without sacrificing the ability to generalize to different contexts.

Our main interest is in extracting spatiotemporal features from image data. We will focus on microscopy data such as that found in the DevoZoo or from more specialized sources [2]. For a typical pre-trained model, the network is initialized with non-random weights that approximate generalized versions of the features we would like to discover. However, we are also interested in a semantic component, particularly the ability to incorporate elements such as meaning assigned to static knowledge (semantics) and multiple meanings for a single feature (polysemy). This will enable relational modeling and the mapping of segmented image data to lineage trees and taxonomies. Our model, tentatively called DevLearningv1, should be applicable to a wide range of neural network and deep learning techniques.
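
To make the idea of non-random starting weights concrete, here is a minimal sketch; it is not the project's method. It assumes a toy linear regression in NumPy rather than a deep network, and every name in it (make_task, fine_tune, pretrained_w, and so on) is hypothetical. Weights fitted on a large "source" task are reused as the starting point for a small, related "target" task and compared with a random start.

```python
import numpy as np

rng = np.random.default_rng(0)


def make_task(true_w, n_samples, noise=0.1):
    """Generate a synthetic regression task with known weights (toy data)."""
    X = rng.normal(size=(n_samples, true_w.size))
    y = X @ true_w + noise * rng.normal(size=n_samples)
    return X, y


def mse(w, X, y):
    """Mean squared error of the linear model w on (X, y)."""
    return float(np.mean((X @ w - y) ** 2))


def fine_tune(w, X, y, lr=0.05, steps=20):
    """Run a few gradient-descent steps starting from the given weights."""
    w = w.copy()
    for _ in range(steps):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w


# Assumption: the target task is a small perturbation of the source task.
source_w = rng.normal(size=8)
target_w = source_w + 0.1 * rng.normal(size=8)

# "Pre-training": fit weights on plentiful source data.
X_src, y_src = make_task(source_w, 2000)
pretrained_w, *_ = np.linalg.lstsq(X_src, y_src, rcond=None)

# The target task has only a handful of samples.
X_tgt, y_tgt = make_task(target_w, 20)
random_w = rng.normal(size=8)

# Compare the two starting points before any target-task training.
loss_pretrained_start = mse(pretrained_w, X_tgt, y_tgt)
loss_random_start = mse(random_w, X_tgt, y_tgt)

# Fine-tune both starts with the same small budget of steps.
loss_pretrained_tuned = mse(fine_tune(pretrained_w, X_tgt, y_tgt), X_tgt, y_tgt)
loss_random_tuned = mse(fine_tune(random_w, X_tgt, y_tgt), X_tgt, y_tgt)

print("start loss  (pre-trained, random):", loss_pretrained_start, loss_random_start)
print("tuned loss  (pre-trained, random):", loss_pretrained_tuned, loss_random_tuned)
```

In this toy setting the pre-trained start should already sit close to the target solution, which is the sense in which pre-training reduces the amount of target data needed. Image models follow the same pattern with much larger networks.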

As a student, you will become a contributor at the OpenWorm Foundation, where we are attempting to build a virtual organism. You will learn about developmental neurobiology, and join the DevoWorm group. OpenWorm has an active interest in data science, and DevoWorm in particular has an active interest in machine learning research and education. We seek someone with experience with programming languages C++ and Python, and a machine learning platform such as TensorFlow or Keras.

Mentors: Bradly Alicea (balicea@openworm.org) and Stephen Larson (stephen@openworm.org), OpenWorm Foundation (https://openworm.org).

NOTES

[1] Blogpost on pre-trained models: https://thenode.biologists.com/pre-trained-machine-learning-models-for-developmental-biology/uncategorized/

[2] Crawford-Young, S.J., Dittapongpitch, S., Gordon, R., and Harrington, K.I.S. (2018). Acquisition and reconstruction of 4D surfaces of axolotl embryos with the flipping stage robotic microscope. Biosystems, 173, 214-220. doi:10.1016/j.biosystems.2018.10.006.

Hello there,

I’m Mayukh Deb, an undergrad student at Amrita University, Kerala, India. I’m a deep learning enthusiast and have been busy making fun deep learning projects. I’ve also been working with image processing, data wrangling, augmentation, etc. I’m also a member of amFOSS, a student-run community of open source enthusiasts. I spend most of my time with PyTorch, NumPy and Pandas. Most of my projects are on image datasets from Kaggle.

Recently I’ve been working on a deep neural network that derives medical inferences from ECG data, which works pretty well already. Another of my recent projects uses deep learning to steer a car and keep it on the racetrack in a game purely from visual data.

I got really interested in this project because I’ve always been fascinated with application of deep learning in biology and medical sciences, and it intersects with my skillset pretty well.

I’m really looking forward to contributing to this organization and learning a ton of new things on the way.

It would be great if I get a few pointers and microtasks which would help me obtain a better understanding of this project.

Hi @Mayukhdeb,

Thanks for your interest.

There are 3 ways to get pointers/microtasks for the project.

Let me begin with the fastest way:

  1. Visit OpenWorm’s website linked above and you will find an option to join OpenWorm’s Slack channel. This is the fastest route because OpenWorm mentors are usually very active on Slack, so your queries will be answered quickly.
  2. Another way is to email the mentors. Their email IDs are listed above.
  3. Otherwise, you can tag them here and ask any specific questions that you have -> @b.alicea and @Stephen_Larson1

Hello Mayukh,

Sounds like you would be a good candidate for this project. This builds on projects from previous years, so please see our previous project presentations for some context. You might also be interested in the DevoWormML course, which was held last Fall and covers some of the topics our group is interested in.

As Arnab has suggested, please join the OpenWorm Slack, and join #devoworm and #devowormml. The blog post (mentioned in the project description) and DevoWormML materials should give you a better idea of what we are looking for. I can guide you with proposal development, so send me a draft version for feedback when you get to that stage. Good luck!

Thanks for the reply!

I’ve already read the blog post given in the project description, and I’m planning to go through the previous project presentations and the DevoWormML course today.

Apart from this, since yesterday I’ve also been setting up an image augmentation pipeline and experimenting with different augmentation techniques on images of blood cells.

See you in Slack!
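
For readers unfamiliar with image augmentation, the following is a minimal sketch of two common techniques, flips/rotations and random crops. It is not the pipeline described in the post above: it assumes plain NumPy arrays as images, uses a random array in place of real microscopy or blood-cell data, and the function names dihedral_augment and random_crop are hypothetical.

```python
import numpy as np


def dihedral_augment(image):
    """Return the 8 rotation/flip variants of an H x W (x C) image array."""
    variants = []
    for k in range(4):
        # Rotate by k * 90 degrees in the image plane.
        rotated = np.rot90(image, k=k, axes=(0, 1))
        variants.append(rotated)
        # Mirror each rotation left-to-right.
        variants.append(np.flip(rotated, axis=1))
    return variants


def random_crop(image, size, rng):
    """Cut a random size x size window out of the image."""
    h, w = image.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return image[top:top + size, left:left + size]


rng = np.random.default_rng(0)
image = rng.random((64, 64, 3))  # placeholder for a real image

augmented = dihedral_augment(image)
crop = random_crop(image, 48, rng)

print(len(augmented), crop.shape)
```

Flips and 90-degree rotations suit microscopy-style images where orientation carries no meaning; whether that holds for a given dataset is an assumption to check.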

Hi @Mayukhdeb. Once you’ve gone through all the links, you can start writing the proposal with the solutions and approaches that you can think of. We can give our feedback on it.

If you are finding it difficult to understand any biological terminology, you can look at WormBook. You can learn about the C. elegans worm (its developmental stages, its adult stage, etc.). This way you can gain a better understanding of the data we have.

@nvinayvarma189 thanks for the reply!

Since I don’t have access to your dataset for now, I’ll use a dummy dataset containing images of blood cells to produce the examples in the proposal.

Will work on these and keep you guys updated for sure.

Hi everyone, I’m Adarsh Kumar, an undergrad Computer Science student at IIIT-Delhi, India. I’m a Deep Learning enthusiast and have experience with several Machine Learning/Deep Learning projects. I’ve completed the five-course Deep Learning Specialization on Coursera and Andrew Ng’s Machine Learning course on Coursera as well, and I have a good hold on DL/ML fundamentals as well as some of the latest advancements in the field.

I have good experience working with object detection (YOLO) and several CNN architectures such as AlexNet, VGG-16, ResNet-50 and Inception networks. I’ve also worked on PCA, SVMs, RNNs, LSTMs, and sequence-to-sequence architectures such as attention models. I’m fluent in Python and Java, and in frameworks like TensorFlow and Keras (I’ve used one or the other for most of my projects). I love solving challenging problems and learning new things. I’m a hard-working and dedicated guy. Over the last couple of days I’ve been doing the following:

  1. I’ve gone through the previous years’ GSoC projects recommended above (I just realised that the GSoC 2017 project was done by my senior, Siddharth).
  2. I’ve also read the blog post provided in the Notes.
  3. I have skimmed over the DevoZoo dataset and now have a good feel for what the data will look like.
  4. I’ve also watched the DevoWormML course (the session on pre-trained models), which gave me even more clarity, and after all that I now have a good enough idea of the project.

I have always wanted to work at the intersection of Biology and Deep Learning, and this project provides me with that opportunity. I’d love to contribute to the project and learn from others here.
Any pointers on what I should do next?

The next step is to write a draft proposal. I can help you through the process if it is helpful. Base your proposal on the above project announcement. Please include a task timeline for how you will execute the project. Look forward to hearing from you soon.

Thanks for the reply @b.alicea

Sure I’ll start with my proposal and will keep you informed.
Thanks

Hi, this is Yamini, a third-year Computer Science student from BITS Pilani, Goa. I have gone through the parts mentioned above. The dataset has microscopy videos and a few images of embryos. My question is: will we feed images or videos into the model?

Greetings @b.alicea @arnab1896
I am Satyam Sharma, an AI engineer/researcher specializing in deep learning and natural language processing, and a data scientist specializing in data analytics and machine learning.

I have worked on 10+ real-world projects in different areas of deep learning, including image segmentation, image augmentation, object detection, multi-class image classification, multi-label image classification, OCR, etc., across domains such as medicine, education, finance, and manufacturing.

I have also built very large, scalable systems powered by deep learning while maintaining efficiency, performance, and accuracy.

My experience includes:

  • Frameworks like TensorFlow, PyTorch, Keras
  • Pre-trained models and state-of-the-art models
  • Building custom models
  • Implementing models from research papers
  • Building pipelines
  • Communicating with and leading internal teams
  • Collecting, cleaning, analyzing and visualizing data

With in-depth knowledge and a varied skill set to offer, I would love to be a part of the OpenWorm team and help it reach new heights.

  • Devozoo Datasets: completed reviewing
  • DevoWormML course: Had a glance over it

I am not able to join the Slack channel. Could you please help?
Any pointers on what I should do next? @b.alicea @arnab1896

Thanks for patiently reading such a long post,
Satyam Sharma.

Hi @satyamsharma. You first need to fill in an invitation form, which you can find at https://openworm.org/. You’ll receive an invitation by email; after accepting it, you can join the channels.
Hope it helps.

Hello,
I am Amrendra Pratap Singh from the Indian Institute of Technology BHU (IIT BHU), Varanasi, India. I am a deep learning enthusiast and want to work on this project.
I have gone through the details and would like to contribute.

I have one question: can we choose any dataset and model we like, or will we be provided with a dataset?

Great! First step: join our Slack channel. Next step: join #devoworm and #devowormml. Third step: write a proposal for the project in question. The proposal should consist of: a problem statement, a unique solution to the problem (something you can address over the course of a summer), and a timetable for completion. I can review if you ask me at least several days before the deadline (March 31). See you in Slack!

You will be provided with data: check out the DevoZoo for more information.

Here is the link to the Slack channel. Then, join #devoworm and #devowormml. As for the proposal, it should consist of: a problem statement, a unique solution to the problem (something you can address over the course of a summer), and a timetable for completion. I can review if you ask me at least several days before the deadline (March 31). See you in Slack!

Thanks @Adarsh_Kumar @b.alicea