GSoC Project Idea 9: Running FindSim experiments on cloud servers

Various XML-based formats, such as SBML and NeuroML, make developing a neuronal signaling model both user- and machine-friendly. The most effective way of validating a model is to compare the readouts of a simulation with those of an actual experiment, to see how closely the simulation output fits the real experimental readout. The Framework for Integrating Neuronal Data and Signaling Models (FindSim) is a tool that enables systematic validation and optimization of a neuronal signaling model by anchoring the model to actual experimental datasets.

We are developing a web-based tool that allows users to run the FindSim pipeline in a user-friendly way. Running a simulation experiment can be a computationally intensive process, depending on the size of the model and the parameters involved; the computation time can range from a few seconds to a few hours. Currently, the web server can run small models with acceptable efficiency and run time, but large models take a long time. Our plan is to employ a high-performance cloud server to run the simulation jobs (possibly Amazon Web Services, NSG, or another provider) and to use the web server to store the data and serve the web tool, giving an optimal distribution of the workload and smooth functioning of the web tool. The aim of this GSoC project is to set up the data exchange between the web server and the cloud computing server.

Things to be done:

  • Setting up a computation server (using Docker or a similar technology) and installing MOOSE and FindSim on it
  • Implementing a RESTful API for communication between the web server and the computation server
  • Sending experiment run requests to the computation server and receiving the output on the web server
  • Implementing JavaScript for visualizing the results of the simulation in real time
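To make the steps above concrete, here is a minimal sketch of how the web server might hand a job to the computation server over the RESTful API, using only the Python standard library. The endpoint URL and the JSON schema (the `job_id`, `experiment`, `model`, and `status` fields) are illustrative assumptions, not part of any existing FindSim code.

```python
import json
import urllib.request

# Hypothetical base URL of the computation server's API.
COMPUTE_URL = "http://compute.example.org/api"

def build_job_request(experiment_file, model_file, job_id):
    """Serialize a FindSim run request as JSON (assumed schema)."""
    return json.dumps({
        "job_id": job_id,
        "experiment": experiment_file,  # FindSim experiment definition
        "model": model_file,            # signaling model to simulate
        "status": "queued",
    })

def submit_job(payload):
    """POST the job to the computation server's /jobs endpoint.

    The web server would call this, then poll (or be called back)
    for the result once the simulation finishes.
    """
    req = urllib.request.Request(
        COMPUTE_URL + "/jobs",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    return urllib.request.urlopen(req)
```

A job-queue design like this keeps the API overhead small: the web server only stores and forwards the job description, while the heavy MOOSE/FindSim computation runs entirely on the cloud server.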

Required Skill Set

  • Python, Network/Web programming
  • Familiarity with network streaming, serialization (XML, JSON), and RESTful APIs
  • PHP and C++ are a plus

Mentors: Surbhit Wagle ( surbhitw@instem.res.in ), Upinder Bhalla ( bhalla@ncbs.res.in )

For more details about FindSim, please refer to the following links:

Hi !
I’m Harendra Singh, and I’m interested in taking up this project under GSoC.

I have been coding primarily in Python and am very comfortable with it. I have experience with both Django and Flask.
I also do some competitive programming, so I am quite familiar with C++, including how its built-in functions are implemented and their time complexities. So I would say I have a good knowledge of C++ too.
I learnt PHP in college; I haven't used it much, but I can get work done with some help from Stack Overflow.

I’ve also used AWS for several of my own projects and have learnt about DevOps tools like Docker and Kubernetes through online courses.

So overall I feel I fit the project well and I would be able to complete it!

I’ll look into what would be the best way to host this server (i.e. which services of AWS would be most apt to use).

I have a few questions regarding the project though :

  1. What is the frequency of API calls expected?
  2. What kind of visualisation of the results is expected?
    What exactly will the result be?
    (I will learn about MOOSE and FindSim and try them out myself too; maybe that would answer this question.)
  3. What is the tech stack for the webserver?

Any other info you may want to provide?

Hi Harendra.

Thank you for your interest in the project. Can you send me a copy of your CV and a link to your GitHub account (at surbhitw@instem.res.in)? I would like to know a little more about your skill set and the projects you have worked on. The next step is to prepare a detailed proposal on how you plan to do the project. Please refer to the GSoC guidelines for writing the proposal (see: https://google.github.io/gsocguides/student/writing-a-proposal). Feel free to contact me or comment on this thread for any help you might need drafting your proposal.

Also, here are the answers to your questions:

  1. Initially, the frequency of API calls is expected to be low (tens per hour), but we expect it to grow to a few hundred (thousands in the best case). The number of API calls itself would not be a problem; what we want is to be able to run long-running, processing-heavy jobs on the server with minimal overhead from the API (including the API code as well as the database, if any).
  2. We plan to use Mpld3 (http://mpld3.github.io/) to generate interactive graphs from the simulation results. This requires some basic knowledge of JavaScript and Python. Currently, a complete graph is generated all at once, but later we plan to plot graphs in real time, and we would like to use Mpld3 for that as well.
  3. The web server uses an AMP (Apache, MySQL, PHP) stack on CentOS 7.
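Since real-time plotting is on the roadmap, one simple transport for partial results is for the computation server to stream one JSON line per simulation sample, which the front end (e.g. an Mpld3/JavaScript plot) can consume incrementally. This is only a sketch: the field names (`t`, `conc`) and the sample values are made up for illustration and the actual FindSim output format may differ.

```python
import json

def stream_points(time_series):
    """Yield one JSON line per simulation sample (assumed format).

    The web server can forward these lines to the browser (e.g. via
    server-sent events or a WebSocket) so the plot updates while the
    simulation is still running, instead of waiting for the full graph.
    """
    for t, value in time_series:
        yield json.dumps({"t": t, "conc": value})

# Example with three made-up samples of a decaying readout.
lines = list(stream_points([(0.0, 1.0), (0.1, 0.98), (0.2, 0.95)]))
```

Streaming line-delimited JSON keeps the API overhead low and lets the browser append points to the existing figure rather than re-rendering it on every update.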

Please let me know if you have any other concerns.

Best,
Surbhit Wagle