GSoC 2022 Project Idea 10.5: Improved parser for model descriptions (175 h)

malin · February 3, 2022, 10:57am

Computational research in biology commonly consists of describing a model system and its parameters, simulating the system with specialized software, and then analyzing the results. Model descriptions often make use of a domain-specific language that is parsed and interpreted by the simulation software. Such languages have the advantage that they are more accessible to researchers than code written in a general-purpose programming language; they also make it easier to discuss and share models with other researchers.

Parsing such languages is mostly straightforward with standard techniques, but these techniques often have at least two shortcomings: 1) for syntactically incorrect descriptions, e.g. a missing parenthesis, error messages are typically unhelpful and of the form “unexpected symbol at position x”, and 2) annotations given as comments are usually ignored instead of being used to enrich the model description.
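To illustrate the first shortcoming, here is a minimal sketch of a check that reports a missing parenthesis with a description and a marker under the offending character, rather than a bare position. It is not Brian's parsing code; the function names `check_parentheses` and `format_error` are hypothetical and only parentheses are handled.

```python
def format_error(expression, pos, message):
    """Build a three-line message: description, the expression, a caret marker."""
    return f"{message}\n  {expression}\n  {' ' * pos}^"


def check_parentheses(expression):
    """Return None if parentheses are balanced, otherwise a readable message."""
    open_positions = []  # positions of '(' that have not been closed yet
    for pos, char in enumerate(expression):
        if char == "(":
            open_positions.append(pos)
        elif char == ")":
            if not open_positions:
                return format_error(
                    expression, pos, "closing parenthesis has no matching '('"
                )
            open_positions.pop()
    if open_positions:
        # Point at the innermost parenthesis that is still open.
        return format_error(
            expression, open_positions[-1], "opening parenthesis is never closed"
        )
    return None


print(check_parentheses("(I - v/tau"))
```

For the input `(I - v/tau`, the message names the problem and places the caret under the unclosed opening parenthesis, which is usually more useful than the position where the parser happened to give up.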

Both of these shortcomings are currently present in the Brian simulator, an open-source simulator for biological spiking neural networks written in Python, developed in our research group and used by researchers worldwide. The Brian simulator describes models with a domain-specific language that uses mathematical notation with additional annotations, e.g. to ensure the consistency of physical dimensions.

The goal of this internship is to rewrite the Brian simulator’s parsing code so that it gives clear and helpful error messages for incorrect model descriptions and treats comments in the model descriptions as annotations that are stored for later use.
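As a rough illustration of the second goal, the sketch below keeps the comment of an equation line as an annotation instead of discarding it. It assumes a simplified line format of `definition : unit  # comment` and is not the simulator's actual parser; the `EquationLine` class and `parse_line` function are hypothetical names, and details such as flags or multi-line equations are left out.

```python
from dataclasses import dataclass


@dataclass
class EquationLine:
    definition: str  # text before the ':', e.g. "dv/dt = (I - v)/tau"
    unit: str        # text between ':' and the comment, e.g. "volt"
    annotation: str  # comment text without the '#'; empty if there is none


def parse_line(line):
    """Split one equation line into definition, unit, and stored comment."""
    code, _, comment = line.partition("#")
    definition, separator, unit = code.partition(":")
    if not separator:
        raise ValueError(
            f"missing ': <unit>' in equation line: {line.strip()!r}"
        )
    return EquationLine(definition.strip(), unit.strip(), comment.strip())


parsed = parse_line("dv/dt = (I - v)/tau : volt  # membrane potential")
print(parsed.annotation)
```

Storing the comment next to the parsed equation means it can later be shown in generated documentation or model exports, which is the kind of reuse the project has in mind.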

Planned effort: 175h

Skills: Python programming, parser/compiler techniques

Skill level: intermediate

Mentors: Marcel Stimberg @mstimberg, Dan Goodman @d.goodman

Tech keywords: Brian, Python, parser

KyraZzz · March 9, 2022, 8:44am

Hi @mstimberg and @d.goodman ! I am interested in contributing to this project. Could you give me some guidance on where to start? Thanks in advance!

mstimberg · March 11, 2022, 11:11am

Hi @KyraZzz . Thank you for your interest in our project. As a first step, I’d get familiar with the Brian simulator itself (https://briansimulator.org for general information and links to the documentation, etc.), and in particular with its system for equations (see e.g. the documentation and the model description section in our 2014 paper). I’ll try to get back to you with some more suggestions soon.

mstimberg · March 25, 2022, 9:29pm

We have now posted some general recommendations for GSoC applications on our website. Please have a look: Recommendations for GSoC 2022 applications | The Brian spiking neural network simulator

Regarding this project more specifically, please include in the application a detailed description of the current equation parsing mechanism. The project goals mention error messages: please come up with a number of “realistic” errors a user could make when writing equations, and document the error messages Brian gives in these situations. Ideally, what kind of error message would the user prefer? In more general terms, what kind of parsing techniques do you know, and what are their respective advantages and disadvantages?