GSoC 2026 Project #1: Brian Simulator - Serialization/deserialization for Brian sim models, results, and input data (175h/350h)

Brian is a clock-driven spiking neural network simulator that is easy to learn, highly flexible, and simple to extend. Written in Python, it allows users to describe and run arbitrary neural and synaptic models without needing to write code in any other programming language. It is built on a code-generation framework that transforms model descriptions into efficient low-level code.

Currently, the Brian simulator lacks advanced support for complex simulation pipelines that require storing or restoring the state of simulations (e.g., checkpointing) or saving a complete network architecture to a file. Such pipelines are particularly important for studies involving machine learning-like workflows, such as running a network on multiple stimuli for training and testing. In recent years, the Brian simulator has introduced several features to support these approaches, but they come with limitations and do not cover the full range of potential use cases.

The aim of this project is to improve and consolidate the Brian simulator’s tools for model and state (de)serialization. Specifically, the goals of this project are to:

  • Extend the current store/restore mechanism to work with Brian’s C++ standalone mode and Brian2CUDA.
  • Refactor the basicexporter from the brian2tools package so that it can be used to serialize a network architecture.
  • Create a corresponding basicimporter to reconstruct a network from such a serialization.
  • For the larger 350h project, additional goals include:
    • Implement annotation and metadata for models
    • Investigate data formats for neural simulation/recording data (e.g. NWB or NEO) and provide export tools
    • Investigate data formats for input data (e.g. AEDAT) and provide import tools
    • Identify potential connection points to software from neuromorphic computing/ML and provide tools to facilitate interoperability
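To make the exporter/importer goal concrete, here is a minimal sketch of what a round-trippable network description could look like. The dictionary layout and field names below are invented for illustration; they are not brian2tools' actual format.

```python
import json

# Hypothetical, simplified network description -- the real
# basicexporter/basicimporter would define its own schema.
description = {
    "neurongroups": [
        {
            "name": "P",
            "N": 4000,
            "equations": "dv/dt = (ge + gi - (v - El)) / taum : volt",
            "initializers": {"v": "Vr + rand() * (Vt - Vr)"},
        }
    ],
    "synapses": [
        {"name": "Ce", "source": "P", "target": "P",
         "on_pre": "ge += we", "connect_p": 0.02}
    ],
}

# Serialize to JSON and reconstruct, to check that the description
# round-trips without losing information.
text = json.dumps(description, indent=2)
restored = json.loads(text)
assert restored == description
print(restored["neurongroups"][0]["name"])  # -> P
```

Any real importer would additionally need to resolve object dependencies (e.g., synapses referring to their source and target groups by name) before rebuilding the network.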

Skill level: Intermediate

Required skills: Python, C++

Time commitment: Part-time (175h) or full-time (350h)

Lead mentor: Benjamin Evans (B.D.Evans@sussex.ac.uk)

Project website: GitHub - brian-team/brian2tools (tools to use with Brian 2, in particular for visualization) and GitHub - brian-team/brian2 (Brian is a free, open source simulator for spiking neural networks)

Backup mentors: Dan Goodman (d.goodman@imperial.ac.uk; d.goodman on NeuroStars), Marcel Stimberg (marcel.stimberg@sorbonne-universite.fr; mstimberg on NeuroStars)

Tech keywords: Python, C++, serialization, data formats


Hi @arnab1896 and @BenjaminEvans,

I am writing to express my strong interest in the Brian Simulator Serialization project. As a Python and C++ developer with experience in high-performance monitoring systems, I find the challenge of implementing robust checkpointing and cross-platform state restoration particularly exciting.

I have spent some time analyzing the brian2tools repository, specifically the basicexporter implementation. I can see that while the current infrastructure handles basic architectural exports, scaling this to C++ standalone mode and Brian2CUDA will require a more sophisticated handling of memory layouts and state consistency.

A few points I’m currently looking into:

  • How to bridge the gap between Python-side metadata and the low-level C++ state during runtime.
  • Investigating the integration of NWB (Neurodata Without Borders) or HDF5 to ensure the serialized output is interoperable with the broader neuroinformatics ecosystem.
  • Extending the store/restore mechanism to ensure it is thread-safe for large-scale simulations.

I am eager to contribute to Brian’s mission of making neural simulations more reproducible and flexible. I’ll be sharing a more detailed architectural plan soon, but in the meantime, I would love to hear your thoughts on the priority order of the data formats mentioned (NWB vs. NEO).

Looking forward to a productive discussion!

Best regards,
ARYBHATT

Hi @Ary_Bhatt. Thank you for your interest in our project. I will post some more general recommendations for an application and what to do to prepare for this project below. Regarding your question about data formats: I wouldn’t worry about that too much for now, since the work to implement support for either of them is probably quite similar. It would be good to look a bit into the different data formats and see what their differences and similarities are, and whether this would affect how well they fit "Brian’s way of doing things".

Hi everyone. Here’s a general write-up about the application process and what we would like to see in your application: GSoC 2026 | The Brian spiking neural network simulator

For the serialization/deserialization project specifically, I’d like you to include two more things:

  1. A brief description of how the existing store/restore mechanism could be integrated into the C++ standalone mode, in particular whether/where it needs changes in the dynamically generated C++ code, in the C++ code templates, in the Python code, … (or in several/all of them).
  2. Consider the CUBA example from the documentation (Example: CUBA — Brian 2 2.10.1 documentation) and how it gets exported with the baseexporter. Discuss what changes if the initial values for P.v are not given as 'Vr + rand() * (Vt - Vr)' (as in the current example), but as P.v = Vr + np.random.rand(len(P)) * (Vt - Vr), and how the baseexporter could be extended to better deal with this situation.
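To make the distinction in point 2 concrete, here is a hedged sketch of how an exporter might record the two kinds of initial values differently: string expressions can be stored symbolically, while concrete arrays must be stored value-by-value. The `record_initializer` helper and the record format are made up for illustration and are not part of brian2tools.

```python
import numpy as np

def record_initializer(value):
    """Classify an initial value for export: string expressions are
    stored symbolically, arrays as concrete values.
    (Hypothetical helper, not part of brian2tools.)"""
    if isinstance(value, str):
        return {"kind": "expression", "value": value}
    arr = np.asarray(value)
    return {"kind": "array", "dtype": str(arr.dtype), "values": arr.tolist()}

Vr, Vt = -60.0, -50.0
N = 5

# Symbolic form: exports as a human-readable expression string.
sym = record_initializer('Vr + rand() * (Vt - Vr)')

# Concrete form: the values themselves have to be serialized.
conc = record_initializer(Vr + np.random.rand(N) * (Vt - Vr))

print(sym["kind"], conc["kind"])  # -> expression array
```

The "array" branch is where the interesting design questions live: whether to embed the values in the export, reference an external binary file, or (where detectable) recover a distribution-based description.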

Obviously, please do not hesitate to ask in case anything is unclear.

I forgot to mention: regarding the baseexporter, it might be interesting for you to have a look at the blog of the GSoC student (@VigneswaranC) who implemented that feature back in 2020 :blush: Vigneswaran's GSoC-2020 – Diary of my summer GSoC project with Brian team


Hi @mstimberg,
I am Srita, a junior undergraduate in Electrical Engineering, and I am really interested in this project.

From what I saw, the existing store/restore mechanism seems to work for both the C++ standalone and Python runtime options. However, it does not seem to work when Poisson processes or other sources of randomness are involved in the example. This issue appears to come from get_random_state and set_random_state not being implemented for the C++ standalone case.

So implementing these in cpp_standalone/device.py should fix the problem. For get_random_state, we can probably just read the file contents directly since the random generator states are already saved in the output_directory as random_state_generators.

Please let me know if this makes sense or if I am missing something.

Hi Marcel, I went through Vigneswaran’s GSoC project and the evolution of the baseexporter, especially how it builds a structured representation of models for export.

While looking at the CUBA example, I noticed that baseexporter seems to work well with scalar expressions like Vr + rand() * (Vt - Vr), but NumPy-style vectorized initialization (e.g., np.random.rand(len(P))) doesn’t map directly to the exported representation or C++ code.

It seems like the current design assumes expressions rather than array-based initialization. Would it make more sense to handle such cases by generating explicit loops in C++ during export, or by extending the exporter abstraction to represent vectorized initialization more explicitly?

I’m planning to explore this further as part of my proposal and contributions.

Thanks for the guidance!

Hi @mstimberg,

I am excited to apply for the Brian Simulator Serialization project for GSoC. With experience in Python and C++, I’m particularly interested in improving Brian’s ability to save, restore, and share simulation states. While exploring the brian2tools repository, especially the basicexporter, I noticed that Brian currently struggles with fully exporting network architectures and restoring states across different backends like C++ standalone and Brian2CUDA.

Some key challenges I identified include:

  • connecting Python-side metadata with the C++/CUDA runtime state,

  • choosing the right data formats (NWB, NEO, HDF5) for broader compatibility, and

  • making the store/restore mechanism safe and reliable for larger simulations.

I’m looking forward to helping solve these issues by extending the exporter, creating a matching importer, and improving checkpointing so Brian becomes more reproducible and easier to integrate with other neuroscience tools. I would also love to hear your thoughts on which data format should be prioritized first.

Hi @mstimberg,

I am Pinky Kumari, a 2nd-year B.Tech CSE student.

I’ve been exploring Brian’s exporter and the challenges around serialization, especially with vectorized initialization and cross-backend state restoration.

For my proposal, I’m considering introducing a Unified Intermediate Representation (UIR) to decouple model definition from backend execution (Python, C++ standalone, Brian2CUDA). For cases like NumPy-based initialization, instead of only translating expressions or generating loops, I’m thinking of representing them in an abstract form (e.g., distribution-based metadata), allowing each backend to handle execution efficiently.

I’m also interested in addressing:

  • consistent state mapping across backends,

  • using HDF5 for scalable serialization,

  • and implementing a reliable (possibly incremental) checkpointing system.
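As a rough illustration of the incremental-checkpointing idea, one could save only the state arrays whose contents changed since the last checkpoint, detected via content hashes. The function name and state layout below are hypothetical, not Brian's API.

```python
import hashlib
import numpy as np

def incremental_checkpoint(state, last_hashes):
    """Return only the arrays that changed since the last checkpoint,
    plus the updated hash table. (Illustrative sketch only.)"""
    changed, hashes = {}, {}
    for name, arr in state.items():
        h = hashlib.sha256(np.ascontiguousarray(arr).tobytes()).hexdigest()
        hashes[name] = h
        if last_hashes.get(name) != h:
            changed[name] = arr.copy()  # snapshot the changed array
    return changed, hashes

state = {"v": np.zeros(4), "w": np.ones(4)}
delta1, hashes = incremental_checkpoint(state, {})   # first call: all arrays
state["v"] += 1.0                                    # only v changes afterwards
delta2, hashes = incremental_checkpoint(state, hashes)
print(sorted(delta2))  # -> ['v']
```

Whether this pays off in practice depends on how much of the state actually stays constant between checkpoints; for dense plasticity models, most variables may change every step.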

Would this kind of abstraction align with Brian’s design direction?

Thank you!

Hi @mstimberg and @arnab1896

I’m very interested in the Brian Simulator Serialization/Deserialization project for GSoC 2026.

I’m currently working as an Associate Data Scientist at Sigmoid (1+ year experience), where I’ve built ML pipelines involving structured data flow, checkpointing, and reproducibility (e.g., RAG systems with embeddings and FAISS). This makes the problem of saving/restoring simulation state particularly relevant to my work. I primarily work in Python and also have experience with C++.

I’ve started exploring the brian2 and brian2tools repositories, along with the CUBA example. From my understanding, extending store/restore to C++ standalone mode would likely require coordination across Python-level definitions, code generation templates, and runtime C++ state handling.

For the CUBA example, I noticed that initialization like:

P.v = 'Vr + rand() * (Vt - Vr)'

fits well with the current exporter since it is symbolic. However, if we instead use:

P.v = Vr + np.random.rand(len(P)) * (Vt - Vr)

this introduces explicit array-based initialization that is not directly representable.

I’m currently exploring two approaches:

  • generating explicit loops in C++ during export to handle per-neuron initialization

  • extending the exporter abstraction to represent vectorized initialization more explicitly

I’m also looking into how Python-side state maps to generated C++ structures and evaluating formats like HDF5/NWB/NEO for storing both architecture and dynamic state.

One question I had:

For standalone mode, would you prefer a template-level extension (injecting serialization into generated C++ code), or a more backend-agnostic serialization interface defined at the Python level?

I’ll share a more detailed architecture and implementation approach in my proposal.

Looking forward to your guidance!

Best,

Anurag Mishra

Just a quick note about that: an initialization with a string expression such as neurons.V = 'Vr + rand() * (Vt - Vr)' is expressed as a scalar expression per neuron, but over all neurons it is vectorized (implemented as array operations for Python code generation with numpy, or as a C++ loop with C++ code generation). It gives the same result as neurons.V = Vr + np.random.rand(len(neurons)) * (Vt - Vr). The first variant of course has a few advantages, in particular for serializing the state (we discuss this a bit in our 2014 paper: Frontiers | Equation-oriented specification of neural models for simulations). But both ways have to be supported by our exporter; there are valid use cases for expressing initial values as an array of values rather than as an expression!
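This equivalence can be illustrated with plain NumPy: drawing one uniform number per neuron in a loop consumes the same random stream as a single vectorized call, so with the same seed both initializations produce identical values. A small sanity check (the Vr/Vt values here are chosen arbitrarily):

```python
import numpy as np

Vr, Vt, N = -60.0, -50.0, 5

# "Scalar expression per neuron": one draw per neuron, as a loop.
rng = np.random.default_rng(42)
v_scalar = np.array([Vr + rng.random() * (Vt - Vr) for _ in range(N)])

# Vectorized array form: one call drawing all N values at once.
rng = np.random.default_rng(42)
v_array = Vr + rng.random(N) * (Vt - Vr)

# Both consume the generator's stream identically.
assert np.allclose(v_scalar, v_array)
print(v_scalar.shape)  # -> (5,)
```

For serialization, the relevant difference is not the result but the representation: the string form can be stored and re-evaluated, while the array form forces the exporter to store the concrete values (or the RNG state used to produce them).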

This goes beyond the scope of this project, I’d say – what we call "abstract code" is already an intermediate representation. There are different ways of doing things like this (indeed, one could have special objects to represent random numbers for example), but we don’t consider changing the fundamental approach that Brian uses for this project.

I’d be happy with abstracting things to handle them consistently across backends, but please have a detailed look at the current capabilities of brian2 and brian2tools, and how they’d fit into your proposed abstraction (while keeping backwards-compatibility where possible).

See my reply about this point above. I agree that the symbolic initialization is preferable for an exporter (in particular if the goal is a human-readable description like in the markdown exporter), but both types have to be supported. Initial values for variables may also represent things like input stimuli, which might be loaded from disk as concrete values instead of being generated via a mathematical expression.

There are already bits and pieces in the C++ code that serialize model state to disk, and there are also ways to load variable initializations from disk at the start of a run. The main work would be about putting everything together, and most of the work would be on the Python side.

Hello @arnab1896 and @mstimberg

I’ve been following the discussion regarding the Intermediate Representation. I appreciate the clarification that the focus should be on practical integration rather than changing Brian’s fundamental abstraction.

Coming from a Software Engineering background with experience in distributed state management (Raft) and embedded C++ (STM32), I’m particularly interested in the "bits and pieces" you mentioned that already exist in the C++ backend for state serialization.

My proposed approach focuses on the Python-side coordination:

  1. Unified Checkpointing: Developing a Python interface that triggers the existing C++ serialization hooks for "Standalone Mode" while managing the model metadata in Python.

  2. Hybrid Serialization: Handling "Symbolic" initializations (via the refactored "basicexporter") and "Concrete" array data (via binary/HDF5 streams) to ensure reproducibility.

  3. Validation Layer: Implementing a consistency check to ensure a saved state matches the current network architecture during a restore() call.
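Point 3 could be sketched as a shape/dtype compatibility check run before restoring; the function name and the flat saved-state layout below are hypothetical, purely to illustrate the idea.

```python
import numpy as np

def check_state_compatible(saved, current):
    """Verify that a saved state dict matches the current network's
    variables before restoring. (Illustrative sketch only.)"""
    problems = []
    for name, arr in saved.items():
        if name not in current:
            problems.append(f"missing variable: {name}")
        elif current[name].shape != arr.shape:
            problems.append(f"shape mismatch for {name}")
        elif current[name].dtype != arr.dtype:
            problems.append(f"dtype mismatch for {name}")
    return problems

saved = {"v": np.zeros(10), "w": np.zeros(10)}
current = {"v": np.zeros(10), "w": np.zeros(8)}  # w was resized -> incompatible
print(check_state_compatible(saved, current))  # -> ['shape mismatch for w']
```

A real check would also need to handle synapses, whose number (and hence array shapes) can legitimately differ between runs when connections are generated probabilistically.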

I am focusing my proposal on making these existing tools work together seamlessly. Do you have a specific list of the "bits and pieces" in the C++ templates that you’d like to see unified first?

Best,
Ravin Muthukumarane

Hi @mstimberg,

I’m Mushkan Rana, and I’ve recently started exploring Brian2 by working through issue #1769.

Before making changes, I first tried to understand what the issue was really about and what behavior was expected. After going through the relevant test/code area, I worked on a small fix in my fork and opened a PR. That process helped me get a much better understanding of Brian2’s testing workflow and how small behavioral issues can affect multiple CI jobs across platforms.

While doing that, I also became very interested in Brian’s exporter / backend side — especially around serialization, initialization behavior, and how model state can be represented consistently across different execution targets.

For a possible GSoC proposal, I’ve been thinking about whether a more backend-independent intermediate representation could help with things like:

  • vectorized / NumPy-based initialization,

  • cross-backend state restoration,

  • serialization / deserialization,

  • and possibly checkpoint / restart support.

At this stage, I’m still trying to understand what would actually be most useful for Brian2, so I wanted to ask:

Would this general direction align with Brian2’s design goals, or would you recommend focusing on a different area first?

Thank you!

Hi @mstimberg and @arnab1896,

I’m very interested in the Brian Simulator Serialization/Deserialization project for GSoC 2026. While going through Brian’s examples—such as adaptive threshold models (dVT/dt = (VT0 - VT)/τt, with spike-triggered updates) and STDP mechanisms involving differential traces—I realized how powerful Brian is in letting researchers implement these rules directly from equations. This makes serialization even more important: being able to save and reload dynamic states like threshold variables, plasticity traces, synaptic weights, and delays would significantly improve reproducibility and long-running experiments.

I’m currently working as an Associate Data Scientist at Sigmoid, where I often deal with system checkpoints, workflow reproducibility, and structured state management—skills that map well to building a clean and reliable serialization layer for Brian. I work primarily in Python and also have hands-on experience with C++.

I’ve begun exploring the Brian2 codebase and the way neuron groups, synapses, and state variables are represented internally. My goal is to design a lightweight but extensible serialization interface that can capture full simulation state across Python execution and standalone mode, while keeping Brian’s philosophy intact: equations first, complexity hidden. I’m evaluating approaches like structured metadata storage plus binary snapshots, and how event-driven variables (like STDP traces) can be consistently represented.

One question I had:
For standalone mode, would you prefer serialization logic to be embedded directly in the generated C++ code, or should the API stay Python-level while standalone implements the backend-specific details?

I’ll share a more detailed design sketch in my proposal.

Looking forward to your guidance!

Hi @arnab1896

Why I want to work on this

I’ve been interested in how the brain works for a long time, and when I found out you could actually simulate spiking neurons in Python, I spent a whole evening just reading about Brian2. The more I explored it, the more I noticed this one frustrating gap — you can build a beautiful network, run a simulation, but you can’t easily save it and pick up where you left off. That bothered me. It’s the kind of problem I actually want to solve.

What the problem is

Right now if you’re running a long Brian2 simulation and something goes wrong, you start over. There’s no proper way to checkpoint your simulation state, export your network architecture, or import it back later. For anyone doing serious neuroscience research, that’s a huge limitation.

What I plan to do

I’ll extend the existing store/restore system so it works with Brian2’s C++ standalone mode and Brian2CUDA. Then I’ll refactor the basicexporter so it can properly capture a full network architecture, and build a matching basicimporter so you can reconstruct that network from scratch. By the end, saving and loading a Brian2 simulation should feel as natural as saving a file.

How I’ll do it

The first two weeks I’ll spend really understanding the codebase — reading the exporter code, talking to mentors, figuring out the best serialization format. Then I’ll work through the exporter refactor, build the importer, and spend the last few weeks writing solid tests and documentation so it actually gets used.

Why I’m the right person

I’ve built ML and neural network projects so I already understand how training pipelines work, why checkpointing matters, and what breaks when serialization is done poorly. I know Python and C++ well enough to work across both. I’m not a neuroscience expert yet but I’m genuinely curious about it — and I think that curiosity matters more than anything else for a project like this.