Hi @mstimberg,
I am following up on my previous interest in the serialization/deserialization project for Brian2.
I have spent the last 24 hours setting up my environment and conducting proof-of-concept tests. I successfully implemented a basic serialization test for state variables, but I’ve identified a core challenge: while data serialization is straightforward, the true bottleneck for model portability lies in the serialization of the underlying model equations and parameter namespaces. Given my current work on Apache Fury (specifically focusing on high-performance serialization and Cython-based data streams), I am interested in how we might bridge the gap between Brian2’s internal representation and a portable, efficient format.
I’ve begun examining the baseexporter in brian2tools to understand how we currently handle model structure. I have two specific questions to help me focus my proposal:
- Is the priority for this GSoC to focus on adopting a specific standard like HDF5/NWB for results, or are you looking for a more generalized serialization framework that supports multiple backends (like Apache Fury)?
- Are there specific simulation edge cases (e.g., complex spiking networks with heavy parameter namespaces) that you consider the ‘benchmark’ for serialization efficiency?
I look forward to your thoughts as I finalize my proposal.
Hi @Siddhant_Ulekar. Thank you for your interest in our project (but note that this should rather go into the GSOC 2026 Project #1 : Brian Simulator - Serialization/deserialization for Brian sim models, results, and input data (175h/350h) topic to avoid getting dispersed; also, the GDocs tag is a bit confusing).
Regarding your questions, I do not consider efficiency the main problem at the moment, but I agree that at some point it might make sense to have a modular system that allows for different backends. From my point of view, there are two big tasks:
- Bridging the gap between the store/restore system (for simulation states) and the baseexporter (for model equations and architecture)
- Implementing store/restore-like behaviour (or potentially something slightly different) for the C++ standalone mode
Supporting a standard like HDF5, NWB, NEO, … could be useful, but these should be rather straightforward when we have everything else in place.
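To make the first task concrete, a ‘snapshot’ that bridges the two representations could, very roughly, bundle the exported model description together with the numerical state in one container. The sketch below uses only hypothetical names (`make_snapshot`, `load_snapshot`) and stdlib serialization, not any existing Brian2 API:

```python
import base64
import json
import struct


def make_snapshot(equations, namespace, states):
    """Bundle model description and numerical state into one
    JSON-serializable 'snapshot' dict (hypothetical format)."""
    return {
        "model": {"equations": equations, "namespace": namespace},
        # Pack each state array as little-endian float64 and base64-encode
        # it so the whole snapshot survives a text-based format like JSON.
        "states": {
            name: base64.b64encode(
                struct.pack(f"<{len(vals)}d", *vals)
            ).decode("ascii")
            for name, vals in states.items()
        },
    }


def load_snapshot(text):
    """Inverse of make_snapshot: JSON text -> (model, states)."""
    doc = json.loads(text)
    states = {}
    for name, blob in doc["states"].items():
        raw = base64.b64decode(blob)
        states[name] = list(struct.unpack(f"<{len(raw) // 8}d", raw))
    return doc["model"], states


snap = make_snapshot(
    equations="dv/dt = (I - v) / tau : 1",
    namespace={"tau": 0.01, "I": 1.5},
    states={"v": [0.0, 0.25, 0.5]},
)
text = json.dumps(snap)
model, states = load_snapshot(text)
```

The point is only that the model description and the state values travel together, so a ‘restore’ never needs the original script.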
Hi Marcel,
Thank you for the clear guidance! That helps focus my proposal significantly. Based on your feedback, I am restructuring my plan to prioritize the core architectural gaps:
- Unified Serialization Bridge: I will focus on integrating the baseexporter logic with the store/restore system. My goal is to ensure that a ‘snapshot’ includes not just the variables, but the full model architecture required to reconstruct the network from scratch.
- C++ Standalone Support: I am particularly interested in the challenge of implementing store/restore for the C++ standalone mode. I’ve been looking at how the CPPStandaloneDevice handles results, and I see the complexity in restoring the spike queue and state variables without a Python loop.
- Modular Backends: I understand that efficiency isn’t the primary bottleneck yet. I will frame the use of Apache Fury (or HDF5) as optional, modular backends for the unified system, providing the ‘storage’ layer once the ‘logic’ layer (bridging the gaps) is solid.
I’ll move my future updates to this topic as requested. I’m currently digging deeper into the brian2tools.baseexport source code to see how to best link it with the Network state.
Hi Marcel (@mstimberg), I’ve been diving into the CPPStandaloneDevice and baseexporter logic. Here is my initial vision for tackling the technical gaps we discussed. I’d love to know if this direction aligns with the complexity you’re expecting.
To address the priorities mentioned, I am drafting my implementation plan around three main pillars:
- The Unified ‘BrianArchive’ (Bridging the Gap): I propose a unified object structure that links brian2tools.baseexport (the model architecture) with Network.get_states() (the numerical state). By wrapping these in a single container, we ensure that a ‘restore’ operation has the full context needed to reconstruct the Network even if the original script is unavailable.
- C++ Standalone Native Persistence: Since the standalone mode runs outside of Python, I plan to modify the CPPStandaloneDevice templates. The goal is to generate native C++ code that:
  - iterates through ArrayDeviceVariable buffers,
  - writes/reads binary streams directly to disk, and
  - uses a metadata “sidecar” file to map these binary blobs back to the Python-side variable names upon re-import.
- Modular Backend Architecture: I will design the storage layer to be backend-agnostic. While the core logic focuses on the Brian2-specific gaps, users will be able to choose between:
  - Standard HDF5: for cross-platform interoperability (MATLAB/R).
  - Apache Fury: for ultra-fast binary serialization, especially useful for large-scale input data and results.
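To illustrate the sidecar idea from the second pillar, here is a minimal Python sketch: each variable is dumped as a headerless binary stream, and a single JSON sidecar maps names to file, dtype, and length. The layout and file names are my own invention, not the current C++ standalone output format:

```python
import array
import json
import os
import tempfile


def dump_with_sidecar(directory, variables):
    """Write each variable as a headerless float64 stream plus one JSON
    'sidecar' mapping names to file/dtype/length (hypothetical layout)."""
    meta = {}
    for name, values in variables.items():
        fname = f"{name}.dat"
        with open(os.path.join(directory, fname), "wb") as f:
            array.array("d", values).tofile(f)
        meta[name] = {"file": fname, "dtype": "float64", "n": len(values)}
    with open(os.path.join(directory, "sidecar.json"), "w") as f:
        json.dump(meta, f)


def load_with_sidecar(directory):
    """Use the sidecar to map the raw blobs back to named variables."""
    with open(os.path.join(directory, "sidecar.json")) as f:
        meta = json.load(f)
    variables = {}
    for name, info in meta.items():
        a = array.array("d")
        with open(os.path.join(directory, info["file"]), "rb") as f:
            a.fromfile(f, info["n"])
        variables[name] = list(a)
    return variables


with tempfile.TemporaryDirectory() as d:
    dump_with_sidecar(d, {"v": [0.0, -70.0, 1.5], "w": [2.0]})
    restored = load_with_sidecar(d)
```

In the actual project, the write side would of course live in the generated C++ code, with only the sidecar consumed on re-import.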
I am currently exploring how to best hook the serialize_state() calls into the generated main.cpp. Any thoughts on the preferred file format for the C++ side’s raw binary output?
At the moment, I’d opt for staying with the same file format that C++ standalone currently uses at the end of a run, i.e. simple raw binary streams without any header/metadata.
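A raw, headerless stream like that can be round-tripped in Python roughly as follows (file name and values are made up for illustration; the key point is that the reader must know the dtype and element count from somewhere outside the file):

```python
import array
import tempfile

# Simulate the current C++ standalone convention: a raw, headerless
# binary stream of doubles (values here are made up for illustration).
values = [0.0, 0.5, 1.0, -70.0]
with tempfile.NamedTemporaryFile(suffix=".dat", delete=False) as f:
    array.array("d", values).tofile(f)
    path = f.name

# Reading it back only works because we know, from outside the file,
# that it contains exactly len(values) float64 entries -- which is
# where external metadata (e.g. a sidecar file) would come in.
a = array.array("d")
with open(path, "rb") as fh:
    a.fromfile(fh, len(values))
restored = list(a)
```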