GSoC 2020 project idea 12: NIX file system backend support

The NIX[1] project aims to develop standardized methods and models for storing electrophysiology and other neuroscience data together with their metadata in a common, open file format. Current implementations of NIX use HDF5[2] to store data; however, the NIX specification does not define a storage format. Instead, NIX was designed to be file-format agnostic, so that storage formats can be chosen based on usage requirements.

The current libraries were developed so that alternative storage backends can be implemented with few modifications to the existing code. This project aims to implement a filesystem-based backend for storing NIX data. Some preliminary work has already been done on this backend, which stores data in multiple files in a nested directory tree, using appropriate binary file formats for data, such as NumPy’s npy format[3] for numeric data, and YAML for metadata.
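
To make the storage idea concrete, here is a minimal sketch (not the NIX implementation) of writing a one-dimensional array of doubles as a version 1.0 npy file in C++11. The function names `npy_header_1d_f8` and `write_npy_1d` are hypothetical, and the sketch assumes a little-endian host with 64-bit IEEE 754 doubles.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: build the npy v1.0 preamble and header for a
// 1-D array of n little-endian float64 values ('<f8').
std::string npy_header_1d_f8(std::size_t n) {
    // The header is an ASCII Python dict literal describing the array.
    std::ostringstream dict;
    dict << "{'descr': '<f8', 'fortran_order': False, 'shape': (" << n << ",), }";
    std::string header = dict.str();

    // Preamble: 6 magic bytes + 2 version bytes + 2 header-length bytes.
    const std::size_t preamble = 10;
    // Pad with spaces so that preamble + header + '\n' is a multiple of 64.
    const std::size_t unpadded = preamble + header.size() + 1;
    header.append((64 - unpadded % 64) % 64, ' ');
    header.push_back('\n');

    // Version 1.0 stores the header length in an unsigned 16-bit field.
    assert(header.size() <= 0xFFFF);
    const std::uint16_t len = static_cast<std::uint16_t>(header.size());

    std::string out;
    out.push_back(static_cast<char>(0x93));           // magic: 0x93 "NUMPY"
    out += "NUMPY";
    out.push_back(static_cast<char>(1));              // major version
    out.push_back(static_cast<char>(0));              // minor version
    out.push_back(static_cast<char>(len & 0xFF));     // length, low byte
    out.push_back(static_cast<char>((len >> 8) & 0xFF)); // length, high byte
    out += header;
    return out;
}

// Hypothetical helper: write the values to `path` as an npy file.
// Assumes the host stores doubles as little-endian IEEE 754 float64,
// so the raw bytes can be written unchanged.
bool write_npy_1d(const std::string &path, const std::vector<double> &data) {
    std::ofstream file(path.c_str(), std::ios::binary);
    if (!file) {
        return false;
    }
    const std::string header = npy_header_1d_f8(data.size());
    file.write(header.data(), static_cast<std::streamsize>(header.size()));
    file.write(reinterpret_cast<const char *>(data.data()),
               static_cast<std::streamsize>(data.size() * sizeof(double)));
    return file.good();
}
```

A file written this way should be readable from Python with `numpy.load`. A real backend would also need to handle other data types, multi-dimensional shapes, and big-endian hosts, none of which this sketch covers.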

The aim of this GSoC project is to complete the implementation of the file system backend so that it reaches feature parity with the HDF5 backend, and to implement methods for copying data between files that may use different backends. Skills needed: C++ (C++11).

Mentors: G-Node & NIX Core Team (Achilleas Koutsou, @achilleas-k; Jan Grewe, @jgrewe; Michael Sonntag, @mpsonntag)

[1] https://github.com/G-Node/nix
[2] https://en.wikipedia.org/wiki/Hierarchical_Data_Format
[3] http://docs.scipy.org/doc/numpy/neps/npy-format.html

Hi @malin and @mpsonntag,
I am a math and CS undergraduate student at Ecole Polytechnique, France, and I am interested in this project. I have prior experience with C++ and the HDF5 file format from my involvement in GSoC 2019. I have installed NIX by building the source code on Linux and I am trying to run some examples. Is there an evaluation task that one needs to complete before applying for this project?

Hi @Shrey_Aryan. Sorry for not getting back to you sooner.

We don’t have any evaluation tasks in mind to prepare for the application. You could have a look at the provided links, in particular the npy format (sorry, the link in the original post seems to be outdated).

As mentioned in the original proposal (original post here), a prototype / proof of concept for the filesystem backend already exists in the NIX repository, but it is incomplete. The most important feature is still missing: support for storing data, which we would like to be NumPy compatible, hence the npy format requirement.

Thanks for linking your code from last year’s GSoC. If you have any other publicly available code that might be relevant, please link it here.

Thanks.