Category : Software / Pipeline
Hello everyone,
I am currently a Bachelor's student in Computer Science, and my research interests are high-performance computing and neuroinformatics. I have been investigating the memory constraints of loading and registering very large, multi-gigabyte datasets (for example, Allen Neuropixels recordings) together with behavioral video and BIDS-compliant fMRI.
To work around typical RAM constraints in multimodal synchronization, I have built and open-sourced an architectural prototype: NeuroAlign.
The core approach uses:
Operating system-level memory mapping: Zero-copy mmap for binary ephys files and nibabel proxy objects for NIfTI files, instead of loading whole files into RAM.
Scheduled temporal coordination: An object-oriented design (using Python abstract base classes) that aligns streams with different sampling rates (30 kHz vs. 60 frames per second) to a single event-based timeline.
Data persistence: Writing the synchronized data segments to HDF5 for downstream machine learning.
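To illustrate the first two points, here is a simplified sketch of the idea rather than the NeuroAlign code itself. It assumes a flat, single-channel, native-endian int16 file (real Neuropixels files interleave many channels), and the names open_ephys_view, frame_to_sample_range and write_demo_file are hypothetical. It uses only the standard library so it can be run as-is.

```python
import array
import mmap
import os
import tempfile

FS_EPHYS = 30_000  # ephys sampling rate in Hz (from the post)
FPS_VIDEO = 60     # video frame rate in frames per second (from the post)


def write_demo_file(path, n_samples):
    """Write a small synthetic int16 file so the sketch can run (demo only)."""
    with open(path, "wb") as fh:
        array.array("h", [i % 1000 for i in range(n_samples)]).tofile(fh)


def open_ephys_view(path):
    """Map a flat int16 file read-only and return (file, mmap, memoryview).

    The memoryview indexes directly into the mapped pages; data is only
    read from disk when a slice is actually accessed.
    """
    f = open(path, "rb")
    mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
    view = memoryview(mm).cast("h")  # "h" = native-endian int16
    return f, mm, view


def frame_to_sample_range(frame_idx, fs=FS_EPHYS, fps=FPS_VIDEO):
    """Half-open ephys sample range [start, stop) covered by one video frame.

    Uses integer arithmetic only, so no rounding error accumulates.
    """
    start = (frame_idx * fs) // fps
    stop = ((frame_idx + 1) * fs) // fps
    return start, stop


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo_ephys.bin")
    write_demo_file(demo_path, FS_EPHYS)  # one second of fake data
    f, mm, view = open_ephys_view(demo_path)
    start, stop = frame_to_sample_range(1)
    chunk = view[start:stop].tolist()  # only this slice is copied
    print(f"frame 1 -> samples [{start}, {stop}), {len(chunk)} values")
    view.release()  # release the view before closing the mapping
    mm.close()
    f.close()
```

In practice numpy.memmap gives the same effect with array semantics; the standard-library version is used here only to keep the sketch dependency-free.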
The package includes a CLI for testing and is fully BIDS-aware for TR extraction.
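For context, the TR extraction amounts to roughly the following. This is a minimal sketch under simplifying assumptions, not the package's loader: it reads the RepetitionTime field (in seconds) from a single JSON sidecar and ignores the BIDS inheritance principle, under which metadata may also live in sidecars higher up the directory tree. The function names read_tr_seconds and volume_onsets are hypothetical.

```python
import json
import os
import tempfile


def read_tr_seconds(sidecar_path):
    """Return RepetitionTime (in seconds) from one BIDS JSON sidecar."""
    with open(sidecar_path, "r", encoding="utf-8") as fh:
        meta = json.load(fh)
    if "RepetitionTime" not in meta:
        raise KeyError(f"RepetitionTime missing in {sidecar_path}")
    tr = float(meta["RepetitionTime"])
    if tr <= 0:
        raise ValueError(f"RepetitionTime must be positive, got {tr}")
    return tr


def volume_onsets(tr, n_volumes):
    """Onset of each volume in seconds, computed as index * TR.

    Multiplying by the index avoids the error build-up of a running sum.
    """
    return [i * tr for i in range(n_volumes)]


if __name__ == "__main__":
    # Demo with a synthetic sidecar; the file name is illustrative only.
    demo = os.path.join(tempfile.mkdtemp(), "sub-01_task-rest_bold.json")
    with open(demo, "w", encoding="utf-8") as fh:
        json.dump({"RepetitionTime": 2.0, "TaskName": "rest"}, fh)
    tr = read_tr_seconds(demo)
    print(tr, volume_onsets(tr, 4))
```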
I built it partly to explore the design requirements of the Experanto project ecosystem, but I want to make it as good as possible for general use.
Any input from developers currently working on out-of-core data pipelines would be much appreciated. In particular:
Are there temporal edge cases in BIDS timing metadata, beyond the standard TR, that the loader should handle?
How does the community handle floating-point precision drift when time-aligning 30 kHz data over long recordings?
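To make the precision concern concrete, here is a small sketch comparing two ways of computing the timestamp of sample n at 30 kHz: adding a float step n times versus dividing the integer index by the rate once. It is an illustration of the problem, not a claim about what any particular tool does; the function names are hypothetical, and Fraction is used only as an exact reference.

```python
from fractions import Fraction

FS = 30_000  # sampling rate in Hz (from the post)


def t_accumulated(n, fs=FS):
    """Timestamp of sample n by adding dt n times (rounding error builds up)."""
    dt = 1.0 / fs
    t = 0.0
    for _ in range(n):
        t += dt
    return t


def t_indexed(n, fs=FS):
    """Timestamp of sample n from its integer index (one rounding step)."""
    return n / fs


def abs_error(t_float, n, fs=FS):
    """Exact absolute error of a float timestamp against n / fs."""
    return abs(Fraction(t_float) - Fraction(n, fs))


if __name__ == "__main__":
    n = FS * 60  # one minute of samples
    print("accumulated error (s):", float(abs_error(t_accumulated(n), n)))
    print("indexed error (s):    ", float(abs_error(t_indexed(n), n)))
```

Keeping integer sample indices as the source of truth and converting to seconds only at the point of use is the approach sketched here; I would like to know whether that is what people do in practice.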
Thank you for your time and any insights you can provide.