Existing multi-modal machine learning techniques have been developed for relatively abundant
data with an overall high signal-to-noise ratio (SNR): text, natural images, videos, and sound. Such
data are most often unambiguous, whereas brain data typically are ambiguous, due to the low SNR
per image and, more crucially, to poor annotation quality. We propose to tackle this by adapting
machine learning solutions to this low-SNR regime: introducing priors, aggressive dimension
reduction, aggregation approaches, and data augmentation to reduce overfitting.
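As an illustration of what these adaptations can look like in practice, the sketch below combines trial aggregation, aggressive dimension reduction, an L2 prior, and simple noise-based augmentation on synthetic data. It is a minimal example of the ingredients named above, not the project's actual pipeline; the shapes, parameters, and toy target are assumptions made for the illustration.

```python
# Minimal sketch (synthetic data): aggregation, dimension reduction, an L2 prior
# and noise-based augmentation for a low-SNR decoding problem. Illustrative only.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import RidgeClassifier
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
n_trials, n_repeats, n_voxels = 200, 5, 2000

# Weak signal buried in noise, with several repetitions per trial
signal = 0.1 * rng.normal(size=(n_trials, 1, n_voxels))
noise = rng.normal(size=(n_trials, n_repeats, n_voxels))
X_repeats = signal + noise
y = (signal[:, 0, :10].sum(axis=1) > 0).astype(int)  # toy binary target

# Aggregation: average repetitions of the same trial to boost SNR
X = X_repeats.mean(axis=1)

# Data augmentation: jittered copies of each averaged trial
X_aug = np.concatenate([X, X + 0.5 * rng.normal(size=X.shape)])
y_aug = np.concatenate([y, y])

# Aggressive dimension reduction + an L2 prior to limit overfitting
model = make_pipeline(PCA(n_components=20), RidgeClassifier(alpha=10.0))
model.fit(X_aug, y_aug)
print("training accuracy:", model.score(X, y))
```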
While data sources contain a lot of implicit information that could be used as targets in supervised
learning, there is most often no obvious way to extract it. We propose to tackle this by leveraging
additional, poorly annotated or unannotated data, relying on self-supervision methods.
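A minimal sketch of this idea on synthetic data: a pretext task reconstructs randomly masked features from a large unlabeled set, and the learned hidden representation is then reused for a small labeled downstream problem. The masking pretext, the network size, and the data are assumptions made for illustration, not the project's method.

```python
# Self-supervision sketch (synthetic data): masked-feature reconstruction as a
# pretext task, then reuse of the learned representation for a small labeled task.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Large unlabeled set, small labeled set: the regime described above
X_unlabeled = rng.normal(size=(5000, 100))
X_labeled = rng.normal(size=(100, 100))
y_labeled = (X_labeled[:, 0] > 0).astype(int)  # toy downstream target

# Pretext task: reconstruct randomly masked features from the visible ones
mask = rng.random(X_unlabeled.shape) < 0.3
encoder = MLPRegressor(hidden_layer_sizes=(32,), max_iter=50, random_state=0)
encoder.fit(np.where(mask, 0.0, X_unlabeled), X_unlabeled)

def embed(X, model):
    # First-layer (ReLU) activations of the fitted MLP, used as a representation
    return np.maximum(X @ model.coefs_[0] + model.intercepts_[0], 0.0)

# Downstream task trained on the self-supervised representation
clf = LogisticRegression().fit(embed(X_labeled, encoder), y_labeled)
print("downstream accuracy:", clf.score(embed(X_labeled, encoder), y_labeled))
```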
This project is part of the Karaib AI chair. It develops representation techniques for low-SNR data to
couple brain data with descriptions of behavior or diseases in order to extract semantic structure.
Eventually, one should be able to reason about the information extracted within this project. For this,
we will develop dedicated statistical, causal, and formal (ontology-based) reasoning tools, namely the
Neurolang system.
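To make the idea of coupling brain data with behavioral or clinical descriptions concrete, the sketch below uses canonical correlation analysis to extract structure shared between two synthetic views. CCA is chosen here only as a familiar stand-in; the project's actual representation techniques, the Neurolang reasoning layer, and real data are beyond this illustration.

```python
# Sketch (synthetic data): extracting structure shared between a "brain" view and
# a "behavior" view with CCA. Purely illustrative, not the project's method.
import numpy as np
from sklearn.cross_decomposition import CCA

rng = np.random.default_rng(0)
n_subjects = 300
latent = rng.normal(size=(n_subjects, 3))  # shared latent structure

# Two noisy views driven by the same latent factors
X_brain = latent @ rng.normal(size=(3, 500)) + rng.normal(size=(n_subjects, 500))
X_behavior = latent @ rng.normal(size=(3, 20)) + rng.normal(size=(n_subjects, 20))

cca = CCA(n_components=3)
brain_scores, behavior_scores = cca.fit_transform(X_brain, X_behavior)
corrs = [np.corrcoef(brain_scores[:, k], behavior_scores[:, k])[0, 1]
         for k in range(3)]
print("canonical correlations:", np.round(corrs, 2))
```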
Please find more information here: https://team.inria.fr/parietal/files/2021/08/karaib_post_doc.pdf