GSoC 2024 Project Idea 12.1 Writing an R package for the computation of the O-information (350 h)

greg_incf · February 23, 2024, 4:59pm

Real-world systems are often characterized by higher-order interactions (HOIs) within multiplets, i.e., groups of three or more units (Battiston et al., 2021). In neuroscience, most of the evidence we have about brain networks comes from the interactions between pairs of brain regions, but little is known about what type of information remains hidden in the non-pairwise interactions (Luppi et al. 2023, Luppi et al. 2024). Interestingly, recent findings suggest that HOIs might be a better neural marker of neurodegeneration than standard pairwise approaches (Herzog et al., 2022).

Several methods have been proposed to estimate HOIs, drawing on fields such as graph theory and information theory. The O-information (short for “information about Organisational structure”) is an information-theoretic quantity that characterizes statistical interdependencies within multiplets of three or more variables (Rosas et al., 2019). It not only quantifies how much information multiplets of brain regions carry but also tells us about the nature of that information, i.e., whether multiplets carry mainly redundant or mainly synergistic information.
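
For reference, the definition in Rosas et al. (2019) can be written as below. The notation is that of the paper rather than of this post: X^n is the set of n variables, X^n_{-j} is the same set with variable j removed, H is the (differential) entropy, and TC and DTC are the Total Correlation and Dual Total Correlation. A positive value indicates a redundancy-dominated multiplet, a negative value a synergy-dominated one.

```latex
\begin{aligned}
\mathrm{TC}(X^n)  &= \sum_{j=1}^{n} H(X_j) - H(X^n) \\
\mathrm{DTC}(X^n) &= \sum_{j=1}^{n} H(X^n_{-j}) - (n-1)\,H(X^n) \\
\Omega(X^n)       &= \mathrm{TC}(X^n) - \mathrm{DTC}(X^n)
                   = (n-2)\,H(X^n) + \sum_{j=1}^{n} \left[ H(X_j) - H(X^n_{-j}) \right]
\end{aligned}
```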

Estimating HOIs is computationally intensive. As an example, a cortical parcellation dividing the brain into 80 distinct regions involves estimating HOIs in roughly 82,000 triplets, 1.6 million quadruplets, 24 million quintuplets, etc. The O-information relies only on simple quantities like entropies, which keeps the cost per multiplet low and makes it an ideal candidate for estimating HOIs in a reasonable time. Still, there is as yet no neuroinformatics gold standard for estimating HOIs that is both fast and accessible to network enthusiasts, experts and non-experts alike.
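
The multiplet counts for an 80-region parcellation are binomial coefficients ("80 choose k"). A small sketch in Python, the language of one of the existing implementations, reproduces them; the variable names are only illustrative.

```python
from math import comb

n_regions = 80  # size of the example parcellation

# Number of distinct multiplets of each order: "n_regions choose order"
counts = {order: comb(n_regions, order) for order in (3, 4, 5)}

for order, n_multiplets in counts.items():
    print(f"order {order}: {n_multiplets:,} multiplets")
```

The count grows quickly with the order, which is why the cost of evaluating a single multiplet matters so much.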

Currently an R implementation is missing, which limits adoption by a relevant part of the community, in particular colleagues working with behavioral data and psychometrics.

Project aims and tasks
This project aims to build an R package, missing at the moment, for the computation of this quantity.

We divided this project into seven main tasks:

Test current implementation in Matlab and Python
Build R functions to compute the Total Correlation and the Dual Total correlation
Implement and test statistical validation for the multiplets
Data simulation: add a function to simulate HOIs
Explore plotting solutions, in R or preparing the output for plotting with existing packages such as XGI
Explore interfaces with other R packages used in psychometrics (lavaan, https://lavaan.ugent.be/; psychonetrics, http://psychonetrics.org/; and the CRAN package psychotools)
Prepare a package to be submitted to CRAN
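
As a rough illustration of the Total Correlation, Dual Total Correlation, and data simulation tasks above, here is a minimal Python sketch. It assumes the data are approximately Gaussian, so that every entropy can be computed from a covariance matrix. The function names (`gaussian_entropy_from_cov`, `tc_dtc_oinfo`) and the two toy systems are hypothetical and are not taken from the existing Matlab or Python toolboxes; an R version would follow the same steps.

```python
import numpy as np


def gaussian_entropy_from_cov(cov):
    """Entropy (nats) of a Gaussian with the given covariance (sub)matrix."""
    cov = np.atleast_2d(cov)
    k = cov.shape[0]
    return 0.5 * (k * np.log(2 * np.pi * np.e) + np.linalg.slogdet(cov)[1])


def tc_dtc_oinfo(x):
    """Return (TC, DTC, O-information) for data x of shape (samples, variables)."""
    x = np.asarray(x, dtype=float)
    n = x.shape[1]
    cov = np.cov(x, rowvar=False)
    h_joint = gaussian_entropy_from_cov(cov)
    # Entropy of each single variable
    h_single = [gaussian_entropy_from_cov(cov[j, j]) for j in range(n)]
    # Entropy of the system with variable j left out
    h_rest = []
    for j in range(n):
        keep = [i for i in range(n) if i != j]
        h_rest.append(gaussian_entropy_from_cov(cov[np.ix_(keep, keep)]))
    tc = sum(h_single) - h_joint
    dtc = sum(h_rest) - (n - 1) * h_joint
    return tc, dtc, tc - dtc


# Two toy triplets (illustrative only)
rng = np.random.default_rng(1)
n_samples = 5000

# Redundant: three noisy copies of the same source
z = rng.standard_normal(n_samples)
redundant = np.column_stack(
    [z + 0.5 * rng.standard_normal(n_samples) for _ in range(3)]
)

# Synergistic: the third variable is a noisy sum of two independent ones
x1 = rng.standard_normal(n_samples)
x2 = rng.standard_normal(n_samples)
synergistic = np.column_stack(
    [x1, x2, x1 + x2 + 0.5 * rng.standard_normal(n_samples)]
)

o_red = tc_dtc_oinfo(redundant)[2]
o_syn = tc_dtc_oinfo(synergistic)[2]
print(f"redundant triplet:   O-information = {o_red:.3f}")
print(f"synergistic triplet: O-information = {o_syn:.3f}")
```

Under these assumptions the first triplet should give a positive O-information and the second a negative one. Statistical validation, for example by bootstrapping over samples, would be layered on top of a function like this.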

Ultimately, this project could lead to the establishment of a gold standard for going beyond pairwise interactions by measuring HOIs, accessible to R experts as well as to users with little programming knowledge.

Skill level: Intermediate/advanced

Required skills: R, some Python

Time commitment: Full-time (350 h)

Lead mentor: Daniele Marinazzo (daniele.marinazzo@gmail.com)

Project website:

Backup mentors: Fernando E. Rosas (f.rosas@imperial.ac.uk), Pedro Martinez Mediano (p.mediano@imperial.ac.uk)

Tech keywords: R, Python

Mingcong_Tang · March 1, 2024, 4:34am

Dear mentors @daniele.marinazzo@gmail.com, @f.rosas@imperial.ac.uk, @p.mediano@imperial.ac.uk. I’m Mingcong. This is a really interesting project and I want to express my great interest in contributing to it. Could you please guide me on how I might get started? Thank you very much!

Best Regards,
Mingcong

Daniele_Marinazzo · March 1, 2024, 8:43am

Dear Mingcong
thanks a lot for your interest.
You can have a look at the paper introducing the measure, "Quantifying High-order Interdependencies via Multivariate Extensions of the Mutual Information" (arXiv:1902.11239), and at the current implementations:
GitHub - danielemarinazzo/HOI: Retrieving high-order information multiplets from data using the O-information
GitHub - brainets/hoi: Higher-Order Interactions

Mingcong_Tang · March 6, 2024, 2:06pm

Dear @daniele.marinazzo@gmail.com

Thank you very much for your prompt response and for sharing these interesting resources. I have started to familiarize myself with the materials you provided. I realized I hadn’t mentioned earlier that I am currently a master's student in psychology at Boston University, and I usually use R (sometimes Python or other software) in my research. Could you please advise on the next step or on particular areas within the project that you would like me to focus on? Thank you very much!

Daniele_Marinazzo · March 6, 2024, 4:22pm

You can explore the current approaches used in R to compute entropy, then look at the Gaussian entropy used in GitHub - robince/gcmi (functions for calculating mutual information and other information theoretic quantities using a parametric Gaussian copula), and see if you can implement it in R. The theory is covered in "Entropy of the Gaussian" or in the paper linked from the repo; the Gaussian copula itself is not relevant for now.
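
To make the target concrete, here is a minimal sketch of a Gaussian entropy estimator in Python. It uses the closed-form entropy of a k-dimensional Gaussian, H = 0.5 * ln((2*pi*e)^k * det(Sigma)), with Sigma replaced by the sample covariance. It is not the gcmi code: the function name `gaussian_entropy` is an assumption, and no bias correction or copula normalisation is applied. An R port would mainly need `cov()` and `determinant()`.

```python
import numpy as np


def gaussian_entropy(x):
    """Differential entropy (in nats) of data assumed to be multivariate Gaussian.

    x: array of shape (n_samples, n_variables), or 1-D for a single variable.
    """
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]
    k = x.shape[1]
    # Sample covariance; atleast_2d handles the single-variable case
    cov = np.atleast_2d(np.cov(x, rowvar=False))
    # slogdet is numerically safer than log(det(cov))
    sign, logdet = np.linalg.slogdet(cov)
    if sign <= 0:
        raise ValueError("covariance matrix is not positive definite")
    return 0.5 * (k * np.log(2 * np.pi * np.e) + logdet)


# Sanity check against the analytic value for two independent
# unit-variance Gaussians: H = ln(2*pi*e)
rng = np.random.default_rng(0)
samples = rng.standard_normal((20000, 2))
h_est = gaussian_entropy(samples)
h_true = np.log(2 * np.pi * np.e)
print(f"estimated: {h_est:.3f} nats, analytic: {h_true:.3f} nats")
```

The estimate depends on the data only through the covariance matrix, so the entropies of all sub-multiplets can be obtained from sub-blocks of one covariance matrix computed once.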

Mingcong_Tang · March 6, 2024, 5:01pm

Thank you very much for your guidance! I will delve into the analyses in R and will keep you updated on my progress.

Best regards,
Mingcong

Federico_Brancasi · March 7, 2024, 3:40pm

Hello everyone!!! My name is Federico and I am a passionate computer science student with a fervent curiosity for scientific research and artificial intelligence. I am currently pursuing a double master’s degree in Computer Science at the University of Trento, Italy, and Eötvös Loránd University in Budapest, Hungary.

I would like to introduce why I want to actively participate in this Google Summer of Code project. First, I have always been interested in participating in open source projects, as I strongly believe in the importance of collaboration and free exchange of knowledge in the technology community. This is an excellent opportunity for me to contribute to a real project, one that has the potential to have a significant impact in the field of neuroscience and psychometrics.

The idea of going beyond pairwise interactions in the field of neural networks particularly fascinates me. It is a novel approach that can lead to a deeper understanding of how the human brain works and potentially to new treatments for neurodegenerative diseases.

In addition, this project offers me the opportunity to apply theoretical knowledge gained during my graduate studies in a practical and meaningful context. I am excited about the idea of working on computationally intensive problems and developing efficient solutions that can be used by a wide range of users, including experts and non-experts.

One of the main reasons I feel attracted to this project is the opportunity to be mentored by experts in the field. The prospect of learning from qualified professionals and working closely with them is extremely exciting for me. I am confident that this experience will allow me to grow both professionally and personally.

Here I attach a link to my CV: link.

Daniele_Marinazzo · March 7, 2024, 8:26pm

Dear Federico

thanks a lot for your interest!
This platform is to exchange information on technical aspects of the project. Please feel free to ask questions in this direction, after looking at the material already shared.
When the time comes to apply for a project, you can do so through the dedicated GSoC portal.
Kind regards

drahmdshahn · March 20, 2024, 10:04am

Dear Dr. Daniele,
I am interested in joining this project.
How can I join it?

Thanks

Daniele_Marinazzo · March 20, 2024, 10:22am

Dear Ahmed
Thanks for your interest.
You can browse the reading material and the existing repos linked above in this thread.
If you want, you can already start working on some code implementation.
Then, by April 2, you should formally submit your proposal through the Google Summer of Code portal.