Classification of natural sounds with EEG and Python


I was trying to build a simple pipeline in Python to classify sounds using a linear model based on FIR filters. I am following this paper/tutorial:

My question is: if I want to classify sounds based on EEG signals, which electrodes should I use for my analysis? My data has the following electrodes:

Should I use the ones measuring the temporal lobe, that is, T7 and T8? Those are the ones that are near the auditory cortex.

Thank you in advance,

Ahoi hoi @blancoarnau,

thanks for the very interesting post!

IMHO there are several factors that might guide or even determine the electrodes you might want to use for your analyses.

  1. Signal quality: how does the signal of each electrode look after preprocessing (e.g. filtering, ICA, artifact removal, etc.)? If some electrodes are very noisy or show prominent artifacts, you of course don’t want to use them in your analyses.
  2. What is your hypothesis? Sure, electrodes around/close to the temporal lobe might be a good target because of their proximity to the auditory cortex. However, there might be certain processes you won’t capture with those electrodes alone.
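
As a rough illustration of point 1, here is a minimal sketch of one way to flag suspicious channels: compare each channel's standard deviation with the rest using a robust z-score. Everything in it is an assumption made for illustration (the simulated data, the channel names other than T7 and T8, the function name `flag_noisy_channels`, and the threshold of 3); it is not a substitute for looking at the data, and flat channels (zero variance) would need separate handling.

```python
import numpy as np


def flag_noisy_channels(data, ch_names, z_thresh=3.0):
    """Return channels whose log standard deviation is a robust outlier.

    data: array of shape (n_channels, n_times), already preprocessed.
    """
    log_std = np.log(data.std(axis=1))
    med = np.median(log_std)
    mad = np.median(np.abs(log_std - med))  # median absolute deviation
    # 0.6745 rescales the MAD so the score is comparable to a normal z-score
    robust_z = 0.6745 * (log_std - med) / mad
    return [name for name, z in zip(ch_names, robust_z) if abs(z) > z_thresh]


# Simulated data: 8 hypothetical channels, one of them made very noisy
rng = np.random.default_rng(0)
ch_names = ["Fz", "Cz", "Pz", "T7", "T8", "O1", "O2", "F3"]
data = rng.normal(scale=1.0, size=(len(ch_names), 5000))
data[ch_names.index("T7")] *= 20.0  # simulate a bad electrode

bad = flag_noisy_channels(data, ch_names)
print("Channels to inspect:", bad)
```

Channels flagged this way are candidates for visual inspection, interpolation, or exclusion rather than an automatic verdict.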

Assuming all electrodes exhibit a signal you can in principle work with and given that there are not that many electrodes in total, you could also use all of them and evaluate the distribution of classification performance (also over time). Just in case, mne is a fantastic package to do this (and many other) kind(s) of analysis. The respective docs have great tutorials, e.g. decoding or encoding, on that. Then again, I don’t work a lot with EEG, so take all of that with a grain of salt. Hopefully, other more experienced folks will chime in as well.
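
To illustrate what "classification performance over time" with all electrodes could look like, here is a minimal sketch that fits one cross-validated linear classifier per time sample. It uses only NumPy and scikit-learn on simulated epochs; the array shape `(n_epochs, n_channels, n_times)`, the injected effect, and the choice of logistic regression are assumptions for illustration, not the method from the paper mentioned in the question.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Simulated epochs: (n_epochs, n_channels, n_times), two sound categories
rng = np.random.default_rng(42)
n_epochs, n_channels, n_times = 120, 16, 50
X = rng.normal(size=(n_epochs, n_channels, n_times))
y = np.repeat([0, 1], n_epochs // 2)

# Inject a class difference on 4 channels during samples 20-34 only
X[y == 1, :4, 20:35] += 1.0

# One classifier per time sample, all channels used as features
clf = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scores = np.array([
    cross_val_score(clf, X[:, :, t], y, cv=5).mean()
    for t in range(n_times)
])

print("Mean accuracy before the effect:", scores[:20].mean())
print("Mean accuracy during the effect:", scores[20:35].mean())
```

With real data, `X` would come from the epoched recording and the resulting `scores` curve can be plotted against time to see when the sound categories become separable.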

HTH, cheers, Peer

Hoi @PeerHerholz,

Thank you for the advice and the links regarding decoding and encoding.

I will try to see if I am able to do it myself.


Hello again @PeerHerholz,

So in your opinion, would you use all the electrodes with the model?

Thanks in advance.

Ahoi hoi @blancoarnau,

as mentioned in my previous answer, it heavily depends on the data quality and hypothesis. If the data from all sensors is “in principle good” and you don’t have a hypothesis that includes/postulates certain assumptions about spatial distribution and/or topography, you could use all sensors.
For example, if you’re using a linear model you could obtain respective topographical information on the sensors' "performance" as outlined in this tutorial from MNE. For other potentially interesting examples/analyses please check the tutorials: here, here and here. As your goal is to evaluate if distinct auditory categories/sounds result in distinct “neural patterns (over time)”, you could restrict your analyses to electrodes near the auditory cortex. However, assuming that participants had to perform a certain task and/or some kind of modulation was applied to the sounds, other electrodes, e.g. frontal ones, might be of interest as well. The important part is that you can motivate your electrode and thus feature selection accordingly and do not run several models and then decide on the “best” one (or, if you do that, be clear about the exploratory approach).
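
As a hedged illustration of why per-sensor information from a linear model needs care: the raw classifier weights are not directly interpretable as "where the signal is", but they can be converted into activation patterns by multiplying with the data covariance (pattern = Cov(X) w / Var(Xw)). The sketch below uses only NumPy and scikit-learn on simulated data; the channel layout and effect sizes are assumptions chosen to make the point visible.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Simulated single-time-point data: (n_epochs, n_channels)
rng = np.random.default_rng(1)
n_epochs, n_channels = 1000, 6
y = np.repeat([0, 1], n_epochs // 2)
shared_noise = rng.normal(size=n_epochs)

X = 0.5 * rng.normal(size=(n_epochs, n_channels))
X[:, 0] += 1.0 * y + shared_noise  # channel 0: class signal + shared noise
X[:, 1] += shared_noise            # channel 1: shared noise only, no signal

clf = LogisticRegression().fit(X, y)
w = clf.coef_.ravel()  # decoding weights ("filters")

# Convert weights into an activation pattern: Cov(X) w / Var(X w)
cov_X = np.cov(X, rowvar=False)
pattern = cov_X @ w / (w @ cov_X @ w)

print("weights:", np.round(w, 2))
print("pattern:", np.round(pattern, 2))
```

In this simulation channel 1 receives a large weight because it helps cancel the shared noise, while its pattern value stays close to zero, which matches the fact that it carries no class information. Patterns computed per channel can then be plotted as a topography.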

HTH, cheers, Peer