How can I load the IBL LFP data? spikeglx does not seem to be available for Python

I am trying to do some spectral analysis on the IBL LFP and relate it to spiking and behavioural data, but I am having issues loading the LFP even after reading the online tutorials and documentation. I have two questions.

1) spikeglx is required to load the IBL LFP files and is supposedly a Python API, but I don't see it anywhere. pip install doesn't find it either, and your tutorials don't show how to get it.

It seems to me that I need a package that is not available online. Please note I'm not a member of the IBL. I have installed the ONE-api, brainbox and ibllib, but I cannot get spikeglx from anywhere, and it seems to be required to work with your LFP data. Is it possible to use npyx in some way instead?

I'm also curious where information such as the sampling rate and time offset is stored, so that I can align the LFP to the spiking and behavioural data later down the line. I've read the documentation and this is not really clear from it.

Should I just copy-paste the Python folder from the SpikeGLX_Datafile_Tools GitHub repo and use that? https://github.com/jenniferColonell/SpikeGLX_Datafile_Tools/blob/main/Python/DemoReadSGLXData/readSGLX.py

I saw there is this notebook as well: https://github.com/jenniferColonell/SpikeGLX_Datafile_Tools/blob/main/Python/read_SGLX_analog.ipynb
Both brainbox.io.spikeglx and ibllib.io.spikeglx exist, but it's not clear from the LFP tutorial or the documentation whether that is what import spikeglx is supposed to pick up.

In the Google Doc documentation on the Alyx files (https://docs.google.com/document/d/1OqIqqakPakHXRAwceYLwFY9gOrm8_P62XIfCTnHwstg/edit), on page 25, there is example code showing a way to read the LFP files into Python via from ibllib.io import spikeglx, but this returns an error for me:

from ibllib.io import spikeglx

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
/home/acampbell/Stienmetz2019Reanalyzed/ExtractingSWRs/ibl_swr_data/ibl_swr_detector_testing.ipynb Cell 5 line 1
----> 1 from ibllib.io import spikeglx

ImportError: cannot import name 'spikeglx' from 'ibllib.io' (/home/acampbell/miniconda3/envs/ONE_ibl_env/lib/python3.10/site-packages/ibllib/io/__init__.py)

Overall it is not at all clear from the tutorial which of these options we are meant to use. I could just try reading the file in myself, which brings me to the next question:

2) LFP preprocessing questions (bit-to-volts conversion, accessing sampling rates, reference electrode, etc.)

If I go the route of writing my own code to load the LFP files, I want to make sure I am doing the preprocessing correctly. I'd prefer to stick to IBL recommendations rather than jury-rigging my own solution, in case I miss something and corrupt the data somehow.

I was downloading the LFP by hand from the figshare repo for the first dataset the IBL released (from the paper "Distributed coding…", Steinmetz et al., 2019) and then running this:

import numpy as np

lf_path = r'/space/scratch/steinmetz2019data/LFP/Tatum_2017-12-06/Tatum_2017-12-06_K2_g0_t0.imec.lf.bin'
with open(lf_path, 'rb') as fid:
    # 385 columns: 384 LFP channels plus 1 sync channel, int16 samples
    probe_K0_2500hz = np.fromfile(fid, np.int16).reshape((-1, 385))

This gives me a numpy array, and then I have to make a 1d array of sampling times. I then multiply the values by 1.95 for the bit-to-volts conversion and subtract the median value of each channel. I'm not sure whether that's the proper preprocessing. Don't I need to use the reference channel as well somehow? I also need to align the 1d array of sampling times to the clock used by the spikes, trials, wheel (etc.) objects.
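Concretely, this is the kind of thing I have been doing; a rough sketch of my own, possibly wrong, approach (the 1.95 scale factor, the 2500 Hz rate, and the per-channel median subtraction are my guesses, not anything taken from the IBL docs):

# continues from the loading snippet above; probe_K0_2500hz is (n_samples, 385) int16
fs_lf = 2500.0                                        # LF sampling rate I am assuming, in Hz
times = np.arange(probe_K0_2500hz.shape[0]) / fs_lf   # 1d array of sample times in seconds
lfp = probe_K0_2500hz[:, :384].astype(np.float32) * 1.95   # my guessed bit-to-volts scale factor
lfp -= np.median(lfp, axis=0, keepdims=True)               # subtract each channel's median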

I'm at the stage in my analysis where I want to automate this process of pulling data, running it through an event detector, and converting the output into Alyx-formatted .npy files (with events and times or intervals stored as 1d vectors), which I can then analyze against trial and spike data.

Hello, regarding the first question:

Have you tried using the spikeglx.Reader module from the ibl-neuropixel repository, as explained in the documentation here?
https://int-brain-lab.github.io/iblenv/notebooks_external/loading_raw_ephys_data.html

Regarding the second question:

This is not data from the IBL; it was collected personally by Nick Steinmetz before the IBL was formed. I would advise contacting him directly with questions about his dataset (I believe you have his email).

Otherwise, have you tried processing IBL data and encountered issues? Is the roadblock mainly the spikeglx module (hopefully the first answer helps)?

What you need to do before using the LFP data is to apply destriping, as shown in the documentation here:
https://int-brain-lab.github.io/iblenv/notebooks_external/loading_raw_ephys_data.html#Example-1:-Destripe-AP-data
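For reference, the destriping call itself is roughly a one-liner on a chunk of raw data. A minimal sketch, assuming the destripe function is imported from the voltage module shipped with ibl-neuropixel (the import path has varied between releases, e.g. neurodsp.voltage or ibldsp.voltage, and newer releases also provide a destripe_lfp variant for the LF band; the file path is a placeholder):

import spikeglx
from neurodsp.voltage import destripe  # or: from ibldsp.voltage import destripe, depending on your install

sr = spikeglx.Reader('path/to/recording.imec.lf.bin')  # placeholder path
# a 10 s chunk, sync channel removed, transposed to (n_channels, n_samples), already in volts
raw = sr[0:int(10 * sr.fs), :-sr.nsync].T
destriped = destripe(raw, fs=sr.fs)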

Kind regards

Thank you very much for your reply.

from brainbox.io.spikeglx import Streamer works fine, and so does from ibllib.plots import Density. However, Streamer then asks me to manually enter my password for the IBL. I'm not a member of the IBL and I didn't think I needed a password to access the data. How can I automate this process if I am required to provide a password? Is there a way to code that step into my script?

I just tried to load a different modality with this tutorial (Loading Spike Waveforms — IBL Library documentation) and I am also asked to provide a password. I thought the IBL data was publicly available? Have I missed a setup step?

This is the code that froze at the line calling Streamer(). It is identical to the code from your tutorial, except that I pass 'lf' for the typ parameter instead of 'ap':

from one.api import ONE
from brainbox.io.spikeglx import Streamer

one = ONE()

pid = 'da8dfec1-d265-44e8-84ce-6ae9c109b8bd'
time0 = 100   # timepoint in recording to stream (s)
time_win = 1  # number of seconds to stream
band = 'lf'   # either 'ap' or 'lf'

sr = Streamer(pid=pid, one=one, remove_cached=False, typ=band)
s0 = time0 * sr.fs
tsel = slice(int(s0), int(s0) + int(time_win * sr.fs))

# Important: remove sync channel from raw data, and transpose
raw = sr[tsel, :-sr.nsync].T

Giving me this error:

---------------------------------------------------------------------------
HTTPError                                 Traceback (most recent call last)
Cell In[5], line 9
      6 time_win = 1 # number of seconds to stream
      7 band = 'lf' # either 'ap' or 'lf'
----> 9 sr = Streamer(pid=pid, one=one, remove_cached=False, typ=band)
     10 s0 = time0 * sr.fs
     11 tsel = slice(int(s0), int(s0) + int(time_win * sr.fs))

File ~/miniconda3/envs/ONE_ibl_env/lib/python3.10/site-packages/brainbox/io/spikeglx.py:127, in Streamer.__init__(self, pid, one, typ, cache_folder, remove_cached)
    125 self.cache_folder = cache_folder or Path(self.one.alyx._par.CACHE_DIR).joinpath('cache', typ)
    126 self.remove_cached = remove_cached
--> 127 self.eid, self.pname = self.one.pid2eid(pid)
    128 self.file_chunks = self.one.load_dataset(self.eid, f'*.{typ}.ch', collection=f"*{self.pname}")
    129 meta_file = self.one.load_dataset(self.eid, f'*.{typ}.meta', collection=f"*{self.pname}")

File ~/miniconda3/envs/ONE_ibl_env/lib/python3.10/site-packages/one/util.py:161, in refresh.<locals>.wrapper(self, *args, **kwargs)
    159     mode = self.mode
    160 self.refresh_cache(mode=mode)
--> 161 return method(self, *args, **kwargs)

File ~/miniconda3/envs/ONE_ibl_env/lib/python3.10/site-packages/one/api.py:1869, in OneAlyx.pid2eid(self, pid, query_type)
   1867 if query_type == 'local' and 'insertions' not in self._cache.keys():
   1868     raise NotImplementedError('Converting probe IDs required remote connection')
-> 1869 rec = self.alyx.rest('insertions', 'read', id=str(pid))
...
--> 675     raise requests.HTTPError(rep.status_code, rep.url, message, response=rep)
    676 else:
    677     rep.raise_for_status()

HTTPError: [Errno 400] https://openalyx.internationalbrainlab.org/auth-token: 'Alyx authentication failed with credentials: user = intbrainlab, password = None'

Please help.

Hello,

Have you tried setting up ONE as per the instructions here:

https://int-brain-lab.github.io/iblenv/notebooks_external/one_quickstart.html

from one.api import ONE
# one-time setup pointing ONE at the public OpenAlyx server
ONE.setup(base_url='https://openalyx.internationalbrainlab.org', silent=True)
# the public credentials use the password 'international'
one = ONE(password='international')

To expand a little on the answers above.

spikeglx is a module included in the ibl-neuropixel package. You can install it with pip like so: pip install ibl-neuropixel. It should already have been installed as a requirement of ibllib.
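To make the naming explicit (the pip package and the importable module have different names), a quick sanity check you can run after installation; nothing here beyond the standard import machinery:

# after `pip install ibl-neuropixel`, the importable module is simply called spikeglx
import spikeglx
print(spikeglx.__file__)  # should resolve to your site-packages, not a locally copied file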

You can use the spikeglx.Reader in this module to read any LF or AP data collected with SpikeGLX. For example, you could read in the Steinmetz dataset you reference in the following way:

import spikeglx
from pathlib import Path
file_path = Path(r'/space/scratch/steinmetz2019data/LFP/Tatum_2017-12-06/Tatum_2017-12-06_K2_g0_t0.imec.lf.bin')
sr = spikeglx.Reader(file_path)

# access metadata
meta = sr.meta

# access the first 10s of data on all non-sync channels
data = sr[0:int(10 * sr.fs), :-sr.nsync]

Note that the data returned here has already been converted to volts, so you don’t need to do this step.

The brainbox.io.spikeglx module provides the Streamer class, which allows you to stream portions of the raw data. This is useful when you don't want to download the whole file but just want to access some snippets throughout the recording. This module can only be used with IBL data; you will not be able to use it with the Steinmetz data. As Gaelle wrote above, the very first time you want to access IBL data you will need to configure and set up ONE with the code she sent above.

The import from ibllib.io import spikeglx is deprecated and is now replaced by import spikeglx. Apologies for that, I have updated the documentation to correct this.

To convert between raw data times and event times (trials, wheel events) you will need to use the spikes.times.npy and spikes.samples.npy datasets. The spikes.times dataset gives the timing of the spikes aligned to behavioural events. The spikes.samples dataset gives, for each spike, the corresponding sample index in the raw data files. In this way you can link the raw data to the trial events.
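As a rough sketch of how you could use those two datasets together (the collection name below is only an example and may differ for your session and spike sorter):

import numpy as np

# example collection name only; adjust to your session / sorter
spikes = one.load_object(eid, 'spikes', collection=f'alf/{pname}/pykilosort')

def sample2time(samples):
    # raw AP sample index -> behaviour-aligned time (s), by interpolation
    return np.interp(samples, spikes['samples'], spikes['times'])

def time2sample(times):
    # behaviour-aligned time (s) -> raw AP sample index
    return np.interp(times, spikes['times'], spikes['samples'])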

Yes, I did those; I had already set that as my password. It seems like I need to run one = ONE(password='international') every time I load the module. I had done the setup before and assumed that was my default password.

import spikeglx works now as well, after running pip install ibl-neuropixel. Thank you.

Excellent, thanks for letting me know. Can I align the behavioural measures and task variables to the LFP time in a similar manner to the spikes objects? For instance, is there a wheelMoves.samples dataset as well?

I think this is all I need to proceed. Thanks to both of you.

Hello,

We do not have the wheel in samples, so the best practice for now is to take the .times values and convert them to .samples using the sampling rate (from the sr object).
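For example, a rough sketch of that conversion, assuming the wheel object is loaded from the standard alf collection (this ignores any residual drift between the ephys and behaviour clocks):

import numpy as np

wheel = one.load_object(eid, 'wheel', collection='alf')
# behaviour-aligned wheel timestamps (s) -> approximate raw-data sample indices
wheel_samples = np.round(wheel['timestamps'] * sr.fs).astype(np.int64)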

I have asked internally if we have a method that is more precise, and will update here once we have an answer.

Kind regards

Edit: @owinter is working on it, and hopefully a transfer function will be available in a few days.
Please note that we have not vouched for the quality of the LFP and have released the data blindly. We are in the process of vetting this data, but it will take time; we hope to be done in 2024.

I can also try interpolating a times array from the .samples; that should work for now.

Hello,
There is a dataset with each recording that describes the relationship between ephys time and task time.

You can list this dataset as follows:

dsets = one.list_datasets(eid, collection=f'raw_ephys_data/{pname}', filename='*sync.npy')

The samples refer to AP samples, but the LF band is always sampled at 1/12 of the AP rate.
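As a small sketch of moving between the two bands when comparing against spikes.samples:

lf_sample = 250_000            # example LF-band sample index
ap_sample = lf_sample * 12     # corresponding AP-band sample index (fs_ap = 12 * fs_lf)
back_to_lf = ap_sample // 12   # and back again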

Hope this helps.
