SpikeSortingLoader.load_spike_sorting() gets stuck repeatedly

Hello!!

I’m trying to load spikes with SpikeSortingLoader.load_spike_sorting() for many sessions in a for loop, but it gets stuck downloading some object (e.g. spikes.depths.npy). It’s not the same object that gets stuck every time.

Is there something about my setup or code that could be causing this issue? I’m running the Jupyter notebook extension in Visual Studio Code, over SSH to a remote server.

import csv

import numpy as np
import pandas as pd

from one.api import ONE
from brainbox.io.one import SpikeSortingLoader
from iblatlas.atlas import AllenAtlas

one = ONE(base_url='https://openalyx.internationalbrainlab.org', silent=True,
          password='international', cache_dir='/space/scratch/IBL_data_cache_katie')
ba = AllenAtlas()


def sessions_to_csv(session_list, filename):
    """
    Write session id, probe id and CA1 cluster counts to a csv.

    :param session_list: list of session IDs
    :param filename: string name of csv to write to
    """
    df = pd.read_csv(filename)  # probes already recorded in the csv

    for session_id in session_list:
        # get_insertion_indices and insertions are defined elsewhere in my notebook
        probe_ids = get_insertion_indices(insertions, eidlist=session_id, datafield='id')

        for pid in probe_ids:
            if pid in df.probe_id.values:  # probe already recorded, move on to the next one
                continue

            sl = SpikeSortingLoader(pid=pid, one=one, atlas=ba)
            spikes, clusters, channels = sl.load_spike_sorting()
            clusters_labeled = SpikeSortingLoader.merge_clusters(spikes, clusters, channels)

            if clusters_labeled is None:
                print(f'EMPTY: session {session_id}, probe {pid} is none :(')
                continue

            # array of CA1 cluster scores; -1 if not a CA1 cluster
            ca1scores = np.where(clusters_labeled['acronym'] == 'CA1', clusters_labeled['label'], -1)
            count_ca = np.sum(clusters_labeled['acronym'] == 'CA1')  # number of CA1 units

            if count_ca > 0:
                n_scored = np.sum(ca1scores >= 0)  # CA1 units with a valid (>= 0) score
                count1 = np.sum(ca1scores == 1)  # number of CA1 units that score 1
                avgscore = ca1scores[ca1scores >= 0].sum() / n_scored  # average score over all CA1 units
                percent1 = count1 / n_scored  # fraction of CA1 units that score 1
            else:
                count1 = 0
                avgscore = 0
                percent1 = 0

            with open(filename, 'a', newline='') as csvfile:
                sesswriter = csv.writer(csvfile, delimiter=',', quotechar='|', quoting=csv.QUOTE_MINIMAL)
                sesswriter.writerow([session_id, pid, count_ca, count1, percent1, avgscore])


sessions_to_csv(ibl_sessions_with_ca1, 'sessions_with_CA1.csv')

Thanks.

Hello,

This happened to one of our researchers recently as well.

It could be that the VS Code Jupyter plugin doesn’t interact well with the threading used for downloads. We will have to investigate a bit more to understand what is really going on.

In the short term, she solved it by separating the data-loading code (such as the csv-writing function above) from the analysis code, which is good practice anyway.

Then she ran the loading function in the terminal:

  • open a terminal
  • activate your python environment
  • python make_csv.py
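For concreteness, here is a minimal sketch of what such a make_csv.py could look like. The load_one_probe callable is a hypothetical placeholder for your own SpikeSortingLoader / merge_clusters code (it is not an ibllib function); the point is only to show the loading loop isolated in a plain script:

```python
# make_csv.py -- sketch of the suggested split: run the (threaded) downloads
# from a plain terminal session, then analyse the cached files afterwards.
# `load_one_probe` is a hypothetical callable standing in for your
# SpikeSortingLoader / merge_clusters code; it returns a tuple of summary
# values for the probe, or None when a probe has no labelled clusters.
import csv


def write_probe_rows(probe_ids, filename, load_one_probe):
    """Append one summary row per probe to the csv, skipping empty probes."""
    with open(filename, 'a', newline='') as csvfile:
        writer = csv.writer(csvfile)
        for pid in probe_ids:
            row = load_one_probe(pid)  # this call triggers the actual download
            if row is not None:
                writer.writerow([pid, *row])


if __name__ == '__main__':
    # In the real script this would build the ONE client, the atlas and the
    # probe id list, and load_one_probe would wrap load_spike_sorting().
    pass
```

Keeping the download loop behind a plain function like this also makes it easy to re-run just the loading step from the terminal without touching the analysis code.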

If you’d rather have an interactive console, you could also do:

  • ipython
    then copy/paste the code into the interactive Python console

In the meantime I have created an issue here: GitHub - int-brain-lab/ibllib: IBL core shared libraries

Cheers,
Olivier

This worked for me, thank you very much!!