NiMARE Neurosynth complex features meta-analysis

Alexandre · April 30, 2021, 1:08pm

Hi all,

I think that my question is quite similar to this one; please let me know if my post should be moved.
I would like to perform a meta-analysis based on complex feature selection. A quick tutorial is presented here. However, as Neurosynth is now deprecated, I have installed NiMARE instead.
So far, I haven’t found any tutorial dealing with complex feature selection (for example, using AND, OR, NOT, etc.).

Is there a way to perform such a meta-analysis with NiMARE?

Best,

Alex

tsalo · April 30, 2021, 2:37pm

Unfortunately, Neurosynth’s search feature was somewhat buggy, and we decided not to incorporate it into NiMARE. While we do plan to support more advanced searching in Neurostore (which was originally tentatively named NeuroStuff), that new service won’t be up and running for a while. If you want to know more about our plans, you can see NiMARE’s About page, which briefly describes a proposed ecosystem.

At the moment, I think your best approach would be to write some custom code combining lists from individual feature searches. I know it’s suboptimal, but on the bright side it would be much more transparent than a complex search function.

For example, if you have a list of IDs from two searches (e.g., fear_ids and happiness_ids), the following should work:

NOT fear: sorted(set(dset.ids).difference(fear_ids))
fear AND happiness: sorted(set(fear_ids).intersection(happiness_ids))
fear OR happiness: sorted(set(fear_ids).union(happiness_ids))
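
As a minimal, self-contained sketch of those three operations, the snippet below uses made-up study IDs in place of a real Dataset. The names all_ids, fear_ids, and happiness_ids, and the three helper functions, are hypothetical: all_ids stands in for dset.ids, and the other two for the results of individual feature searches.

```python
# Hypothetical study IDs standing in for dset.ids and two feature searches.
all_ids = ["s1", "s2", "s3", "s4", "s5"]
fear_ids = ["s1", "s2", "s3"]
happiness_ids = ["s3", "s4"]


def ids_not(universe, ids):
    """NOT: IDs in the full dataset that are absent from `ids`."""
    return sorted(set(universe).difference(ids))


def ids_and(ids_a, ids_b):
    """AND: IDs present in both lists."""
    return sorted(set(ids_a).intersection(ids_b))


def ids_or(ids_a, ids_b):
    """OR: IDs present in either list."""
    return sorted(set(ids_a).union(ids_b))


not_fear = ids_not(all_ids, fear_ids)                # ['s4', 's5']
fear_and_happiness = ids_and(fear_ids, happiness_ids)  # ['s3']
fear_or_happiness = ids_or(fear_ids, happiness_ids)    # ['s1', 's2', 's3', 's4']

# The helpers compose, e.g. "fear AND NOT happiness":
fear_not_happiness = ids_and(fear_ids, ids_not(all_ids, happiness_ids))  # ['s1', 's2']
```

Because each helper returns a plain sorted list, every intermediate result can be printed and checked, which is where the transparency mentioned above comes from.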

I hope that helps.

Best,
Taylor

Alexandre · April 30, 2021, 2:49pm

Oh, I see…
Ok, I’ll try your solution, thanks!
Another question: does get_studies_by_label match exact terms?
For example, if I call neurosynth_dataset.get_studies_by_label("Neurosynth_TFIDF__tom"), will it return only the IDs annotated with the exact "tom" feature, or also IDs whose features contain the "tom" character string?

tsalo · April 30, 2021, 3:29pm

Yes, it does return exact terms. The method’s documentation could use some improvement.

If you want to search using the equivalent of a wildcard (e.g., Neurosynth_TFIDF__tom*), I think you’ll need to do something like:

tom_labels = [label for label in dset.get_labels() if label.startswith("Neurosynth_TFIDF__tom")]
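
The same idea can be sketched with a plain list of labels, so that it runs without a dataset. The labels below are made up for illustration (in practice they would come from dset.get_labels()), and match_labels is a hypothetical helper, not part of NiMARE. It relies on fnmatchcase from the Python standard library so that shell-style wildcards such as * can be used directly.

```python
from fnmatch import fnmatchcase

# Made-up labels for illustration; real ones would come from dset.get_labels().
labels = [
    "Neurosynth_TFIDF__tom",
    "Neurosynth_TFIDF__tomography",
    "Neurosynth_TFIDF__mind tom",
    "Neurosynth_TFIDF__fear",
]


def match_labels(all_labels, pattern):
    """Return the labels matching a case-sensitive shell-style pattern."""
    return [label for label in all_labels if fnmatchcase(label, pattern)]


# Labels that start with "tom" after the prefix.
prefix_matches = match_labels(labels, "Neurosynth_TFIDF__tom*")

# Labels that contain "tom" anywhere after the prefix.
substring_matches = match_labels(labels, "Neurosynth_TFIDF__*tom*")
```

Each matching label can then be searched individually, and the resulting ID lists combined with a union.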

Alexandre · May 1, 2021, 9:11am

I’ll try, thanks!

Alex

Alexandre · May 1, 2021, 4:06pm

Hi,

It seems to be working.
However, I’ve run into another problem with features that include a space in their names.
Specifically, mind_tom_ids=dset.get_studies_by_label("Neurosynth_TFIDF__mind_tom") returns all the IDs.
Is there a specific way to deal with features that include spaces?

Best,

Alex

tsalo · May 3, 2021, 2:58pm

That’s odd… it sounds like a bug. The only time when get_studies_by_label should return all studies is when the label argument is None (which, now that I think about it, isn’t a very well-documented behavior). I will try to reproduce this locally, but in the meantime can you share the full script you used, along with the NiMARE version?

These features are still strings, so you should be able to use them as they appear in the dset.annotations DataFrame (e.g., "Neurosynth_TFIDF__label with spaces").

One other minor thing that came up in another recent Neurostars post is that Neurosynth TF-IDF features should be used with a label_threshold of 0.001, instead of NiMARE’s default of 0.5. A threshold of 0.001 basically translates to “did this term appear at least once in the study’s abstract”.
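
To make the effect of the threshold concrete, here is a toy sketch with invented TF-IDF weights for a single feature. The weights, the study IDs, and the studies_above helper are all hypothetical, and the >= comparison is an assumption about how the cutoff is applied; this is not NiMARE's implementation.

```python
# Invented TF-IDF weights for one feature, keyed by hypothetical study ID.
tfidf_weights = {"s1": 0.0, "s2": 0.004, "s3": 0.08, "s4": 0.21, "s5": 0.0}


def studies_above(weights, threshold):
    """Return the IDs whose weight is at or above the threshold."""
    return sorted(sid for sid, weight in weights.items() if weight >= threshold)


# A 0.5 cutoff keeps nothing: no abstract uses the term that heavily.
at_high_threshold = studies_above(tfidf_weights, 0.5)

# A 0.001 cutoff keeps every study in which the term appears at all.
at_low_threshold = studies_above(tfidf_weights, 0.001)
```

With weights like these, the high cutoff selects no studies, while the low one selects every study with a nonzero weight.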

Alexandre · May 3, 2021, 4:03pm

The problem is certainly on my end; I’m not very familiar with Python.
I reinstalled NiMARE (nimare.__version__ returns 0.0.7).

I then ran the following commands:
import nimare
dset_file = "neurosynth_dataset.pkl.gz"
dset = nimare.dataset.Dataset.load(dset_file)
mind_tom_ids = dset.get_studies_by_label("Neurosynth_TFIDF__mind tom")
Strangely, this time len(mind_tom_ids) returns 0.

mind_tom_05_ids = dset.get_studies_by_label("Neurosynth_TFIDF__mind tom", label_threshold=0.05)
len(mind_tom_05_ids) returns 73

mind_tom_001_ids = dset.get_studies_by_label("Neurosynth_TFIDF__mind tom", label_threshold=0.001)
len(mind_tom_001_ids) returns 74

I guess I did something wrong in my previous attempts, but I can’t figure out where…

tsalo · May 3, 2021, 4:13pm

It does make a lot of sense that the default threshold (0.5) would return zero studies. The authors would have to use “mind tom” many times in their abstract for it to have a TF-IDF value of >= 0.5.

These numbers make sense to me. I’m glad the results are looking better!

Alexandre · May 3, 2021, 5:08pm

Thank you for your help & tips!!

Best,
Alex