I would like to know what is the best way to complement a Neurosynth style term-based meta-analysis with individual studies.
After running nimare.io.convert_neurosynth_to_dataset("database.txt", "features.txt") we get a NiMARE dataset.Dataset object, but it is unclear how one would add additional entries, as there does not appear to be any “add()” method for this datatype.
There’s no way to add individual studies to a Dataset, but you can merge two Datasets with the new Dataset.merge() method. I haven’t made a release since adding that method, though, so if you want to use it you’ll need to install the main version from GitHub.
I think your best move would be to either (1) directly add the studies to the Neurosynth files (which might muddy the waters when it comes to describing your Dataset, unfortunately) or (2) create a Dataset for your added studies and use Dataset.merge.
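For option (2), here is a minimal sketch of what that could look like. It assumes the nested dictionary layout (study ID, then "contrasts", then "coords" and "metadata") that nimare.dataset.Dataset accepts as input. The helper names build_source_dict and merge_with_neurosynth, the study ID, and the coordinates are all made up for illustration, so check the layout against the NiMARE version you have installed.

```python
def build_source_dict(studies):
    """Convert simple study records into a NiMARE-style nested dictionary.

    The layout (study ID -> "contrasts" -> contrast ID -> "coords"/"metadata")
    is an assumption about what nimare.dataset.Dataset accepts as a dict.
    """
    source = {}
    for study in studies:
        # Split the list of (x, y, z) foci into three parallel lists.
        xs, ys, zs = zip(*study["foci"])
        source[study["id"]] = {
            "contrasts": {
                "1": {
                    "coords": {
                        "space": study["space"],
                        "x": list(xs),
                        "y": list(ys),
                        "z": list(zs),
                    },
                    "metadata": {"title": study["title"]},
                }
            }
        }
    return source


def merge_with_neurosynth(neurosynth_dset, source):
    """Build a Dataset from the added studies and merge it (needs NiMARE)."""
    # Imported here so the helper above works without NiMARE installed.
    from nimare.dataset import Dataset

    added_dset = Dataset(source)
    return neurosynth_dset.merge(added_dset)


# Hypothetical added study with two foci in MNI space.
my_studies = [
    {
        "id": "mystudy01",
        "space": "MNI",
        "title": "Hypothetical added study",
        "foci": [(-42, 20, 8), (38, -60, 44)],
    }
]
source = build_source_dict(my_studies)
```

You would then call merge_with_neurosynth(neurosynth_dset, source), where neurosynth_dset is the Dataset returned by convert_neurosynth_to_dataset.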
Use the merge function to combine those two datasets, right?
Once I merge these, ideally all my studies would be in the same “annotation space”, right? Am I correct to think that ranking my manually added studies according to the “Neurosynth annotation space” would be a hassle? Should I instead run one of NiMARE's annotation functions on the final merged dataset?
All of the annotations should be in the same Dataset.annotations table, and as long as you use the same column/feature names the merge should work fine. However, Neurosynth label weights will probably not follow the same scale. For example, the standard TFIDF weights are scaled not only by the term counts in each study’s abstract, but also by the counts for those terms across the whole Neurosynth database. Unless you plan to apply a threshold that would be fairly consistent across the two Datasets (e.g., the threshold of 0.001 corresponds to the term appearing at least once in the abstract), I wouldn’t consider the weights directly comparable.
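To make the scaling issue concrete, here is a toy illustration in plain Python. It uses a textbook TF-IDF formula (relative term frequency times log(N / document frequency)), which is an assumption and not necessarily the exact variant Neurosynth uses. The abstracts are invented. The point is only that the same abstract gets different weights depending on the corpus it is scored against.

```python
import math
from collections import Counter


def tfidf(doc_tokens, corpus):
    """Textbook TF-IDF: (count / doc length) * log(N / document frequency)."""
    counts = Counter(doc_tokens)
    n_docs = len(corpus)
    weights = {}
    for term, count in counts.items():
        # Number of documents in the corpus that contain the term.
        doc_freq = sum(1 for doc in corpus if term in doc)
        weights[term] = (count / len(doc_tokens)) * math.log(n_docs / doc_freq)
    return weights


# One invented abstract, scored against two different invented corpora.
abstract = ["pain", "insula", "pain", "fmri"]
large_corpus = [abstract, ["memory", "fmri"], ["reward", "fmri"], ["language", "fmri"]]
small_corpus = [abstract, ["pain", "fmri"]]

w_large = tfidf(abstract, large_corpus)
w_small = tfidf(abstract, small_corpus)

# Same abstract, same term, different weight because the corpus changed.
print(w_large["pain"], w_small["pain"])
```

In the larger corpus "pain" is rare and gets a high weight, while in the smaller one it appears in every document and its weight drops to zero, even though the abstract itself is unchanged.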
Yeah, trying to scale your weights to match the scaling of Neurosynth’s term TFIDF weights or its topic weights would be very difficult, if not impossible.
You might want to (1) download the abstracts for the Neurosynth corpus (NiMARE has a function for it), (2) generate term count annotations, (3) merge with your Dataset, and then (4) apply the TFIDF transform to the combined counts, if you want to directly compare TFIDF weights across the two Datasets.
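Steps (2) through (4) can be sketched roughly like this. It is a plain-Python toy, not NiMARE's API: the function names term_counts and tfidf_from_counts, the study IDs, and the abstracts are all invented, and the TF-IDF formula is the textbook one rather than a confirmed match for Neurosynth's. What it shows is the ordering: count first, merge the counts, and only then apply the transform, so that document frequencies come from the combined corpus.

```python
import math
from collections import Counter


def term_counts(abstracts):
    """Map each study ID to raw term counts from its abstract."""
    return {sid: Counter(text.lower().split()) for sid, text in abstracts.items()}


def tfidf_from_counts(counts):
    """Apply textbook TF-IDF to a {study ID: term counts} table."""
    n_docs = len(counts)
    # Document frequency: number of studies in which each term appears.
    doc_freq = Counter()
    for study_counts in counts.values():
        doc_freq.update(study_counts.keys())
    return {
        sid: {
            term: (n / sum(study_counts.values())) * math.log(n_docs / doc_freq[term])
            for term, n in study_counts.items()
        }
        for sid, study_counts in counts.items()
    }


# Invented stand-ins for the Neurosynth abstracts and the added studies.
neurosynth_abstracts = {"ns-1": "pain insula fmri", "ns-2": "memory hippocampus fmri"}
added_abstracts = {"new-1": "pain pain thalamus"}

# Merge the raw counts first, then transform the combined table.
combined_counts = {**term_counts(neurosynth_abstracts), **term_counts(added_abstracts)}
weights = tfidf_from_counts(combined_counts)
```

Because every study is weighted against the same combined corpus, the resulting values are on a single scale and can be compared or thresholded together.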