Clarifying the Neurovault "is_valid" attribute and intention of "perc_bad_voxels"

I recently crawled all the metadata from Neurovault and had a few questions about the metadata associated with each of the images. First of all, some summary statistics from my crawl can be found in this notebook.

  • is_valid : This attribute is present for all scans as far as I can tell, but it is False for more than half (45,000) of them. Looking at the Neurovault/Neurovault repository on GitHub, the closest thing I could glean is that this parameter is set to True for all forms that pass validation. Does this mean that I shouldn’t use any maps that are marked invalid? Or were those maps simply added before this field existed?

  • perc_bad_voxels : This parameter seems to count the “zero” intensity or NaN voxels in the image relative to the total number of voxels in the image. To my understanding, this value will always be high, because everything outside of the brain should be zero… Is there a more intuitive interpretation of this field than the above?

Thanks!

is_valid is an internal parameter - you can ignore it. The maps where it is False are good (although they might be missing some metadata we added after the launch).
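In practice that suggests keeping every record regardless of is_valid and checking for missing metadata instead. A minimal sketch, assuming the crawl is a list of dicts: the records below are made up, and the fields checked (map_type, modality) are only illustrative choices, not a list of the fields that were added after launch.

```python
# Hypothetical records standing in for crawled image metadata.
records = [
    {"id": 1, "is_valid": True, "map_type": "T map", "modality": "fMRI-BOLD"},
    {"id": 2, "is_valid": False, "map_type": "Z map", "modality": None},
    {"id": 3, "is_valid": False, "map_type": None},
]

# Illustrative fields to check; swap in whatever your analysis needs.
FIELDS_OF_INTEREST = ["map_type", "modality"]


def missing_fields(record, fields=FIELDS_OF_INTEREST):
    """Return the fields that are absent or None in a metadata record."""
    return [name for name in fields if record.get(name) is None]


# Do not filter on is_valid; keep everything and report the gaps.
kept = list(records)
gaps = {record["id"]: missing_fields(record) for record in kept}

for image_id, fields in gaps.items():
    print(image_id, "missing:", fields or "nothing")
```

Which fields count as required depends on the analysis, so the list is left as a parameter.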

Your understanding of perc_bad_voxels is correct - that’s why the heuristic that uses this variable to detect thresholded maps uses a cutoff of 85% NaN/zero voxels.
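For illustration, the idea can be sketched with NumPy on a plain voxel array. This is not the Neurovault implementation: the function names are hypothetical, and whether the real check uses a strict or non-strict comparison at 85% is an assumption here.

```python
import numpy as np

THRESHOLDED_CUTOFF = 85.0  # percent of NaN/zero voxels, as mentioned above


def perc_bad_voxels(data):
    """Percentage of voxels that are exactly zero or NaN."""
    data = np.asarray(data, dtype=float)
    if data.size == 0:
        raise ValueError("empty image")
    bad = np.isnan(data) | (data == 0)
    return 100.0 * bad.sum() / data.size


def looks_thresholded(data, cutoff=THRESHOLDED_CUTOFF):
    """Assumed rule: flag the map when the bad-voxel percentage exceeds the cutoff."""
    return perc_bad_voxels(data) > cutoff


# Toy 10x10x10 volumes: a "brain" block surrounded by zero background.
unthresholded = np.zeros((10, 10, 10))
unthresholded[2:8, 2:8, 2:8] = 1.5  # 216 of 1000 voxels carry signal

thresholded = np.zeros((10, 10, 10))
thresholded[4:6, 4:6, 4:6] = 3.0    # only 8 voxels survive thresholding
thresholded[0, 0, 0] = np.nan       # NaN background counts as bad too

print(perc_bad_voxels(unthresholded), looks_thresholded(unthresholded))
print(perc_bad_voxels(thresholded), looks_thresholded(thresholded))
```

The unthresholded toy volume is still mostly zeros because of the background, which is why the cutoff has to sit as high as it does.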
