Hi @orko,
Regarding your second question: “I still not quite understand what is the meaning of the mask?”
To convert your series of 3D volumes into a single 2D data array, you always need a brain mask that tells which voxels are brain voxels and which are just empty space in your 3D images. With NiftiMasker you have two options: either you provide a mask yourself, or you set mask_img=None. In the latter case, NiftiMasker computes a brain mask according to mask_strategy (have a look at this keyword argument for more information). In short, there are three functions that can compute the brain mask:
masking.compute_background_mask, masking.compute_epi_mask, masking.compute_gray_matter_mask
While the first two strategies are data-driven, i.e. they estimate a brain mask from your images, the last one uses a predefined MNI-152 whole-brain mask to mask your data (please have a look at this post; the function name is a little misleading).
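To make the role of the mask more concrete, here is a minimal NumPy-only sketch of the idea. It is not nilearn's actual implementation; the arrays `volumes` and `brain_mask` are made-up toy data, and the only assumption is that masking boils down to boolean indexing with a 3D mask:

```python
import numpy as np

# Toy 4D "fMRI" series: a 4 x 4 x 3 volume acquired at 5 time points.
rng = np.random.default_rng(0)
volumes = rng.normal(size=(4, 4, 3, 5))

# Toy brain mask: True marks brain voxels, False marks empty space.
brain_mask = np.zeros((4, 4, 3), dtype=bool)
brain_mask[1:3, 1:3, :] = True  # 2 * 2 * 3 = 12 "brain" voxels

# Boolean indexing keeps only the masked voxels: shape (n_voxels, n_timepoints).
# Transposing gives the usual (n_timepoints, n_voxels) layout.
data_2d = volumes[brain_mask].T

print(data_2d.shape)  # (5, 12)
```

Each row of `data_2d` is one time point and each column is one voxel inside the mask; everything outside the mask is simply dropped.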
how can I “translate” a column to a specific brain location of this voxel?
If I understand you correctly, you are interested in extracting regions of interest, i.e. grouping voxels that belong to certain brain regions based on an atlas. In this case, nilearn has a dedicated class for this called NiftiLabelsMasker (https://nilearn.github.io/modules/generated/nilearn.input_data.NiftiLabelsMasker.html).
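If you want to go from a single column back to a voxel location yourself, the following NumPy-only sketch shows the principle. It assumes that the columns follow the order in which NumPy boolean indexing returns the masked voxels (please verify this for your nilearn version); `brain_mask`, `affine` and `atlas` are hypothetical toy arrays, not real data:

```python
import numpy as np

# Toy mask: True marks brain voxels (hypothetical 4 x 4 x 3 volume).
brain_mask = np.zeros((4, 4, 3), dtype=bool)
brain_mask[1:3, 1:3, :] = True

# Row n holds the (i, j, k) voxel index behind column n of the 2D array.
voxel_indices = np.argwhere(brain_mask)

# Hypothetical affine: 2 mm isotropic voxels with a shifted origin.
affine = np.array([[2.0, 0.0, 0.0, -4.0],
                   [0.0, 2.0, 0.0, -4.0],
                   [0.0, 0.0, 2.0, -2.0],
                   [0.0, 0.0, 0.0, 1.0]])

# Voxel index and world-space (mm) coordinate of one column.
column = 7
i, j, k = voxel_indices[column]
world_xyz = (affine @ np.array([i, j, k, 1.0]))[:3]

# Hypothetical atlas: integer labels per voxel, 0 means "no region".
atlas = np.zeros((4, 4, 3), dtype=int)
atlas[1, 1:3, :] = 1
atlas[2, 1:3, :] = 2

# Region label of every column, in the same order as the columns.
column_labels = atlas[brain_mask]
print((i, j, k), world_xyz, column_labels[column])
```

With real images you would take the mask array and the affine from the mask image itself, and the atlas would have to be resampled to the same grid as the mask.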