Normalization and group-averaging of count connectivity matrices from QSIRecon (mrtrix_multishell_msmt_ACT-fast)

JohannesWiesner · October 27, 2025, 10:55am

Summary of what happened:

I used qsirecon 1.1.2.dev0+gfa8673957.d20250819 and the mrtrix_multishell_msmt_ACT-fast workflow to compute adjacency matrices for my subjects. Everything worked fine, but now I need to compute a group average adjacency matrix. For this it is necessary to normalize the matrices in such a way that I get rid of absolute differences between subjects (e.g. in term of total fiber counts, etc).

For the connectivity measure, I decided to stick with sift_radius2_count_connectivity for now, as both the radius and sift corrections seem reasonable to me (However, I am not bound by this metric, I could also choose another one if this helps with my issue)

For now I only set the diagonal to 0 (to exclude self-connections, which is needed for the downstream analysis). Then I plotted the matrices using nilearn.plotting.plot_matrix.

Problem: Currently it seems that the matrices are heavily biased towards within-subcortical and/or within-cerebellar connectivity.

I wonder:

1.) If and how I could process the matrices in such a way that they are more “balanced” (within subject)
2.) How should I then should compute the group-average matrix (might or might not be related to 1., so can be ignored if 2. does not depend on 1.)

This might be related to this thread?

Command used (and if a helper script was used, a link to the helper script or the command generated):

I only set --recon-spec mrtrix_multishell_msmt_ACT-fast (otherwise just defaults)

Version:

qsirecon 1.1.2.dev0+gfa8673957.d20250819

Screenshots / relevant information:

This is how the connectivity matrix for one subject currently looks like. Note:

It does not matter which connectivity measure I take (the matrix will more or less look the same). So I get a similarily looking matrix with:

sift_radius2_count_connectivity
radius2_count_connectivity
sift_invnodevol_radius2_count_connectivity

However, the matrix for radius2_meanlength_connectivity looks much “better”:

Steven · October 27, 2025, 12:42pm

Hi @JohannesWiesner,

Mu should now be output by current versions of qsirecon for you to use. You can simply multiply your subject matrix by this coefficient after the fact. Also, HSVS recon specs are favored over FAST.

Best,
Steven

JohannesWiesner · October 27, 2025, 2:47pm

Mu should now be output by current versions of qsirecon for you to use. You can simply multiply your subject matrix by this coefficient after the fact.

So, just to get this right, if this is my folder structure:

qsirecon-MRtrix3_act-FAST/
├── dataset_description.json
├── logs
│   ├── CITATION.bib
│   ├── CITATION.html
│   ├── CITATION.md
│   └── CITATION.tex
├── sub-SNIP6IECX01
│   ├── dwi
│   │   ├── sub-SNIP6IECX01_dir-PA_space-ACPC_connectivity.mat
│   │   ├── sub-SNIP6IECX01_dir-PA_space-ACPC_exemplarbundles.zip
│   │   ├── sub-SNIP6IECX01_dir-PA_space-ACPC_model-ifod2_streamlines.tck.gz
│   │   ├── sub-SNIP6IECX01_dir-PA_space-ACPC_model-msmtcsd_param-fod_label-CSF_dwimap.mif.gz
│   │   ├── sub-SNIP6IECX01_dir-PA_space-ACPC_model-msmtcsd_param-fod_label-CSF_dwimap.txt
│   │   ├── sub-SNIP6IECX01_dir-PA_space-ACPC_model-msmtcsd_param-fod_label-GM_dwimap.mif.gz
│   │   ├── sub-SNIP6IECX01_dir-PA_space-ACPC_model-msmtcsd_param-fod_label-GM_dwimap.txt
│   │   ├── sub-SNIP6IECX01_dir-PA_space-ACPC_model-msmtcsd_param-fod_label-WM_dwimap.mif.gz
│   │   ├── sub-SNIP6IECX01_dir-PA_space-ACPC_model-msmtcsd_param-fod_label-WM_dwimap.txt
│   │   ├── sub-SNIP6IECX01_dir-PA_space-ACPC_model-mtnorm_param-inliermask_dwimap.nii.gz
│   │   ├── sub-SNIP6IECX01_dir-PA_space-ACPC_model-mtnorm_param-norm_dwimap.nii.gz
│   │   ├── sub-SNIP6IECX01_dir-PA_space-ACPC_model-sift2_mu.txt
│   │   └── sub-SNIP6IECX01_dir-PA_space-ACPC_model-sift2_streamlineweights.csv
│   └── figures
│       ├── sub-SNIP6IECX01_desc-about_T1w.html
│       ├── sub-SNIP6IECX01_desc-summary_T1w.html
│       ├── sub-SNIP6IECX01_dir-PA_space-ACPC_desc-MRtrix3Connectivity_matrices.svg
│       ├── sub-SNIP6IECX01_dir-PA_space-ACPC_desc-wmFOD_odfs.png
│       └── sub-SNIP6IECX01_dir-PA_space-ACPC_desc-wmFOD_peaks.png

I would load sub-SNIP6IECX01_dir-PA_space-ACPC_connectivity.mat as a dictionary using scipy.io.loadmat and multiply all values of the atlas_XXX_sift_radius2_count_connectivity matrix by sub-SNIP6IECX01_dir-PA_space-ACPC_model-sift2_mu.txt? If yes, this does not change anything visually. The resulting adjacency matrices are still heavily biased towards intra-cerebellum connections.

Also, HSVS recon specs are favored over FAST.

1.) I actually wanted to use dsi_studio_gqi but I ended up with this issue
2.) I can’t really use mrtrix_multishell_msmt_ACT-hsvs because of this issue (? not sure)

Steven · October 27, 2025, 3:06pm

Hi @JohannesWiesner,

Yes, that would make connectivity matrices more comparable between subjects.

It might help to change your color bar limits. If just one connection is hitting that 60,000 mark, then all the other connections will look to be low when they may be a perfectly normal connectivity distribution.

We’re working on fixing this in next release but in mean time rolling back just to 1.1.0 should work.

Perhaps FreeSurfer 8+ could help? It’s a fairly substantial rework to recon-all.

Best,
Steven

JohannesWiesner · October 28, 2025, 9:12am

Just as a sanity check:

atlas_<atlas>_sift.radius<N>.count.connectivity

corresponds to (pseudo code):

tck2connectome tracks.tck atlas.nii.gz out.csv \
  -tck_weights_in sift2_streamlineweights.csv \
  -assignment_radial_search 2 \
  -stat_edge sum \

?

In that case, I could also try to make it even more strict by computing the connectivity matrix myself and also adding:

  -assignment_end_voxels \  # more strict assignment (don't use radius search)
  -scale_invnodevol \              # fixes large-parcel inflation
  -scale_invlength \               # damps short-path dominance

It might help to change your color bar limits. If just one connection is hitting that 60,000 mark, then all the other connections will look to be low when they may be a perfectly normal connectivity distribution.

Yeah, but this will still cause problems in the downstream-analyses as my colleague will use the matrices as priors for his analyses (and few very large numbers would caus troubles for this analysis)

Steven · October 28, 2025, 1:32pm

Hi @JohannesWiesner,

If you’re not using tracking with a strict stopping parameter (e.g., HSVS), then you may get streamlines ending outside of atlas parcels. Also, streamline geometry gets wonky towards the GM-WM interface, so I tend to not trust this strict assignment as much. Also the radial search will still only select one parcel, it just finds the nearest one in that radius. I don’t see much reason to switch to this parameter.

That will damp long-path dominance.

This is also a default output of qsirecon.

Best,
Steven

JohannesWiesner · October 28, 2025, 2:09pm

I would say this definitely helps a bit to counterbalance biases towards intra-cerebellar connections.

1.) atlas_4S256Parcels_sift_radius2_count_connectivity (one of the default options of qsirecon):

2.) Manually computed approach (same as above but also penalizes short paths and big rois):

tck2connectome \
sub-SNIPYL4AS05_dir-PA_space-ACPC_model-ifod2_streamlines.tck \
sub-SNIPYL4AS04_dir-PA_space-ACPC_seg-4S256Parcels_dseg.mif \
conn_mat.csv \
-tck_weights_in sub-SNIPYL4AS05_dir-PA_space-ACPC_model-sift2_streamlineweights.csv \
-assignment_radial_search 2  \
-scale_invnodevol  \
-scale_length \
-zero_diagonal \
-stat_edge sum \
-symmetric \
-f

which would correspond to:

atlas_4S256Parcels_sift_radius2_invnodevol_length_count_connectivity

Any plan to include this option in the default outputs? I also wondered about the current selection of measures in the .mat file. Are these just the default results from MRtrix?

Steven · October 28, 2025, 2:29pm

Hi @JohannesWiesner,

We do not have plans but if you want to open a feature request on the github it can be discussed there. You can also look an existing recon spec and design your own connectivity measure in a custom spec: qsirecon/qsirecon/data/pipelines/mrtrix_multishell_msmt_ACT-hsvs.yaml at main · PennLINC/qsirecon · GitHub

Just to clarify, it penalizes long paths, and de-penalized short paths.

Best,
Steven

JohannesWiesner · October 28, 2025, 2:42pm

Wait…

So --scale_length should penalize short paths / reward long paths and scale_invlength should penalize long paths / reward short paths right? I used the former…

Perfect, then I will just create a custom recon-spec

Might also open a feature request, I do think it would make sense to have this option when using atlases with cerebellar regions.

Steven · October 28, 2025, 3:15pm

Hi @JohannesWiesner,

Sorry, read too quickly.—scale length does bias against smaller ones, and invlength basies against larger ones. I don’t recommend —scale length because there’s already a general length bias in tractography that will be further amplified by this argument.

Best,
Steven

JohannesWiesner · October 28, 2025, 4:56pm

Sorry, I don’t get it. In my understanding --scale_length (what I used in my final version) means that the sum of the connections gets multiplied with the average length of the connections. That would reward long connections (which is what I want) and penalize short connections like the one in the cerebellum (which is also what I want).

But according to you, the opposite is true?

Steven · October 28, 2025, 5:11pm

Hi @JohannesWiesner,

I said that --scale_length biases against smaller connections / biases towards longer connections, which is also what you said. I also mentioned that this may end up further inflating the preexisting length bias in tractography (in which short streamlines are already harder to tract).

Best,
Steven

Steven · October 28, 2025, 5:25pm

Hi @JohannesWiesner,

If you haven’t yet, you should read this: Heuristic scaling of connectome edge strengths

Also keep in mind from this article:

Firstly, I hope it is clear from the description of this scaling mechanism that it is only applicable in the specific case of homogeneous white matter seeding (whether deterministic, e.g. tckgen -seed_grid_per_voxel , or probabilistic, e.g. tckgen -seed_image or tckgen -seed_random_per_voxel ). If any other seeding mechanism is used, then the mechanistic bias this heuristic is tailored to address is not present in the data in the first place, so its application would be highly erroneous.

This is not the case in QSIRecon, which uses -seed_dynamic fod_image (see the SIFT2 paper for more details on that: Shibboleth Authentication Request)

Best,
Steven