NMA curated HCP dataset additional information

raul.rodriguezcruces · July 15, 2020, 5:15pm

We are currently working on our project proposal in our pod, and we need to know whether we can get access to the subjects’ real HCP IDs so that we can match our curated NMA dataset with the behavioural responses provided by HCP.

Is it possible to get the subjects’ HCP IDs so that we can relate them to the behavioural data from the full HCP dataset, please?

Right now we only have subject IDs from 0 to 343, and they don’t match the real HCP IDs.

Thanks

aarondzt · July 15, 2020, 5:57pm

My project group is also wondering about this, please advise! Maybe it will get cleared up in the meetings with the project mentors.

michaelwaskom · July 15, 2020, 8:06pm

I am not sure about the “real” IDs. I can ask, but the answer may be that the dataset has been semi-anonymized on purpose. Given how broadly it is being distributed, the organizers are taking the potential human subjects privacy risks very seriously.

But I believe efforts are underway to distribute the behavioral data from the tasks themselves in a more useful format; I hope that will be available by tomorrow.

raul.rodriguezcruces · July 15, 2020, 8:23pm

HCP data is already anonymized; we only want to know which HCP subjects were used for the NMA curation. Something like a subject look-up table.
By “real HCP ID” I mean the HCP Subject… for example:

NMAindex - Subject
0 - 100004
1 - 100206
…
347 - 101915

michaelwaskom · July 15, 2020, 8:27pm

I mean with respect to the HCP IDs. Of course the original participant names are not present in the original dataset, but you would be surprised how few demographic details are necessary for de-anonymization. To gain access to the original HCP dataset, it is necessary to sign a form attesting that you will not attempt to de-anonymize the participants. But that is not possible to enforce here, hence the extra steps required…

raul.rodriguezcruces · July 15, 2020, 8:36pm

I have signed that form before for my personal research. Could a possible solution be to take these extra steps within our pods for our specific project aims, give you proof that we have signed the HCP form, and obtain the subject look-up table from you? Everything would be backed up, protected, and kept private in our local repositories.

michaelwaskom · July 15, 2020, 8:53pm

I don’t know. I am not responsible for the decision about what to release or not to release. I am just passing information back and forth. What I can say at the moment is that the organizers were firm in their decision not to release demographic information and, for the time being, to discourage project ideas that assume this will be available.

raul.rodriguezcruces · July 15, 2020, 9:13pm

Thank you so much for your reply. We are not interested in the demographic data, only in the behavioural data related to the fMRI tasks. As you said before, efforts are underway to distribute it, maybe tomorrow. I think we’ll just wait.
However, I don’t think this decision will discourage good project ideas from students; HCP is widely known in the neuroimaging community. I strongly believe that it is better to foster good scientific practices than to try to avoid the extra steps (anonymization, consent, ethics).

azg · July 16, 2020, 2:17pm

Hi! Our group is also wondering if we will be able to get additional behavioral data.

In the social cognition task data, “unsure” and “random/non-social” responses are lumped together - we were hoping to get the behavioral data to disentangle this a bit.

michaelcohen · July 16, 2020, 4:32pm

Hi,
Our group also was hoping to link neural measures with behavioral measures (e.g., social well-being) that are available in the unrestricted HCP dataset.

Because these measures are in the unrestricted dataset, and as far as I can tell there would be no way to identify participants using any of them, we’re hoping that this could be accessible. Is that a part of what’s being worked on to be released?

Alternatively, would it be possible to share the linkage from the subject numbers in the NMA version of the HCP dataset to the main HCP subject numbers? That way, we could download the behavioral data spreadsheet ourselves from the HCP server, which would require agreeing to the data use policies there.
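
To make the request concrete, here is a rough sketch of what we would do with such a linkage, if it were shared. Everything in it is an assumption for illustration: the file layouts, the column names (`nma_index`, `hcp_subject`, `Subject`, `score`) and the values are placeholders, and the two index/ID pairs simply reuse the example pairs quoted earlier in this thread rather than a real mapping.

```python
import csv
import io

# Hypothetical look-up table: NMA index -> HCP subject ID.
# These rows reuse the illustrative pairs from earlier in the thread;
# they are NOT a real mapping.
LOOKUP_CSV = """nma_index,hcp_subject
0,100004
1,100206
"""

# Hypothetical behavioural spreadsheet keyed by HCP subject ID.
# The column name "score" and all values are placeholders, not HCP data.
BEHAVIOUR_CSV = """Subject,score
100004,1.5
100206,2.5
999999,3.5
"""


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def link_behaviour(lookup_rows, behaviour_rows):
    """Return {nma_index: behaviour_row} for subjects present in both tables."""
    by_subject = {row["Subject"]: row for row in behaviour_rows}
    linked = {}
    for row in lookup_rows:
        match = by_subject.get(row["hcp_subject"])
        if match is not None:  # skip subjects missing from the spreadsheet
            linked[int(row["nma_index"])] = match
    return linked


linked = link_behaviour(read_rows(LOOKUP_CSV), read_rows(BEHAVIOUR_CSV))
for nma_index in sorted(linked):
    print(nma_index, linked[nma_index])
```

Subjects that appear only in the behavioural spreadsheet are dropped, so the result stays aligned with the NMA indices we already have.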

robbisg · July 16, 2020, 4:48pm

One of my groups would also like to use only behavioural data (not demographic!) for their idea.

So I’m following the discussion!

Cheers,

michaelcohen · July 16, 2020, 5:08pm

Our group had discussed E-mailing John Murray, who seems to be the one who coordinated the NMA version of the HCP dataset. Have you or anyone else reached out to him?
(Not sure if it’s reasonable to do that or if we should keep the discussion on here instead…)

michaelcohen · July 17, 2020, 12:45am

Hi all,
I E-mailed John Murray this afternoon, and just got the following E-mail back:

“I’ve just passed along to Michael Waskom this mapping between NMA index and HCP subject IDs, so it should be available to you soon.”

So if all goes according to plan, hopefully we’re all in luck!

johnmurray · July 17, 2020, 12:52am

Y’all can also message me here! Happy to discuss

raul.rodriguezcruces · July 17, 2020, 2:05pm

Thank you for the update to the HCP dataset!