We are about to start a study with two 7-minute runs of a film-viewing task in ~100 participants aged 13-23, and I was wondering if anyone has recommendations for age-appropriate stimuli.
Here are a few criteria I'm hoping to meet:
- Overlap with existing open film-viewing datasets (e.g., cNeuroMod, ForrestGump). If there are existing manual annotations, that would be even better.
- Good integration with automated annotation tools (e.g., pliers), which makes me think live action would work better than animation.
- Something we could run functional alignment analysis on, so hopefully it would be fairly rich in terms of different object classes, emotion, etc. I realize that 14 minutes might be too short for certain analyses.
The stimuli included in Neuroscout could be a good place to look, as those are all open movie datasets and already have extracted features available for use.
The stimuli from the HBN are also age-appropriate and run ~14 min combined, though they are animated rather than live-action films.
I had looked through the Neuroscout datasets (the Life documentary was actually one of the ones I was homing in on), but I must have skipped over HBN for some reason. The description of the stimuli (a 10-minute clip of Despicable Me and the 4-minute short film The Present) looks just about perfect, thank you!
My only concern is that automated annotation tools might not work well on animated stimuli. Do you have a sense of how well pliers worked on those stimuli?
In my experience, the automated features seem pretty reasonable for the HBN movies. I've mostly looked at audio features, which of course aren't really different from live action, but the visual features looked alright as well.
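For reference, here is a minimal sketch of the kind of pliers pipeline I have in mind, loosely following the patterns in the pliers documentation. The filename is just a placeholder, the face extractor assumes the optional face_recognition dependency is installed, and any other image or audio extractor could be swapped in:

```python
from pliers.stimuli import VideoStim
from pliers.filters import FrameSamplingFilter
from pliers.extractors import (FaceRecognitionFaceLocationsExtractor,
                               STFTAudioExtractor, merge_results)

# Placeholder path to one of the movie clips
video = VideoStim('despicable_me_clip.mp4')

# Sample one frame per second to keep visual extraction tractable
frames = FrameSamplingFilter(hertz=1).transform(video)

# Visual feature: face locations in the sampled frames
# (requires the optional face_recognition package)
face_results = FaceRecognitionFaceLocationsExtractor().transform(frames)

# Audio feature: spectral power from the soundtrack
# (pliers converts the video to audio implicitly)
audio_result = STFTAudioExtractor().transform(video)

# Combine everything into a single long-format dataframe
df = merge_results(list(face_results) + [audio_result], format='long')
print(df.head())
```

On animated stimuli the audio extractors should behave exactly as they do for live action; it's mainly the pretrained visual models (faces, object labels) whose output is worth spot-checking.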
Hi @tsalo!
Here are some personal thoughts on this topic:
- In my experience, 30 minutes of movie-watching data can already support meaningful functional alignment, and I'd guess 14 minutes would probably work too (see the sketch after this list).
- From a decoding perspective (i.e., training algorithms that predict what a person is seeing from brain activity), 14 minutes per individual is too short to train a good decoder. You would need at least 1-2 hours (and I would actually recommend much more), and I don't think it is possible to leverage 100 people x 14 minutes of data to obtain a good training dataset either.
- Still from a decoding perspective, people often try to optimise for retrieval metrics: choosing movie clips that are easy to distinguish from one another (because they deal with different topics and come from different movies) helps. In my experience, things get harder when all clips come from the same movie.
- I don't know how well pliers works on animated stimuli, but I'd be interested in hearing more about it if someone tries it.
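To make the functional-alignment and retrieval points concrete, here is a minimal numpy/scipy sketch on toy data: a plain orthogonal-Procrustes alignment between two simulated subjects, evaluated with a top-1 timepoint-retrieval score on held-out timepoints. On real data you would use something like fmralign or a shared response model on ROI time series; the array sizes and noise level below are invented purely for illustration.

```python
import numpy as np
from scipy.linalg import orthogonal_procrustes

rng = np.random.default_rng(0)

# Toy stand-ins for two subjects' movie-viewing data (time x voxels),
# already masked to a shared ROI and z-scored.
# ~14 min at TR = 2 s is roughly 420 TRs.
n_tr, n_vox = 420, 100
shared = rng.standard_normal((n_tr, n_vox))
rotation = np.linalg.qr(rng.standard_normal((n_vox, n_vox)))[0]
sub_a = shared + 0.5 * rng.standard_normal((n_tr, n_vox))
sub_b = shared @ rotation + 0.5 * rng.standard_normal((n_tr, n_vox))

# First half estimates the alignment, second half evaluates it
half = n_tr // 2
train_a, test_a = sub_a[:half], sub_a[half:]
train_b, test_b = sub_b[:half], sub_b[half:]

# Procrustes alignment: orthogonal transform mapping subject B onto subject A
R, _ = orthogonal_procrustes(train_b, train_a)
test_b_aligned = test_b @ R

def top1_retrieval(x, y):
    """Fraction of timepoints in x whose best-correlated timepoint in y
    is the matching one (chance is 1 / n_timepoints)."""
    xz = (x - x.mean(1, keepdims=True)) / x.std(1, keepdims=True)
    yz = (y - y.mean(1, keepdims=True)) / y.std(1, keepdims=True)
    corr = xz @ yz.T / x.shape[1]  # timepoint-by-timepoint correlation matrix
    return (corr.argmax(axis=1) == np.arange(len(x))).mean()

print("top-1 retrieval, unaligned:", top1_retrieval(test_a, test_b))
print("top-1 retrieval, aligned:  ", top1_retrieval(test_a, test_b_aligned))
```

The same correlation matrix is also where the "distinct clips" point shows up: when all timepoints come from one movie, the off-diagonal similarities tend to be higher and retrieval gets harder.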