I am interested in getting the classifier performance for each trial. The idea is to get the accuracy for each one and then rank-order them to see which trials have the highest accuracies. Is this possible? If so, is there a way to also get the average probability that the classifier assigned to the target category (label) of each trial?

I already know how to get the different outputs for all the trials that I set up in the design as the test set, but not in a trial-specific way (trial by trial).

Since a trial is either correct or incorrect, ranking the accuracies will give you only 1s and 0s, so the information may be less useful for you. I think you might be looking for the decision values, which provide a distance from the separating hyperplane. Values > 0 belong to one class and values < 0 to the other. Note that the decision function is different for each trained classifier, i.e. if classifiers are based on vastly different data, then the absolute values are not comparable across them. If you want a probability, you can add -b (in libsvm, -b 1) to the training and testing parameters; this will convert the decision values to probabilities.
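To make the idea concrete, here is a minimal sketch in plain MATLAB (no TDT calls) of how decision values relate to predicted classes and how trials could be rank-ordered by them. The weight vector w, the bias b, the data X and the mapping of sign to label are all made-up assumptions for illustration, not TDT output.

```matlab
% Toy linear classifier; w, b and X are hypothetical values.
w = [2; -1];          % weight vector
b = 0.5;              % bias term
X = [ 1.0  0.0;       % one trial per row, one feature per column
     -1.0  1.0;
      0.1  0.6];

dv = X * w + b;       % decision value of each trial

% Assumed mapping: dv > 0 -> class 1, dv < 0 -> class 2.
predicted = ones(size(dv));
predicted(dv < 0) = 2;

% Rank trials from farthest to closest to the separating hyperplane.
[~, order] = sort(abs(dv), 'descend');

disp([dv predicted]);
disp(order');
```

Unlike per-trial accuracy, which is only 0 or 1, the absolute decision value gives a graded quantity that can be sorted.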

In TDT, you want to use cfg.results.output = 'decision_values';
(see help decoding_transform_results for all available options)

I think average probabilities are perhaps not ideal, but you could multiply them and take the log to get a useful numeric range. Alternatively, you can get the mean signed decision value (i.e. for a given label, the mean distance to the hyperplane). This is quite a useful measure and sounds like what you are looking for. For this, use cfg.results.output = 'signed_decision_values';
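As a rough illustration of both suggestions, the sketch below uses plain MATLAB on made-up numbers. The vectors true_labels, p_target and dv are hypothetical, and the per-label mean is only my reading of "mean decision value for a given label", not necessarily what TDT computes internally.

```matlab
% Hypothetical per-trial outputs; none of these numbers come from a real analysis.
true_labels = [1; 1; 2; 2; 1];              % label of each test trial
p_target    = [0.9; 0.6; 0.8; 0.55; 0.7];   % probability assigned to the true label
dv          = [1.8; 0.3; -1.1; -0.2; 0.6];  % decision value of each trial

% log(prod(p)) equals sum(log(p)); summing logs avoids underflow with many trials.
log_prob = sum(log(p_target));

% Mean decision value separately for each label.
labels  = unique(true_labels);
mean_dv = zeros(size(labels));
for i = 1:numel(labels)
    mean_dv(i) = mean(dv(true_labels == labels(i)));
end

disp(log_prob);
disp([labels mean_dv]);
```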

If you want to do more advanced things that are not implemented, check HOWTOEXTEND.txt. You can easily write your own transres_ function that takes the decision values, predicted labels or true labels and does all sorts of manipulations with them.

@Martin Hi Martin, I am unsure whether the decision value in the TDT results output and the distance to the hyperplane mentioned in some articles (e.g. Qiao et al. (2017). Dynamic trial-by-trial recoding of task-set representations in the frontoparietal cortex mediates behavioral flexibility. The Journal of Neuroscience, 37(45)) are the same concept.

The decision value goes from -inf to inf. Think of the data points being projected onto the weight vector, which lies orthogonal to the separating hyperplane. This way, all points lie on one axis that goes from -inf through the separating hyperplane at 0 to +inf. Points that are closer to (i.e. with smaller distances from) the separating hyperplane reflect less clear-cut decisions and have decision values that are smaller in absolute terms. The distance to the hyperplane, by contrast, is never negative: along the same axis it goes from inf through 0 back to inf. In other words, distance = |dv| (for a linear classifier, up to a constant scaling by the norm of the weight vector).

While larger absolute decision values reflect a larger distance from the separating hyperplane, larger distances need not reflect more evidence in favor of one class (even when the classifier tells you so). Of course, this depends on the nature of the measured distributions.
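The relation between the two quantities can be sketched in plain MATLAB for a linear classifier. The values of w, b and X are hypothetical, chosen so that norm(w) = 5. The geometric distance is the absolute decision value divided by the norm of the weight vector, so it matches |dv| up to a constant factor.

```matlab
% Hypothetical linear classifier; not taken from any TDT result.
w = [3; 4];            % weight vector, norm(w) = 5
b = -5;                % bias term
X = [3 4; 1 0; 0 0];   % one trial per row

dv   = X * w + b;            % signed decision values, can be negative
dist = abs(dv) / norm(w);    % geometric distance to the hyperplane, never negative

disp([dv dist]);
```

The sign of dv says on which side of the hyperplane a trial falls; dist discards that sign and keeps only how far away it is.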