Many classification algorithms generate probabilistic estimates of whether a given sample belongs to a given class. Various scoring metrics have been developed to assess the quality of such probabilistic estimates. In many domains, the area under the receiver-operating-characteristic curve (AUC) is predominantly used. When applied to two-class problems, the AUC can be interpreted as the frequency at which two randomly selected samples are ranked correctly, according to their assigned probabilities. As its name implies, the AUC is derived from receiver-operating-characteristic (ROC) curves, which illustrate the relationship between the true positive rate and false positive rate. However, ROC curves—which have their roots in signal processing—are difficult for many people to interpret. For example, in medical settings, ROC curves can identify the probability threshold that achieves an optimal balance between over- and under-diagnosis for a particular disease; yet it is unintuitive to evaluate such thresholds visually. I have developed a scoring approach, Performance Above Random Expectation (PARE), which assesses classification accuracy at various probability thresholds and compares it against the accuracy obtained with random class labels. Across all thresholds, this information can be summarized as a metric that evaluates probabilistic classifiers in a way that is qualitatively equivalent to the AUC metric. However, because the PARE method uses classification accuracy as its core metric, it is more intuitively interpretable. It can also be used to visually identify a probability threshold that maximizes accuracy—thus effectively balancing true positives with false positives. This method generalizes to various other applications.
Joseph is a Program Manager at Microsoft having come to Microsoft with the acquisition of Revolution Analytics. He is a data scientist and R language evangelist passionate about analyzing data and teaching people about R. He is a regular contributor to the Revolutions blog and an... Read More →