Independent Validation as a Validation Method for Classification

Tina Braun; Hannes Eckert; Timo von Oertzen

doi:10.5964/qcmb.12069

Tina Braun
Charlotte-Fresenius University, Wiesbaden, Germany
Hannes Eckert
University of the Bundeswehr, Munich, Germany
Timo von Oertzen
Max Planck Institute for Human Development, Berlin, Germany

Abstract

Classifiers provide an alternative to conventional statistical methods. In this approach, statistical tests are applied to the accuracy with which a classifier assigns data to the correct group, for example to compare the performance of classifiers. Conventional validation methods for determining classifier accuracy have the disadvantage that the correct classifications do not follow any known distribution, which makes the application of statistical tests problematic. Independent validation circumvents this problem and allows the use of binomial tests to assess classifier performance. However, independent validation accuracy is biased for small training datasets. The present study shows that a hyperbolic function can be used to estimate the loss in classifier accuracy for independent validation. This function is used to develop three new methods that estimate classifier accuracy for small training sets more precisely. These methods are compared with two existing methods in a simulation study. The results show small errors overall in the estimation of classifier accuracy and indicate that independent validation can be used with small samples. A least squares estimation approach seems best suited to estimating classifier accuracy.
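To make the role of the binomial test concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each validation case was classified by a classifier that had never seen it, so that the number of correct classifications can be treated as binomially distributed. The function name `binomial_upper_tail`, the chance level of 0.5 (two equally likely groups), and the counts are hypothetical values chosen for illustration only.

```python
from math import comb


def binomial_upper_tail(successes: int, trials: int, p_chance: float = 0.5) -> float:
    """Return P(X >= successes) for X ~ Binomial(trials, p_chance).

    This is the one-sided exact p-value for the null hypothesis that the
    classifier is no better than chance level p_chance.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    if not 0.0 <= p_chance <= 1.0:
        raise ValueError("p_chance must be a probability")
    return sum(
        comb(trials, k) * p_chance**k * (1.0 - p_chance) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# Hypothetical outcome of a validation run: 28 of 40 cases classified correctly.
n_test = 40
n_correct = 28

p_value = binomial_upper_tail(n_correct, n_test, p_chance=0.5)
print(f"observed accuracy = {n_correct / n_test:.3f}, one-sided p = {p_value:.4f}")
```

The exact tail sum is used here instead of a normal approximation because the abstract is concerned with small samples, where the approximation can be poor.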

Published: 22 December 2023
https://doi.org/10.5964/qcmb.12069
Issue: 2023
Section: Method Dissemination Articles
Keywords: independent validation, classifiers, classifier accuracy, simulation

Braun, T., Eckert, H., & von Oertzen, T. (2023). Independent validation as a validation method for classification. Quantitative and Computational Methods in Behavioral Sciences, 3, Article e12069. https://doi.org/10.5964/qcmb.12069