AUTHOR=Finch W. Holmes, Bolin Jocelyn H., Kelley Ken TITLE=Group membership prediction when known groups consist of unknown subgroups: a Monte Carlo comparison of methods JOURNAL=Frontiers in Psychology VOLUME=5 YEAR=2014 URL=https://www.frontiersin.org/journals/psychology/articles/10.3389/fpsyg.2014.00337 DOI=10.3389/fpsyg.2014.00337 ISSN=1664-1078 ABSTRACT=
Classification using standard statistical methods such as linear discriminant analysis (LDA) or logistic regression (LR) presumes knowledge of group membership prior to the development of an algorithm for prediction. However, in many real-world applications, members of the same nominal group might in fact come from different subpopulations on the underlying construct. For example, individuals diagnosed with depression will not all have the same level of this disorder, though for the purposes of LDA or LR they will be treated in the same manner. The goal of this simulation study was to examine the performance of several methods for group classification when membership within the known groups was not homogeneous, for example, when there are three known groups but each contains two unknown classes. Several approaches were compared, including LDA, LR, classification and regression trees (CART), generalized additive models (GAM), and mixture discriminant analysis (MIXDA). Results of the study indicated that CART and MIXDA were the most effective tools for situations in which known groups were not homogeneous, whereas LDA, LR, and GAM had the highest rates of misclassification. Implications of these results for theory and practice are discussed.