Sniatynski MJ, Shepherd JA, Ernst T, Wilkens LR, Hsu DF, Kristal BS. Ranks underlie outcome of combining classifiers: Quantitative roles for diversity and accuracy. PATTERNS (NEW YORK, N.Y.) 2022; 3:100415. [PMID: 35199065 PMCID: PMC8848007 DOI: 10.1016/j.patter.2021.100415]
[Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/10/2021] [Revised: 09/20/2021] [Accepted: 11/24/2021] [Indexed: 11/22/2022]
Abstract
Combining classifier systems potentially improves predictive accuracy, but outcomes have proven impossible to predict. Classification most commonly improves when the classifiers are "sufficiently good" (generalized as "accuracy") and "sufficiently different" (generalized as "diversity"), but the individual and joint quantitative influence of these factors on the final outcome remains unknown. We resolve these issues. Beginning with simulated data, we develop the DIRAC framework (DIversity of Ranks and ACcuracy), which accurately predicts outcome of both score-based fusions originating from exponentially modified Gaussian distributions and rank-based fusions, which are inherently distribution independent. DIRAC was validated using biological dual-energy X-ray absorption and magnetic resonance imaging data. The DIRAC framework is domain independent and has expected utility in far-ranging areas such as clinical biomarker development/personalized medicine, clinical trial enrollment, insurance pricing, portfolio management, and sensor optimization.