The DIRAC framework: Geometric structure underlies roles of diversity and accuracy in combining classifiers
Matthew J. Sniatynski, John A. Shepherd, Lynne R. Wilkens, D. Frank Hsu, Bruce S. Kristal
Patterns, published 2024-02-05. DOI: 10.1016/j.patter.2024.100924 (https://doi.org/10.1016/j.patter.2024.100924)
Abstract
Combining classification systems potentially improves predictive accuracy, but outcomes have proven impossible to predict. Similar to improving binary classification with fusion, fusing ranking systems most commonly increases Pearson or Spearman correlations with a target when the input classifiers are “sufficiently good” (generalized as “accuracy”) and “sufficiently different” (generalized as “diversity”), but the individual and joint quantitative influence of these factors on the final outcome remains unknown. We resolve these issues. Building on our previous empirical work establishing the DIRAC (DIversity of Ranks and ACcuracy) framework, which accurately predicts the outcome of fusing binary classifiers, we demonstrate that the DIRAC framework similarly explains the outcome of fusing ranking systems. Specifically, precise geometric representation of diversity and accuracy as angle-based distances within rank-based combinatorial structures (permutahedra) fully captures their synergistic roles in rank approximation, uncouples them from the specific metrics of a given problem, and represents them as generally as possible.
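As a rough illustration of the quantities the abstract refers to (this is a minimal sketch, not the authors' DIRAC implementation), the snippet below fuses two synthetic ranking systems by averaging their rank vectors, reports "accuracy" as the Spearman correlation of each ranking with a hypothetical target ranking, and reports "diversity" as the angle between mean-centered rank vectors, the angle-based view the abstract alludes to (Spearman's rho is the cosine of that angle). All data, noise levels, and names here are invented for illustration.

```python
# Illustrative sketch only (not the authors' DIRAC code). All quantities below
# are hypothetical and chosen purely to demonstrate the concepts.
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(0)
n = 20
target = rng.permutation(n) + 1.0            # hypothetical ground-truth ranks, 1 = best

# Two hypothetical input ranking systems: noisy views of the target ranking.
rank_a = rankdata(target + rng.normal(scale=4.0, size=n))
rank_b = rankdata(target + rng.normal(scale=4.0, size=n))

# Rank fusion: average the two rank vectors, then re-rank the result.
fused = rankdata((rank_a + rank_b) / 2.0)

def spearman(u, v):
    """Spearman's rho: Pearson correlation of the rank vectors."""
    return np.corrcoef(u, v)[0, 1]

def angle_deg(u, v):
    """Angle between mean-centered rank vectors; its cosine equals Spearman's rho."""
    u, v = u - u.mean(), v - v.mean()
    cos = u @ v / (np.linalg.norm(u) * np.linalg.norm(v))
    return np.degrees(np.arccos(np.clip(cos, -1.0, 1.0)))

print("accuracy of A    (rho vs. target):", spearman(rank_a, target))
print("accuracy of B    (rho vs. target):", spearman(rank_b, target))
print("accuracy of fused (rho vs. target):", spearman(fused, target))
print("diversity, A vs. B (degrees):      ", angle_deg(rank_a, rank_b))
```

With moderately accurate and moderately diverse inputs such as these, the fused ranking typically correlates with the target at least as strongly as either input does, which is the empirical pattern the DIRAC framework is said to explain geometrically.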