General bounds on the area under the receiver operating characteristic curve and other performance measures when only a single sensitivity and specificity point is known

Journal of Risk Model Validation | Pub Date: 2022-01-01 | DOI: 10.21314/jrmv.2022.019 | IF 0.4 | Zone 4 (Economics) | JCR Q4 (Business, Finance)
Roger M. Stein

Abstract

Receiver operating characteristic (ROC) curves are often used to quantify the performance of predictive models used in diagnosis, risk stratification and rating systems. The ROC area under the curve (AUC) summarizes the ROC in a single statistic, which also provides a probabilistic interpretation that is isomorphic to the Mann–Whitney–Wilcoxon test. In many settings, such as those involving diagnostic tests for diseases or antibodies, information about the ROC is not reported; instead, the true positive (TP) and true negative (TN) rates are reported for a single threshold value. We demonstrate how to calculate the upper and lower bounds for the ROC AUC, given a single (TP, TN) pair. We use simple geometric arguments only, and we present two examples of real-world applications from medicine and finance, involving Covid-19 diagnosis and credit card fraud detection, respectively. In addition, we introduce formally the notion of “pathological” ROC curves and “well-behaved” ROC curves. In the case of well-behaved ROC curves, the bounds on the AUC may be made tighter. In certain special cases involving pathological ROC curves that result from what we term “George Costanza” classifiers, we may transform predictions to obtain well-behaved ROC curves with higher AUC than the original decision process. Our results also enable the calculation of other quantities of interest, such as Cohen’s d or the Pearson correlation between a diagnostic outcome and an actual outcome. These results facilitate the direct comparison of reported performance when model or diagnostic performance is reported for only a single score threshold. © 2022. Infopro Digital Risk (IP) Limited
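
The geometric idea of bounding the AUC from a single (TP, TN) pair can be illustrated with a short sketch. This is not taken from the paper: it assumes only that the ROC curve is non-decreasing, runs from (0, 0) to (1, 1), and passes through the point (1 - TN, TP). The second function additionally assumes a concave ROC curve, which is this sketch's stand-in for a "well-behaved" curve and may not match the paper's formal definition. The function names `auc_bounds` and `concave_lower_bound` are hypothetical.

```python
def auc_bounds(tp_rate: float, tn_rate: float) -> tuple[float, float]:
    """Bounds on the AUC of any non-decreasing ROC curve through (1 - tn_rate, tp_rate).

    Assumption of this sketch: the curve is monotone and joins (0, 0) to (1, 1).
    """
    if not (0.0 <= tp_rate <= 1.0 and 0.0 <= tn_rate <= 1.0):
        raise ValueError("rates must lie in [0, 1]")
    fp_rate = 1.0 - tn_rate
    # Smallest area: the curve stays at 0 until the known point, then at tp_rate.
    lower = tp_rate * tn_rate
    # Largest area: the curve jumps to tp_rate at once, then to 1 after the point.
    upper = 1.0 - fp_rate * (1.0 - tp_rate)
    return lower, upper


def concave_lower_bound(tp_rate: float, tn_rate: float) -> float:
    """Area under the two-segment curve (0, 0) -> (1 - tn_rate, tp_rate) -> (1, 1).

    Assumption of this sketch: the ROC curve is concave, so it lies on or above
    these chords. A concave curve cannot pass below the diagonal.
    """
    if tp_rate + tn_rate < 1.0:
        raise ValueError("point lies below the diagonal; no concave ROC passes through it")
    return (tp_rate + tn_rate) / 2.0


if __name__ == "__main__":
    # Hypothetical rates chosen for illustration, not figures from the paper.
    print(auc_bounds(0.90, 0.80))
    print(concave_lower_bound(0.90, 0.80))
```

For the hypothetical rates TP = 0.90 and TN = 0.80, the monotone-only bounds come out at roughly 0.72 and 0.98, and the concavity assumption raises the lower bound to about 0.85, while the general upper bound still applies.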
Source journal: Journal of Risk Model Validation
CiteScore: 1.20 | Self-citation rate: 28.60% | Articles published: 8
Journal description: As monetary institutions rely greatly on economic and financial models for a wide array of applications, model validation has become progressively inventive within the field of risk. The Journal of Risk Model Validation focuses on the implementation and validation of risk models, and aims to provide a greater understanding of key issues including the empirical evaluation of existing models, pitfalls in model validation and the development of new methods. We also publish papers on back-testing. Our main field of application is in credit risk modelling but we are happy to consider any issues of risk model validation for any financial asset class. The Journal of Risk Model Validation considers submissions in the form of research papers on topics including, but not limited to: empirical model evaluation studies; backtesting studies; stress-testing studies; new methods of model validation/backtesting/stress-testing; best practices in model development, deployment, production and maintenance; pitfalls in model validation techniques (all types of risk, forecasting, pricing and rating).