The curious case of the test set AUROC

IF 23.9 · Zone 1 (Computer Science) · JCR Q1 (COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE) · Nature Machine Intelligence, volume 6, issue 4, pages 373-376 · Pub Date: 2024-04-04 · DOI: 10.1038/s42256-024-00817-7

Michael Roberts, Alon Hazan, Sören Dittmer, James H. F. Rudd, Carola-Bibiane Schönlieb

Citations: 0

Abstract

The area under the receiver operating characteristic curve (AUROC) of the test set is used throughout machine learning (ML) for assessing a model’s performance. However, when concordance is not the only ambition, this gives only a partial insight into performance, masking distribution shifts of model outputs and model instability.
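The point about concordance can be illustrated with a minimal sketch. It is not the authors' experiment: the data are synthetic, and the names `auroc`, `positive_rate`, and `shifted` are hypothetical, chosen only for this illustration. The sketch computes AUROC as the fraction of positive-negative pairs that are ranked correctly, then applies a strictly increasing transform to the scores. The ranking, and therefore the AUROC, is unchanged, while the share of cases above a fixed 0.5 threshold moves.

```python
import random


def auroc(labels, scores):
    """Concordance estimate of AUROC: P(positive score > negative score).

    Tied pairs count as half a concordant pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def positive_rate(scores, threshold=0.5):
    """Fraction of cases whose score reaches a fixed decision threshold."""
    return sum(s >= threshold for s in scores) / len(scores)


# Synthetic scores in [0, 1] (an assumption made for illustration only):
# positives tend to score high, negatives tend to score low.
rng = random.Random(0)
labels = [1] * 200 + [0] * 200
scores = [rng.betavariate(4, 2) for _ in range(200)] + [
    rng.betavariate(2, 4) for _ in range(200)
]

# A strictly increasing transform on [0, 1]: it preserves the ordering of the
# scores but changes their distribution.
shifted = [s ** 3 for s in scores]

auc_original = auroc(labels, scores)
auc_shifted = auroc(labels, shifted)
rate_original = positive_rate(scores)
rate_shifted = positive_rate(shifted)

print(f"AUROC: original={auc_original:.3f}, shifted={auc_shifted:.3f}")
print(f"Rate at threshold 0.5: original={rate_original:.3f}, shifted={rate_shifted:.3f}")
```

Because AUROC depends only on the ordering of the scores, a monitoring setup that tracks AUROC alone would not register this change in the output distribution, even though any decision made at a fixed threshold would.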


Source journal

Nature Machine Intelligence Multiple-

CiteScore: 36.90

Self-citation rate: 2.10%

Articles published: 127

About the journal: Nature Machine Intelligence is a distinguished publication that presents original research and reviews on various topics in machine learning, robotics, and AI. Our focus extends beyond these fields, exploring their profound impact on other scientific disciplines, as well as societal and industrial aspects. We recognize limitless possibilities wherein machine intelligence can augment human capabilities and knowledge in domains like scientific exploration, healthcare, medical diagnostics, and the creation of safe and sustainable cities, transportation, and agriculture. Simultaneously, we acknowledge the emergence of ethical, social, and legal concerns due to the rapid pace of advancements. To foster interdisciplinary discussions on these far-reaching implications, Nature Machine Intelligence serves as a platform for dialogue facilitated through Comments, News Features, News & Views articles, and Correspondence. Our goal is to encourage a comprehensive examination of these subjects. Similar to all Nature-branded journals, Nature Machine Intelligence operates under the guidance of a team of skilled editors. We adhere to a fair and rigorous peer-review process, ensuring high standards of copy-editing and production, swift publication, and editorial independence.

Latest articles from the journal

A domain-adapted large language model to support clinicians in psychiatric clinical practice

A multimodal large language model for materials science

Fluid thinking about collective intelligence

Programmable RNA translation through deep learning-driven IRES discovery and de novo generation

From embodied intelligence to physical AI