Specification overfitting in artificial intelligence

IF 13.9 | CAS Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Artificial Intelligence Review | Pub Date: 2024-12-20 | DOI: 10.1007/s10462-024-11040-6
Benjamin Roth, Pedro Henrique Luz de Araujo, Yuxi Xia, Saskia Kaltenbrunner, Christoph Korab
{"title":"Specification overfitting in artificial intelligence","authors":"Benjamin Roth,&nbsp;Pedro Henrique Luz de Araujo,&nbsp;Yuxi Xia,&nbsp;Saskia Kaltenbrunner,&nbsp;Christoph Korab","doi":"10.1007/s10462-024-11040-6","DOIUrl":null,"url":null,"abstract":"<div><p>Machine learning (ML) and artificial intelligence (AI) approaches are often criticized for their inherent bias and for their lack of control, accountability, and transparency. Consequently, regulatory bodies struggle with containing this technology’s potential negative side effects. High-level requirements such as fairness and robustness need to be formalized into concrete specification metrics, imperfect proxies that capture isolated aspects of the underlying requirements. Given possible trade-offs between different metrics and their vulnerability to over-optimization, integrating specification metrics in system development processes is not trivial. This paper defines <i>specification overfitting</i>, a scenario where systems focus excessively on specified metrics to the detriment of high-level requirements and task performance. We present an extensive literature survey to categorize how researchers propose, measure, and optimize specification metrics in several AI fields (e.g., natural language processing, computer vision, reinforcement learning). Using a keyword-based search on papers from major AI conferences and journals between 2018 and mid-2023, we identify and analyze 74 papers that propose or optimize specification metrics. We find that although most papers implicitly address specification overfitting (e.g., by reporting more than one specification metric), they rarely discuss which role specification metrics should play in system development or explicitly define the scope and assumptions behind metric formulations.</p></div>","PeriodicalId":8449,"journal":{"name":"Artificial Intelligence Review","volume":"58 2","pages":""},"PeriodicalIF":13.9000,"publicationDate":"2024-12-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://link.springer.com/content/pdf/10.1007/s10462-024-11040-6.pdf","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Artificial Intelligence Review","FirstCategoryId":"94","ListUrlMain":"https://link.springer.com/article/10.1007/s10462-024-11040-6","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Abstract

Machine learning (ML) and artificial intelligence (AI) approaches are often criticized for their inherent bias and for their lack of control, accountability, and transparency. Consequently, regulatory bodies struggle to contain this technology’s potential negative side effects. High-level requirements such as fairness and robustness need to be formalized into concrete specification metrics, imperfect proxies that capture isolated aspects of the underlying requirements. Given possible trade-offs between different metrics and their vulnerability to over-optimization, integrating specification metrics into system development processes is not trivial. This paper defines specification overfitting, a scenario where systems focus excessively on specified metrics to the detriment of high-level requirements and task performance. We present an extensive literature survey to categorize how researchers propose, measure, and optimize specification metrics in several AI fields (e.g., natural language processing, computer vision, reinforcement learning). Using a keyword-based search on papers from major AI conferences and journals between 2018 and mid-2023, we identify and analyze 74 papers that propose or optimize specification metrics. We find that although most papers implicitly address specification overfitting (e.g., by reporting more than one specification metric), they rarely discuss what role specification metrics should play in system development or explicitly define the scope and assumptions behind metric formulations.
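The abstract's framing of specification metrics as imperfect proxies can be made concrete with a minimal sketch. The example below is not from the paper: it assumes a hypothetical fairness proxy (a demographic parity gap) computed on synthetic data, and shows how selecting a system on that proxy alone would prefer a useless constant predictor over an accurate one, which is the specification overfitting scenario the paper defines.

```python
# Illustrative sketch only: a hypothetical fairness proxy ("demographic
# parity gap") evaluated alongside task accuracy on synthetic data.
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of correct predictions (the task-performance metric)."""
    return float(np.mean(y_true == y_pred))

def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-prediction rates between two groups."""
    return float(abs(y_pred[group == 0].mean() - y_pred[group == 1].mean()))

rng = np.random.default_rng(0)
n = 1_000
group = rng.integers(0, 2, size=n)    # protected attribute (0 or 1)
y_true = rng.integers(0, 2, size=n)   # task labels

# Two candidate "systems": a reasonably accurate classifier and a constant one.
useful = (y_true ^ (rng.random(n) < 0.1)).astype(int)  # ~90% accurate
constant = np.zeros(n, dtype=int)                       # always predicts 0

for name, y_pred in [("useful", useful), ("constant", constant)]:
    print(f"{name:9s} accuracy={accuracy(y_true, y_pred):.3f} "
          f"parity_gap={demographic_parity_gap(y_pred, group):.3f}")
```

In this sketch the constant predictor achieves a perfect (zero) parity gap while being useless for the task, so a development process that optimizes or selects on the proxy metric alone would prefer it; reporting the specification metric together with task performance, as most of the surveyed papers implicitly do, is what exposes the problem.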

Source journal
Artificial Intelligence Review (Engineering & Technology - Computer Science: Artificial Intelligence)
CiteScore: 22.00
Self-citation rate: 3.30%
Publication volume: 194
Review time: 5.3 months
Journal description: Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.
Latest articles in this journal
Evaluating large language models effectiveness for flow-based intrusion detection: a comparative study with ML and DL baselines
Synergizing blockchain and AI to fortify IoT security: a comprehensive review
Multi-objective genetic programming-based algorithmic trading, using directional changes and a modified Sharpe ratio score for identifying optimal trading strategies
Multi-agent generalized cooperative optimization scheduling for multi-energy complementarity in microgrids
Emerging computational intelligence based techniques for lung cancer diagnosis and classification on chest CT scan images: a comprehensive survey