Overcoming poor data quality: Optimizing validation of precedence relation data

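European Journal of Operational Research, Volume 322, Issue 3, Pages 740-752. Pub Date: 2025-05-01. Epub Date: 2024-11-14. DOI: 10.1016/j.ejor.2024.11.009. IF 6.0, Zone 2 (Management), JCR Q1 (Operations Research & Management Science).
Full text: https://www.sciencedirect.com/science/article/pii/S0377221724008609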
Benedikt Finnah, Jochen Gönsch, Alena Otto
Citations: 0

Abstract

Insufficient data quality prevents data usage by decision support systems (DSS) in many areas of business. This is the case for data on precedence relations between tasks, which is relevant, for instance, in project scheduling and assembly line balancing. Inaccurate data on unnecessary precedence relations cannot be used, otherwise the recommendations of DSS may turn infeasible. So, unnecessary relations must be satisfied, diminishing the baseline problem’s solution space and the business result. Experts can validate the data, but their time is limited.
We apply an optimization lens and formulate the data validation problem (DVP). Restricted by the available time budget, an expert dynamically receives queries about specific data entries and corrects or validates them. The DVP searches for an interview policy that states queries to the expert, each using up some of the time budget, in a way that maximizes the (weighted) number of removed precedence relations. We model the DVP as a dynamic program, derive optimal policies for several important special cases and design a heuristic interview policy LSTD. In a case study of an automobile manufacturer, this policy substantially reduces the stations’ idle time after selectively addressing about 8% of the data entries.
We prove theoretically and numerically that data validation by experts can lead to significant savings. The number of queries required to validate the data exhaustively is much lower than naive estimates suggest. Additionally, the probability of removing an unnecessary precedence relation per query in a series of queries is high, even for simple interview policies.
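To make the budgeted-query idea in the abstract concrete, here is a minimal sketch of a heavily simplified variant. It is not the authors' DVP formulation and not their LSTD policy. It assumes (hypothetically) that every precedence relation has a known integer query cost, a known prior probability of being unnecessary, and a weight, and that relations are independent of each other. Under those assumptions an expert's answer does not change which remaining queries are worth asking, so choosing queries reduces to a 0/1 knapsack over expected removed weight, solved below by dynamic programming. The names `Query` and `plan_queries` and all numbers are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    """One candidate question to the expert (all fields are assumptions)."""
    relation: tuple[str, str]   # (predecessor task, successor task)
    cost: int                   # expert time units used by the query
    p_unnecessary: float        # assumed prior that the relation is unnecessary
    weight: float = 1.0         # assumed value of removing the relation


def plan_queries(queries: list[Query], budget: int) -> tuple[float, list[Query]]:
    """Pick queries maximizing expected removed weight within the time budget.

    Plain 0/1 knapsack DP: best[i][b] is the best expected value using
    the first i queries with at most b time units.
    """
    n = len(queries)
    best = [[0.0] * (budget + 1) for _ in range(n + 1)]
    for i, q in enumerate(queries, start=1):
        gain = q.p_unnecessary * q.weight
        for b in range(budget + 1):
            best[i][b] = best[i - 1][b]          # skip query i
            if q.cost <= b:
                candidate = best[i - 1][b - q.cost] + gain
                if candidate > best[i][b]:       # ask query i
                    best[i][b] = candidate

    # Backtrack to recover which queries were selected.
    chosen: list[Query] = []
    b = budget
    for i in range(n, 0, -1):
        if best[i][b] != best[i - 1][b]:
            chosen.append(queries[i - 1])
            b -= queries[i - 1].cost
    chosen.reverse()
    return best[n][budget], chosen


if __name__ == "__main__":
    candidates = [
        Query(("A", "B"), cost=2, p_unnecessary=0.5),
        Query(("B", "C"), cost=3, p_unnecessary=0.9),
        Query(("A", "C"), cost=4, p_unnecessary=0.8, weight=2.0),
    ]
    value, plan = plan_queries(candidates, budget=7)
    print(value, [q.relation for q in plan])
```

In the setting the abstract describes, the expert is queried dynamically and answers can inform later queries, so a static plan like this one should be read only as a baseline for intuition.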
Source journal
European Journal of Operational Research (Management Science: Operations Research & Management Science)
CiteScore: 11.90
Self-citation rate: 9.40%
Articles published: 786
Review time: 8.2 months
About the journal: The European Journal of Operational Research (EJOR) publishes high quality, original papers that contribute to the methodology of operational research (OR) and to the practice of decision making.