Detecting and surviving data races using complementary schedules

K. Veeraraghavan, Peter M. Chen, J. Flinn, S. Narayanasamy
{"title":"Detecting and surviving data races using complementary schedules","authors":"K. Veeraraghavan, Peter M. Chen, J. Flinn, S. Narayanasamy","doi":"10.1145/2043556.2043590","DOIUrl":null,"url":null,"abstract":"Data races are a common source of errors in multithreaded programs. In this paper, we show how to protect a program from data race errors at runtime by executing multiple replicas of the program with complementary thread schedules. Complementary schedules are a set of replica thread schedules crafted to ensure that replicas diverge only if a data race occurs and to make it very likely that harmful data races cause divergences. Our system, called Frost, uses complementary schedules to cause at least one replica to avoid the order of racing instructions that leads to incorrect program execution for most harmful data races. Frost introduces outcome-based race detection, which detects data races by comparing the state of replicas executing complementary schedules. We show that this method is substantially faster than existing dynamic race detectors for unmanaged code. To help programs survive bugs in production, Frost also diagnoses the data race bug and selects an appropriate recovery strategy, such as choosing a replica that is likely to be correct or executing more replicas to gather additional information. Frost controls the thread schedules of replicas by running all threads of a replica non-preemptively on a single core. To scale the program to multiple cores, Frost runs a third replica in parallel to generate checkpoints of the program's likely future states --- these checkpoints let Frost divide program execution into multiple epochs, which it then runs in parallel. We evaluate Frost using 11 real data race bugs in desktop and server applications. Frost both detects and survives all of these data races. Since Frost runs three replicas, its utilization cost is 3x. However, if there are spare cores to absorb this increased utilization, Frost adds only 3--12% overhead to application runtime.","PeriodicalId":20672,"journal":{"name":"Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2011-10-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"93","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the Twenty-Third ACM Symposium on Operating Systems Principles","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/2043556.2043590","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 93

Abstract

Data races are a common source of errors in multithreaded programs. In this paper, we show how to protect a program from data race errors at runtime by executing multiple replicas of the program with complementary thread schedules. Complementary schedules are a set of replica thread schedules crafted to ensure that replicas diverge only if a data race occurs and to make it very likely that harmful data races cause divergences. Our system, called Frost, uses complementary schedules to cause at least one replica to avoid the order of racing instructions that leads to incorrect program execution for most harmful data races. Frost introduces outcome-based race detection, which detects data races by comparing the state of replicas executing complementary schedules. We show that this method is substantially faster than existing dynamic race detectors for unmanaged code. To help programs survive bugs in production, Frost also diagnoses the data race bug and selects an appropriate recovery strategy, such as choosing a replica that is likely to be correct or executing more replicas to gather additional information. Frost controls the thread schedules of replicas by running all threads of a replica non-preemptively on a single core. To scale the program to multiple cores, Frost runs a third replica in parallel to generate checkpoints of the program's likely future states --- these checkpoints let Frost divide program execution into multiple epochs, which it then runs in parallel. We evaluate Frost using 11 real data race bugs in desktop and server applications. Frost both detects and survives all of these data races. Since Frost runs three replicas, its utilization cost is 3x. However, if there are spare cores to absorb this increased utilization, Frost adds only 3--12% overhead to application runtime.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
使用互补调度检测和保存数据竞争
数据竞争是多线程程序中常见的错误来源。在本文中,我们展示了如何在运行时通过执行具有互补线程调度的程序的多个副本来保护程序免受数据竞争错误的影响。互补调度是一组副本线程调度,旨在确保只有在发生数据争用时副本才会发散,并使有害的数据争用很可能导致发散。我们的系统称为Frost,它使用互补调度来产生至少一个副本,以避免在大多数有害的数据竞争中导致程序执行错误的指令顺序。Frost引入了基于结果的竞争检测,它通过比较执行互补调度的副本的状态来检测数据竞争。我们证明,对于非托管代码,这种方法比现有的动态竞争检测器要快得多。为了帮助程序在生产环境中幸存下来,Frost还诊断数据竞争错误并选择适当的恢复策略,例如选择可能正确的副本或执行更多副本以收集额外信息。Frost通过在单个核心上非抢占地运行副本的所有线程来控制副本的线程调度。为了将程序扩展到多个核心,Frost并行运行第三个副本,以生成程序可能的未来状态的检查点——这些检查点让Frost将程序执行分为多个时代,然后并行运行。我们使用桌面和服务器应用程序中的11个真实数据竞赛错误来评估Frost。Frost既能检测到这些数据竞争,又能存活下来。因为Frost运行三个副本,它的使用成本是3x。然而,如果有备用核来吸收增加的利用率,Frost只会在应用程序运行时增加3- 12%的开销。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
ResilientFL '21: Proceedings of the First Workshop on Systems Challenges in Reliable and Secure Federated Learning, Virtual Event / Koblenz, Germany, 25 October 2021 SOSP '21: ACM SIGOPS 28th Symposium on Operating Systems Principles, Virtual Event / Koblenz, Germany, October 26-29, 2021 Application Performance Monitoring: Trade-Off between Overhead Reduction and Maintainability Efficient deterministic multithreading through schedule relaxation SILT: a memory-efficient, high-performance key-value store
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1