A Topological Approach to Hardware Bug Triage

Rico Angell, Ben Oztalay, A. DeOrio
{"title":"A Topological Approach to Hardware Bug Triage","authors":"Rico Angell, Ben Oztalay, A. DeOrio","doi":"10.1109/MTV.2015.10","DOIUrl":null,"url":null,"abstract":"Verification is a critical bottleneck in the time to market of a new digital design. As complexity continues to increase, post-silicon validation shoulders an increasing share of the verification/validation effort. Post-silicon validation is burdened by large volumes of test failures, and is further complicated by root cause bugs that manifest in multiple test failures. At present, these failures are prioritized and assigned to validation engineers in an ad-hoc fashion. When multiple failures caused by the same root cause bug are debugged by multiple engineers at the same time, scarce, time-critical engineering resources are wasted. Our scalable bug triage technique begins with a database of test failures. It extracts defining features from the failure reports, using a novel, topology-aware approach based on graph partitioning. It then leverages unsupervised machine learning to extract the structure of the failures, identifying groups of failures that are likely to be the result of a common root cause. With our technique, related failures can be debugged as a group, rather than individually. Additionally, we propose a metric for measuring verification efficiency as a result of bug triage called Unique Debugging Instances (UDI). We evaluated our approach on the industrial-size OpenSPARC T2 design with a set of injected bugs, and found that our approach increased average verification efficiency by 243%, with a confidence interval of 99%.","PeriodicalId":273432,"journal":{"name":"2015 16th International Workshop on Microprocessor and SOC Test and Verification (MTV)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 16th International Workshop on Microprocessor and SOC Test and Verification (MTV)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/MTV.2015.10","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

Verification is a critical bottleneck in the time to market of a new digital design. As complexity continues to increase, post-silicon validation shoulders an increasing share of the verification/validation effort. Post-silicon validation is burdened by large volumes of test failures, and is further complicated by root cause bugs that manifest in multiple test failures. At present, these failures are prioritized and assigned to validation engineers in an ad-hoc fashion. When multiple failures caused by the same root cause bug are debugged by multiple engineers at the same time, scarce, time-critical engineering resources are wasted. Our scalable bug triage technique begins with a database of test failures. It extracts defining features from the failure reports, using a novel, topology-aware approach based on graph partitioning. It then leverages unsupervised machine learning to extract the structure of the failures, identifying groups of failures that are likely to be the result of a common root cause. With our technique, related failures can be debugged as a group, rather than individually. Additionally, we propose a metric for measuring verification efficiency as a result of bug triage called Unique Debugging Instances (UDI). We evaluated our approach on the industrial-size OpenSPARC T2 design with a set of injected bugs, and found that our approach increased average verification efficiency by 243%, with a confidence interval of 99%.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
硬件Bug分类的拓扑方法
验证是新数字设计推向市场的关键瓶颈。随着复杂性的不断增加,后硅验证承担了越来越多的验证/确认工作。硅后验证被大量的测试失败所负担,并且由于在多个测试失败中出现的根本原因错误而进一步复杂化。目前,这些故障被按优先级排序,并以一种特别的方式分配给验证工程师。当由同一个根本原因bug引起的多个故障由多个工程师同时调试时,稀缺的、时间紧迫的工程资源就被浪费了。我们可扩展的bug分类技术从一个测试失败数据库开始。它使用一种基于图划分的新颖的拓扑感知方法,从故障报告中提取定义特征。然后,它利用无监督机器学习来提取故障的结构,识别可能由共同根本原因导致的故障组。使用我们的技术,相关的故障可以作为一个组进行调试,而不是单独调试。此外,我们提出了一个度量标准,用于度量作为错误分类结果的验证效率,称为唯一调试实例(Unique Debugging Instances, UDI)。我们在工业规模的OpenSPARC T2设计上用一组注入的错误评估了我们的方法,发现我们的方法将平均验证效率提高了243%,置信区间为99%。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Enhancing the Stress and Efficiency of RIS Tools Using Coverage Metrics Specification-Based Test Program Generation for ARM VMSAv8-64 Memory Management Units Performance of a SystemVerilog Sudoku Solver with VCS Novel MC/DC Coverage Test Sets Generation Algorithm, and MC/DC Design Fault Detection Strength Insights Hybrid Post Silicon Validation Methodology for Layerscape SoCs involving Secure Boot: Boot (Secure & Non-secure) and Kernel Integration with Randomized Test
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1