A Topological Approach to Hardware Bug Triage

Rico Angell, Ben Oztalay, A. DeOrio

2015 16th International Workshop on Microprocessor and SOC Test and Verification (MTV), December 2015. DOI: 10.1109/MTV.2015.10 (https://doi.org/10.1109/MTV.2015.10)

Verification is a critical bottleneck in the time to market of a new digital design. As complexity continues to increase, post-silicon validation shoulders an increasing share of the verification/validation effort. Post-silicon validation is burdened by large volumes of test failures, and is further complicated by root cause bugs that each manifest in multiple test failures. At present, these failures are prioritized and assigned to validation engineers in an ad hoc fashion. When multiple failures caused by the same root cause bug are debugged by multiple engineers at the same time, scarce, time-critical engineering resources are wasted. Our scalable bug triage technique begins with a database of test failures. It extracts defining features from the failure reports using a novel, topology-aware approach based on graph partitioning. It then leverages unsupervised machine learning to extract the structure of the failures, identifying groups of failures that are likely to be the result of a common root cause. With our technique, related failures can be debugged as a group rather than individually. Additionally, we propose a metric, Unique Debugging Instances (UDI), for measuring the verification efficiency gained from bug triage. We evaluated our approach on the industrial-size OpenSPARC T2 design with a set of injected bugs, and found that it increased average verification efficiency by 243%, at a 99% confidence level.
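
To make the pipeline described above more concrete, the following is a minimal sketch of the two stages: turning each failure report into a feature vector over partitions of the design, then grouping similar vectors without labels. It is not the authors' implementation. The signal names, the `PARTITION_OF` map (standing in for the output of a graph partitioner), the cosine-similarity threshold, and the simple single-link grouping are all assumptions chosen for illustration; the abstract does not specify which features or which clustering algorithm are used, and UDI is not computed here because its definition is not given in this text.

```python
from collections import Counter
from math import sqrt

# Hypothetical result of partitioning the design graph: signal name -> partition id.
# A real flow would derive this from the netlist; here it is hard-coded.
PARTITION_OF = {
    "lsu.st_valid": 0, "lsu.ld_miss": 0,
    "exu.alu_out": 1, "exu.bypass": 1,
    "ifu.pc_next": 2,
}
N_PARTS = 3


def features(mismatched_signals):
    """Count how many mismatching signals of one failure fall in each partition."""
    counts = Counter(PARTITION_OF[s] for s in mismatched_signals)
    return [counts.get(p, 0) for p in range(N_PARTS)]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def group_failures(vectors, threshold=0.8):
    """Single-link grouping: failures whose similarity reaches the
    threshold end up in the same group (union-find over all pairs)."""
    parent = list(range(len(vectors)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            if cosine(vectors[i], vectors[j]) >= threshold:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(vectors)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Four made-up failure reports, each a list of mismatching signals.
FAILURES = [
    ["lsu.st_valid", "lsu.ld_miss"],
    ["lsu.ld_miss"],
    ["exu.alu_out", "exu.bypass"],
    ["exu.bypass", "ifu.pc_next"],
]

vectors = [features(f) for f in FAILURES]
groups = group_failures(vectors)
print(groups)
```

In this toy input, failures 0 and 1 touch only the same partition and are grouped, so one engineer could debug them together, while failures 2 and 3 stay separate at the chosen threshold. The all-pairs comparison is quadratic in the number of failures; a scalable system would need a different clustering strategy.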