Pub Date: 2009-10-04 DOI: 10.1109/ICCD.2009.5413148
Ravikishore Gandikota, D. Blaauw, D. Sylvester
Process-induced variations in the interconnect capacitance and resistance have resulted in significant uncertainty in the interconnect delay. In this work, we propose a new method to compute the interconnect corner considering coupling noise due to simultaneous switching of aggressors. In prior approaches, the interconnect corners were computed under the assumption that the aggressor nets are not switching and no coupling noise is injected on the victim net. In this paper, we first show that the interconnect corners obtained under such assumptions could in reality be much different from the true interconnect corner and could therefore result in optimistic delay analysis, particularly for fast-path analysis performed to check hold time violations. We also show that in some cases, the interconnect corner may not lie at an extreme point of the process variation range. In this work, we use the Elmore delay metric to efficiently search for the correct interconnect corner of the victim stage considering delay noise. We then show experimental results to verify the effectiveness of our proposed approach and demonstrate that the traditional approaches of computing the interconnect corners could lead to errors of up to 60% on a net-by-net basis.
Title: Interconnect performance corners considering crosstalk noise. Published in: 2009 IEEE International Conference on Computer Design.
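The abstract relies on the Elmore delay metric to compare candidate corners. The following is a minimal sketch of that metric for a uniform RC ladder, assuming the coupling to a switching aggressor can be folded into the victim's capacitance with a Miller factor (0 for an aggressor switching in the same direction, 1 for a quiet aggressor, 2 for the opposite direction). The Miller-factor model, the function names, and the uniform-ladder topology are illustrative assumptions, not the authors' formulation.

```python
def elmore_delay(resistances, caps):
    """Elmore delay at the far end of an RC ladder.

    resistances[i] is the segment resistance feeding node i and
    caps[i] is the total capacitance lumped at node i.
    """
    delay = 0.0
    downstream = sum(caps)  # capacitance charged through the current segment
    for r, c in zip(resistances, caps):
        delay += r * downstream
        downstream -= c
    return delay


def victim_delay(r_seg, c_ground, c_couple, miller, n=4):
    """Victim delay with coupling folded in via a Miller factor (assumed model)."""
    caps = [c_ground + miller * c_couple] * n
    return elmore_delay([r_seg] * n, caps)


if __name__ == "__main__":
    # Hypothetical per-segment values, chosen only to show the trend.
    for miller in (0, 1, 2):
        print(miller, victim_delay(r_seg=10.0, c_ground=1e-15, c_couple=2e-15, miller=miller))
```

Under this simplified model, the fast-path (hold) case corresponds to the smallest Miller factor, which is why ignoring aggressor switching can make a hold analysis optimistic.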
Pub Date: 2009-10-04 DOI: 10.1109/ICCD.2009.5413152
Jongyoon Jung, Taewhan Kim
This work proposes a new yield computation technique dedicated to HLS, an essential component of timing variation-aware HLS research. The SSTAs used by current timing variation-aware HLS techniques cannot support the following two critical factors at all: (i) non-Gaussian delay distribution of ‘module patterns’ used in scheduling and binding and (ii) correlation of timing variation between module patterns. Without considering these factors, however, the synthesis results would be far less accurate in timing and very likely to fail timing closure. Even though there are advances at the logic level for SSTAs that support (i) and (ii), the manipulation and computation of (i) and (ii) in the course of scheduling and binding in HLS are unique in that the logic level has no concepts of module sharing and performance yield computation. Specifically, we propose a novel yield computation technique to handle the non-Gaussian timing variation of module patterns, where the sum and max operations are closed-form formulas and the timing correlation between modules used in computing performance yield is preserved to the first-order form. Experimental results show that our synthesis using the proposed yield computation technique reduces the latency by 24.1% and 28.8% under 95% and 90% performance yield constraints, respectively, compared with conventional HLS. Further, it is confirmed that our synthesis results are near optimal, with less than 3.1% error on average.
Title: Timing variation-aware high-level synthesis considering accurate yield computation. Published in: 2009 IEEE International Conference on Computer Design.
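To make the terms concrete, performance yield is the probability that the design's latency meets a deadline, where latency is built from sum (modules in sequence) and max (parallel paths) operations over correlated delays. The sketch below estimates it by Monte Carlo sampling rather than by the closed-form sum and max formulas the abstract describes. The Gaussian delay model, the sensitivities, and the two-path structure are hypothetical values chosen for illustration.

```python
import random


def sample_latency(rng):
    """Draw one latency sample for a hypothetical two-path design."""
    g = rng.gauss(0.0, 1.0)  # shared global variation: correlates all modules

    def delay(nominal, sens_global, sens_local):
        # First-order model: nominal + global term + independent local term.
        return nominal + sens_global * g + sens_local * rng.gauss(0.0, 1.0)

    path_a = delay(10.0, 0.5, 0.3) + delay(12.0, 0.6, 0.3)  # sum: two modules in sequence
    path_b = delay(21.0, 1.0, 0.5)                          # one slower module
    return max(path_a, path_b)                              # max: paths run in parallel


def performance_yield(deadline, n=5000, seed=0):
    """Fraction of sampled latencies that meet the deadline."""
    rng = random.Random(seed)
    hits = sum(sample_latency(rng) <= deadline for _ in range(n))
    return hits / n


if __name__ == "__main__":
    for deadline in (22.0, 24.0, 26.0):
        print(deadline, performance_yield(deadline))
```

Sampling of this kind is typically used as a reference against which an analytical yield computation is checked; it is too slow to call inside a scheduling and binding loop, which is the motivation for closed-form operations.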
Pub Date: 2009-10-04 DOI: 10.1109/ICCD.2009.5413136
A. Chatterjee, Donghoon Han, Vishwanath Natarajan, S. Devarakond, Shreyas Sen, H. Choi, R. Senguttuvan, S. Bhattacharya, A. Goyal, Deuk Lee, M. Swaminathan
Design and test of high-speed mixed-signal/RF circuits and systems is undergoing a transformation due to the effects of process variations stemming from the use of scaled CMOS technologies that result in significant yield loss. To this effect, postmanufacture tuning for yield recovery is now a necessity for many high-speed electronic circuits and systems and is typically driven by iterative test-and-tune procedures. Such procedures create new challenges for manufacturing test and built-in self-test of advanced mixed-signal/RF systems. In this paper, key test challenges are discussed and promising solutions are presented in the hope that it will be possible to design, manufacture and test “truly self-healing” systems in the near future.
Title: Iterative built-in testing and tuning of mixed-signal/RF systems. Published in: 2009 IEEE International Conference on Computer Design.
Pub Date: 2009-10-04 DOI: 10.1109/ICCD.2009.5413182
N. Shimizu
In this paper, I introduce projects by my students and me to reincarnate historic systems on FPGAs. Our projects are neither replicas nor paper models of historic systems, but reorganized, working systems on FPGAs built with a novel and progressive design methodology. By progressive I mean under development: I developed the methodology and tools, and I am still improving them frequently as I use them myself. I also introduce the design methodology and tools used in these projects.
Title: Reincarnate historic systems on FPGA with novel design methodology. Published in: 2009 IEEE International Conference on Computer Design.
Pub Date: 2009-10-04 DOI: 10.1109/ICCD.2009.5413131
Fahad Ahmed, L. Milor
Scaling of device sizes has reduced gate oxide thickness to a few atomic layers, increasing the vulnerability of the gate oxide to breakdown. During breakdown, devices go through a gradual wearout process after an initial gate leakage increase, leading to device failure. It is proposed that if wearout can be monitored, cache arrays with failing cells can be reliably operated after reconfiguration, given available memory redundancy. Using experimentally verified gate oxide breakdown models, a detailed analysis of the effect of progressive gate oxide breakdown on the performance of a conventional 6T SRAM cell is presented for a 45 nm predictive technology. The DC margin trends (Read, Write and Retention) and access times (Read and Write) during wearout are analyzed, and a cell breakdown point due to degradation in each of these parameters is defined. A combination of these results is used to formulate a practical definition for the hard-breakdown point of a cell. Using an on-chip PVT (process, voltage, and temperature) tolerant monitoring scheme, it is shown that gradual wearout in SRAM cells due to gate oxide breakdown is detectable, and that cell failure can be predicted before it occurs.
Title: Reliable cache design with detection of gate oxide breakdown using BIST. Published in: 2009 IEEE International Conference on Computer Design.
Pub Date: 2009-10-04 DOI: 10.1109/ICCD.2009.5413149
Yusuke Tanaka, H. Ando
Two-step physical register deallocation (TSD) is an architectural scheme that enhances memory-level parallelism (MLP) by pre-executing instructions. Ideally, TSD allows the MLP available with an unlimited number of physical registers to be exploited, and consequently only a small register file is necessary for MLP. In practice, however, the amount of exploitable MLP is limited, because there are cases where pre-execution is not performed or its timing is delayed. This is caused by data dependencies among the pre-executed instructions. This paper proposes the use of value prediction to solve these problems. Our use of value prediction has an advantage over its conventional use for enhancing ILP: there is no need to recover from misspeculation. Our evaluation results using the SPECfp2000 benchmarks show that our scheme can achieve performance equivalent to that of the previous TSD scheme without value prediction, with 75% of the register file size.
Title: Reducing register file size through instruction pre-execution enhanced by value prediction. Published in: 2009 IEEE International Conference on Computer Design.
Pub Date: 2009-10-04 DOI: 10.1109/ICCD.2009.5413168
R. Kumar, Kalyana C. Bollapalli, Rajesh Garg, Tarun Soni, S. Khatri
Delay faults are frequently encountered in nanometer technologies. Therefore, it is critical to detect these faults during factory test. Testing for a delay fault requires the application of a pair of test vectors in an at-speed manner. To maximize the delay fault detection capability, it is desired that the vectors in this pair are independent. Independent vector pairs cannot always be applied to a circuit implemented with standard scan design approaches. However, this can be achieved by using enhanced scan flip-flops, which store two bits of data. This paper has two contributions. First, we develop a pulsed flip-flop (PFF) design. Second, we present an enhanced scan flip-flop design based on our PFF circuit. We have compared the performance of our pulse-based flip-flop with recently published pulse-based flip-flop designs, as well as a traditional master-slave D flip-flop. Our PFF shows significant improvements in power and timing compared to the other designs. Our pulse-based enhanced scan flip-flop (PESFF) has 13% lower power dissipation and 26% better timing than a conventional D flip-flop based enhanced scan flip-flop (DESFF). The layout area of our PESFF is 5.2% smaller than the DESFF. Monte Carlo simulations demonstrate that our design is more robust to process variations than the DESFF.
Title: A robust pulsed flip-flop and its use in enhanced scan design. Published in: 2009 IEEE International Conference on Computer Design.
Pub Date: 2009-10-04 DOI: 10.1109/ICCD.2009.5413185
Nada Amin, W. Thies, Saman P. Amarasinghe
Microfluidic chips are emerging as a powerful platform for automating biology experiments. As it becomes possible to integrate tens of thousands of components on a single chip, researchers will require design automation tools to push the scale and complexity of their designs to match the capabilities of the substrate. However, to date such tools have focused only on droplet-based devices, leaving out the popular class of chips that are based on multilayer soft lithography. In this paper, we develop design automation techniques for microfluidic chips based on multilayer soft lithography. We focus our attention on the control layer, which is driven by pressure actuators to invoke the desired flows on chip. We present a language in which designers can specify the Instruction Set Architecture (ISA) of a microfluidic device. Given an ISA, we automatically infer the locations of valves needed to implement the ISA. We also present novel algorithms for minimizing the number of control lines needed to drive the valves, as well as for routing valves to control ports while admitting sharing between the control lines. To the microfluidic community, we offer a free computer-aided design tool, Micado, which implements a subset of our algorithms as a practical plug-in to AutoCAD. Micado is being used successfully by microfluidic designers. We demonstrate its performance on three realistic chips.
Title: Computer-aided design for microfluidic chips based on multilayer soft lithography. Published in: 2009 IEEE International Conference on Computer Design.
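One simple way to see how control lines can be shared: valves that take the same open or closed state in every instruction of the ISA never need to be driven independently, so they can hang off one control line. The sketch below groups valves by that criterion. It is an illustrative simplification, not the algorithm from the paper or from Micado; the ISA representation and the function name are hypothetical, and every instruction is assumed to specify a state for every valve.

```python
from collections import defaultdict


def group_valves_by_state(isa):
    """Group valves that behave identically across all instructions.

    isa maps an instruction name to a dict of valve -> "open" or "closed".
    Each returned group could, in this simplified view, share one control line.
    """
    instructions = sorted(isa)
    valves = sorted({v for states in isa.values() for v in states})
    groups = defaultdict(list)
    for valve in valves:
        # The valve's state in each instruction, in a fixed instruction order.
        signature = tuple(isa[instr][valve] for instr in instructions)
        groups[signature].append(valve)
    return sorted(groups.values())


if __name__ == "__main__":
    isa = {
        "mix":   {"v1": "open",   "v2": "open",   "v3": "closed"},
        "flush": {"v1": "closed", "v2": "closed", "v3": "open"},
    }
    print(group_valves_by_state(isa))
```

A fuller treatment would also exploit don't-care states, where a valve's position is irrelevant to some instruction; that turns the grouping into a compatibility problem rather than an exact-match one.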
Pub Date: 2009-10-04 DOI: 10.1109/ICCD.2009.5413132
S. Gupta, Amin Ansari, Shuguang Feng, S. Mahlke
With growing semiconductor integration, the reliability of individual transistors is expected to rapidly decline in future technology generations. In such a scenario, processors would need to be equipped with fault tolerance mechanisms to tolerate in-field silicon defects. Periodic online testing is a popular technique to detect such failures; however, it tends to impose a heavy testing penalty. In this paper, we propose an adaptive online testing framework to significantly reduce the testing overhead. The proposed approach is unique in its ability to assess the hardware health and apply suitably detailed tests. Thus, a significant chunk of the testing time can be saved for the healthy components. We further extend the framework to work with the StageNet CMP fabric, which provides the flexibility to group together pipeline stages with similar health conditions, thereby reducing the overall testing burden. For a modest 2.6% sensor area overhead, the proposed scheme was able to achieve an 80% reduction in software test instructions over the lifetime of a 16-core CMP.
Title: Adaptive online testing for efficient hard fault detection. Published in: 2009 IEEE International Conference on Computer Design.
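The core idea, matching test detail to assessed hardware health, can be sketched as a lookup from a health reading to a test level. The thresholds, level names, and instruction counts below are hypothetical placeholders, not values from the paper, and the normalized health score stands in for whatever the on-chip sensors actually report.

```python
# Hypothetical cost, in software test instructions, of each test level.
TEST_COST = {"light": 1_000, "medium": 10_000, "thorough": 100_000}


def select_test_level(health):
    """Map a normalized health reading (1.0 = like new, 0.0 = worn out) to a test level."""
    if not 0.0 <= health <= 1.0:
        raise ValueError("health must be in [0, 1]")
    if health >= 0.8:  # assumed threshold
        return "light"
    if health >= 0.5:  # assumed threshold
        return "medium"
    return "thorough"


def total_test_instructions(health_by_component):
    """Test instructions spent in one testing round across all components."""
    return sum(TEST_COST[select_test_level(h)] for h in health_by_component.values())


if __name__ == "__main__":
    readings = {"fetch": 0.9, "decode": 0.6, "execute": 0.3}
    print(total_test_instructions(readings))
```

Compared with running the thorough test on every component each round, the saving comes entirely from the healthy components, which is the effect the abstract attributes to the adaptive framework.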
Pub Date: 2009-10-04 DOI: 10.1109/ICCD.2009.5413157
A. M. Gharehbaghi, M. Fujita
This paper presents a debug method for system communications in post-silicon verification. First, we extract transaction sequences at run-time using on-chip circuits and store them in a trace buffer. Then, we read the stored transactions and analyze them with software. The analysis software tries to find certain patterns in the extracted transactions that are defined by our transaction debug pattern specification language (TDPSL). We have also defined a number of standard patterns for common communication problems such as race and deadlock in TDPSL. To show the feasibility of the method, it is applied to a number of on-chip buses. It is shown that the area overhead of the method is very low. We have also implemented the analysis software and shown that it is memory-efficient, scalable, and effective at finding bugs. The proposed method can also be applied to fault analysis, including transient faults.
Title: Transaction-based debugging of system-on-chips with patterns. Published in: 2009 IEEE International Conference on Computer Design.
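The offline half of this flow, scanning stored transactions for a problem pattern, can be illustrated with a small sketch. It does not use TDPSL, whose syntax is not given here; the pattern is hard-coded in Python instead. The transaction fields, the notion of a race as two writes to one address from different masters within a time window, and all names are assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical shape of one trace-buffer entry.
Transaction = namedtuple("Transaction", "time master op addr")


def find_write_races(trace, window):
    """Return pairs of writes to the same address by different masters
    that occur within `window` time units of each other.

    `trace` must be ordered by time, as it would be when read from a buffer.
    """
    races = []
    last_write = {}  # addr -> most recent write transaction seen
    for t in trace:
        if t.op != "W":
            continue
        prev = last_write.get(t.addr)
        if prev is not None and prev.master != t.master and t.time - prev.time <= window:
            races.append((prev, t))
        last_write[t.addr] = t
    return races


if __name__ == "__main__":
    trace = [
        Transaction(0, "cpu", "W", 0x100),
        Transaction(2, "dma", "W", 0x100),  # close to the cpu write: flagged
        Transaction(3, "cpu", "R", 0x100),
        Transaction(50, "cpu", "W", 0x100), # far from the dma write: not flagged
    ]
    for first, second in find_write_races(trace, window=5):
        print(first, second)
```

The scan keeps only the latest write per address, so memory grows with the number of distinct addresses rather than with trace length.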