2018 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT): Latest Articles

A Runtime Fault-Tolerant Routing Scheme for Partially Connected 3D Networks-on-Chip
A. Coelho, A. Charif, N. Zergainoh, R. Velazco
Three-dimensional Networks-on-Chip (3D-NoCs) have emerged as an effective solution to the scalability and latency issues in modern complex Systems-on-Chip. Through-Silicon Vias (TSVs) are usually adopted as a viable technology enabling vertical connections among NoC layers. However, TSV-based architectures typically exhibit high vulnerability to transient and permanent faults, calling for robust routing solutions capable of sustaining operation under unpredictable failure patterns. In this paper, we introduce a complete routing solution that guarantees 100% packet delivery under an unconstrained set of runtime and permanent vertical link failures. This scheme features a baseline fully-connected low-latency deadlock-free routing algorithm, and a runtime mechanism to dynamically and progressively reconfigure the network without any packet loss. Simulation results demonstrate the effectiveness of our approach in terms of performance and reliability when compared with the state of the art. Furthermore, hardware synthesis performed using a commercial 28 nm technology library shows a reasonable area and power overhead with respect to the non-fault-tolerant baseline.
DOI: https://doi.org/10.1109/DFT.2018.8602971 (published 2018-10-01)
Citations: 3
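To make the vertical-link problem in the routing abstract above more concrete, here is a minimal sketch of choosing a TSV "elevator" in a partially connected 3D mesh while skipping links that are marked faulty. It is a generic nearest-healthy-elevator heuristic, not the algorithm from the paper; the function name `pick_elevator`, the coordinates, and the fault set are all assumptions made for illustration.

```python
from typing import Iterable, Optional, Set, Tuple

Coord = Tuple[int, int]  # (x, y) position of a router within one layer


def pick_elevator(src: Coord, dst: Coord,
                  elevators: Iterable[Coord],
                  faulty: Set[Coord]) -> Optional[Coord]:
    """Return the healthy elevator with the smallest XY detour, or None."""
    def manhattan(a: Coord, b: Coord) -> int:
        return abs(a[0] - b[0]) + abs(a[1] - b[1])

    healthy = [e for e in elevators if e not in faulty]
    if not healthy:
        return None  # no usable vertical link between the layers
    # Cost = hops to reach the elevator on the source layer plus hops from
    # the elevator to the destination on the target layer; ties are broken
    # by coordinate order so the choice is deterministic.
    return min(healthy,
               key=lambda e: (manhattan(src, e) + manhattan(e, dst), e))


# Hypothetical 4x4 layers with three elevators.
elevators = [(0, 0), (2, 1), (3, 3)]
choice_ok = pick_elevator((1, 1), (3, 1), elevators, faulty=set())
choice_after_fault = pick_elevator((1, 1), (3, 1), elevators, faulty={(2, 1)})
```

When the closest elevator fails, the packet is simply redirected to the next-cheapest healthy one. A real scheme also has to keep the network deadlock-free and avoid losing packets already in flight, which this sketch does not attempt.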
Evaluating the Resilience of Parallel Applications
Mark Wilkening, Fritz G. Previlon, D. Kaeli, S. Gurumurthi, Steven E. Raasch, Vilas Sridharan
Reliability is a significant design constraint for supercomputers and large-scale data centers. Modeling the effects of faults on applications targeted to such systems allows system architects and software designers to provision resilience features that improve the fidelity of results and reduce runtimes. In this paper, we propose mechanisms to improve existing techniques for modeling the effect of transient faults on realistic applications. First, we extend the existing Program Vulnerability Factor (PVF) metric to model multi-threaded applications. Then, we demonstrate how to measure the multi-threaded PVF of an application in simulation and introduce the ability to account for software detection of hardware faults, differentiating faults that cause detected, uncorrected errors (DUE) from faults that cause silent data corruption (SDC).
DOI: https://doi.org/10.1109/DFT.2018.8602987 (published 2018-10-01)
Citations: 4
Performance-Based and Aging-Aware Resource Allocation for Concurrent GPU Applications
Zois-Gerasimos Tasoulas, Ryan Guss, Iraklis Anagnostopoulos
GPUs are an important part of the effort to overcome performance thresholds and unlock the true potential of computing, as they offer increased computational capabilities and are cost-efficient. Until now, GPUs have been designed to execute one application at a time, so the field of concurrent GPU applications has not been exhaustively explored. When multiple applications of different types, e.g., compute- or memory-intensive, are executed concurrently on the same platform, significant performance degradation and imbalances in component aging may occur. These imbalances can lead to weak system reliability, further performance degradation, and a shorter time to failure. In this paper, we propose a resource allocation algorithm that mitigates aging imbalances without adding overhead during execution, limiting the aging imbalance among Streaming Multiprocessors (SMs) to a standard deviation of 0.4%. Additionally, the proposed algorithm improves SM allocation for each application, achieving up to 33% higher throughput.
DOI: https://doi.org/10.1109/DFT.2018.8602850 (published 2018-10-01)
Citations: 2
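The GPU abstract above measures aging imbalance as a standard deviation across SMs. The sketch below shows one way such a metric could be computed, together with a naive least-aged-first selection. The aging values, the helper names, and the selection rule are assumptions for illustration only and are not taken from the paper.

```python
import statistics
from typing import List


def aging_imbalance(aging: List[float]) -> float:
    """Population standard deviation of per-SM aging values."""
    return statistics.pstdev(aging)


def least_aged(aging: List[float], count: int) -> List[int]:
    """Indices of the `count` least-aged SMs, returned in index order."""
    order = sorted(range(len(aging)), key=lambda i: (aging[i], i))
    return sorted(order[:count])


# Hypothetical aging percentages for four SMs.
aging = [2.0, 1.0, 4.0, 1.0]
imbalance = aging_imbalance(aging)
chosen = least_aged(aging, 2)
```

Steering new work toward the least-aged SMs tends to pull the aging values together over time, which lowers the standard deviation. A practical allocator would also need to weigh each application's performance needs, as the abstract notes.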
A Method to Model Statistical Path Delays for Accurate Defect Coverage
Pavan Kumar Javvaji, S. Tragoudas
The statistical delay of a path is traditionally modeled as a Gaussian random variable, assuming that the path is always sensitized by a test pattern. However, its sensitization in various circuit instances varies among its test patterns, and the pattern-induced delay is non-Gaussian. It is therefore modeled using probability mass functions. Defect coverage is improved by test pattern selection using machine learning. Experimental results demonstrate accuracy in defect coverage when compared to existing methods.
DOI: https://doi.org/10.1109/DFT.2018.8602962 (published 2018-10-01)
Citations: 1
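The path-delay abstract above represents delays with probability mass functions rather than a Gaussian. As a small illustration of working with such PMFs, the sketch below combines two discrete gate-delay distributions into a path-delay distribution by convolution. Independence between the two gates, the delay values, and the helper names are assumptions for the example, not details from the paper.

```python
from collections import defaultdict
from typing import Dict

Pmf = Dict[int, float]  # delay in picoseconds -> probability


def convolve(a: Pmf, b: Pmf) -> Pmf:
    """PMF of the sum of two independent discrete delays."""
    out: Dict[int, float] = defaultdict(float)
    for da, pa in a.items():
        for db, pb in b.items():
            out[da + db] += pa * pb
    return dict(out)


def prob_exceeds(pmf: Pmf, limit: int) -> float:
    """Probability that the delay is strictly greater than `limit`."""
    return sum(p for d, p in pmf.items() if d > limit)


# Hypothetical two-gate path.
gate1 = {10: 0.5, 12: 0.5}
gate2 = {20: 0.75, 25: 0.25}
path = convolve(gate1, gate2)
late = prob_exceeds(path, 33)  # chance the path misses a 33 ps budget
```

Unlike a Gaussian fit, the resulting PMF keeps the two separate delay clusters (30 to 32 ps and 35 to 37 ps), which is the kind of shape a single mean and variance would blur.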
State Recovery for Coarse-Grain TMR Designs in FPGAs Using Partial Reconfiguration
Markus Schütz, A. Steininger, F. Huemer, J. Lechner
The operation of field-programmable gate arrays (FPGAs) in harsh environments like space entails the need for suitable fault-tolerance techniques, of which Triple Modular Redundancy (TMR) is the most commonly deployed. While TMR is undoubtedly effective in masking faults, state recovery remains a problematic issue: fine-grain TMR allows safe recovery, but incurs prohibitive area and performance penalties. In contrast, coarse-grain TMR has little overhead, but cannot safely provide recovery without roll-back or reset. We use the dynamic reconfiguration feature of modern FPGAs to augment an initially coarse-grain TMR with the ability to temporarily load a fine-grain TMR design for forward state recovery. We can therefore seamlessly resume correct (fully redundant) operation after data faults as well as configuration faults in the FPGA. As a proof of concept, the paper presents a showcase design and discusses distinctive properties of this new approach.
DOI: https://doi.org/10.1109/DFT.2018.8602984 (published 2018-10-01)
Citations: 1
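For readers unfamiliar with the masking step that the TMR abstract above takes for granted, the following is a minimal software sketch of a bitwise 2-out-of-3 majority voter. It only models the voter; the replica values and function names are illustrative assumptions, and nothing here reproduces the paper's reconfiguration-based recovery.

```python
from typing import List


def majority(a: int, b: int, c: int) -> int:
    """Bitwise 2-out-of-3 majority vote over three replica outputs."""
    return (a & b) | (a & c) | (b & c)


def disagreeing(a: int, b: int, c: int) -> List[int]:
    """Indices of replicas whose output differs from the voted value."""
    voted = majority(a, b, c)
    return [i for i, v in enumerate((a, b, c)) if v != voted]


golden = 0b10110010
upset = golden ^ 0b00001000        # single bit flip in replica 1
voted = majority(golden, upset, golden)
bad = disagreeing(golden, upset, golden)
```

The voter hides the upset at the output, but replica 1 still holds a wrong internal state. Bringing that replica back in sync without a roll-back or reset is the state-recovery problem the abstract addresses.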
Effects of Voltage and Temperature Variations on the Electrical Masking Capability of Sub-65 nm Combinational Logic Circuits
Semiu A. Olowogemo, W. H. Robinson, D. Limbrick
Single Event Transients (SETs) induced by radiation strikes on an integrated circuit (IC) can be masked electrically by logic gates while propagating through the circuit towards a storage element (e.g., a flip-flop). With the continuous scaling of CMOS technology, there are simultaneous reductions in voltage, cell size, and internal capacitances that impact the properties of the gates. The combined impact causes a reduction in the electrical masking capability of the gates, which means that transients are more likely to reach the storage elements. In addition, variations in voltage and temperature could enhance the propagation of transients towards the storage elements. This paper describes the effects of temperature and voltage variations on the electrical masking of sub-65 nm combinational logic circuits. The worst-case temperature increases the SET pulse width by 57.6%. The worst-case voltage increases the SET pulse width by 51.2%. The pulses are therefore less likely to be masked electrically.
DOI: https://doi.org/10.1109/DFT.2018.8602975 (published 2018-10-01)
Citations: 2
Fast Dynamic Device Authentication Based on Lorenz Chaotic Systems
Lake Bu, Hai Cheng, M. Kinsy
Chaotic systems, such as Lorenz systems or logistic functions, are known for their rapid divergence property: even the smallest change in the initial condition leads to vastly different outputs. This property renders the short-term behavior, i.e., the output values, of these systems very hard to predict. Because of this divergence feature, Lorenz systems are often used in cryptographic applications, particularly in key agreement protocols and encryption. Yet these chaotic systems do exhibit long-term deterministic behavior, i.e., they fit into a known shape over time. In this work, we propose a fast dynamic device authentication scheme that leverages both the divergence and convergence features of Lorenz systems. In the scheme, a device proves its legitimacy by showing authentication tags belonging to a predetermined trajectory of a given Lorenz chaotic system. The security of the proposed technique resides in the fact that the short-range function output values are hard for an attacker to predict, but easy for a verifier to validate because the function is deterministic. In addition, in a multi-verifier scenario such as a mobile phone switching among base stations, the device does not have to re-initiate a separate authentication procedure each time. Instead, it just needs to prove the consistency of its chaotic behavior in an iterative manner, making the procedure very efficient in terms of execution time and computing resources.
DOI: https://doi.org/10.1109/DFT.2018.8602986 (published 2018-10-01)
Citations: 1
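The two properties the Lorenz abstract above relies on, determinism for whoever knows the exact initial state and rapid divergence for anyone who is slightly off, can be seen in a short numerical experiment. The sketch below integrates the Lorenz equations with a classical Runge-Kutta step. The parameter values (sigma = 10, rho = 28, beta = 8/3), the step size, the initial states, and the "device/verifier/attacker" labels are assumptions chosen for illustration; this is not the paper's authentication protocol.

```python
import math
from typing import List, Tuple

State = Tuple[float, float, float]


def lorenz(s: State, sigma: float = 10.0, rho: float = 28.0,
           beta: float = 8.0 / 3.0) -> State:
    """Right-hand side of the Lorenz equations."""
    x, y, z = s
    return (sigma * (y - x), x * (rho - z) - y, x * y - beta * z)


def rk4_step(s: State, dt: float) -> State:
    """Advance the state by one classical fourth-order Runge-Kutta step."""
    def nudge(k: State, h: float) -> State:
        return (s[0] + h * k[0], s[1] + h * k[1], s[2] + h * k[2])

    k1 = lorenz(s)
    k2 = lorenz(nudge(k1, dt / 2))
    k3 = lorenz(nudge(k2, dt / 2))
    k4 = lorenz(nudge(k3, dt))
    x, y, z = (s[i] + dt / 6 * (k1[i] + 2 * k2[i] + 2 * k3[i] + k4[i])
               for i in range(3))
    return (x, y, z)


def trajectory(s0: State, steps: int, dt: float = 0.01) -> List[State]:
    """List of states visited from s0, including s0 itself."""
    out = [s0]
    for _ in range(steps):
        out.append(rk4_step(out[-1], dt))
    return out


device = trajectory((1.0, 1.0, 1.0), 5000)
verifier = trajectory((1.0, 1.0, 1.0), 5000)          # same secret state
attacker = trajectory((1.0 + 1e-8, 1.0, 1.0), 5000)   # tiny guessing error

early_gap = math.dist(device[10], attacker[10])
max_gap = max(math.dist(a, b) for a, b in zip(device, attacker))
```

The verifier reproduces the device's trajectory exactly, while the attacker's copy stays close for a few steps and then drifts apart by an amount comparable to the size of the attractor.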
Postprocessing Procedure for Reducing the Faulty Switching Activity of a Low-Power Test Set 降低低功耗测试集故障切换活动的后处理程序
I. Pomeranz
Low-power test generation procedures reduce the switching activity during functional capture cycles of scan-based tests in order to avoid overtesting of delay faults. The switching activity that these procedures address is the one in the fault-free circuit. Recently it was shown that excessive switching activity in faulty circuits can potentially cause test escapes. To avoid such situations, this paper describes a postprocessing procedure that reduces the switching activity of a low-power test set in faulty circuits. The main challenge that this procedure needs to address is the large number of faulty circuits for which the switching activity may be excessive. This challenge is addressed in this paper by reducing the fault-free switching activity in order to create a safety margin for an increased faulty switching activity. The safety margin is computed for every test individually. Experimental results for benchmark circuits demonstrate the ability of the procedure to eliminate excessive faulty switching activity for low-power test sets.
DOI: https://doi.org/10.1109/DFT.2018.8602967 (published 2018-10-01)
Citations: 1
Multiple Fault Detection in Nano Programmable Logic Arrays
P. Junsangsri, F. Lombardi
This paper presents a new method for go/no-go testing of nano programmable logic arrays. The basic configuration of an array made of passive and active interconnect resources (lines and switches) on two connected planes (AND and OR) is analyzed under a comprehensive multiple fault model. This model is applicable to production testing in nano manufacturing and considers faults (such as stuck-at and bridging faults) in the passive interconnect line structure as well as programming faults in the active resources (switching or crosspoint faults). The proposed method achieves full coverage in fault detection by configuring the array multiple times using a four-step procedure. Because the complexity of testing such a chip depends largely on the number of configuration rounds (often also referred to as programming phases) that the chip must undergo, the proposed method achieves a substantial reduction in production test time compared with previous techniques. Unlike previous techniques, whose complexity is a function of array size (i.e., quadratic in the dimension of the planes in the array), the proposed technique is shown to have complexity linear in the largest dimension of a plane in the nano array. Simulation results show that 100% detection is achieved and that the average number of configuration rounds needed for detection is significantly less than the upper bound predicted by the presented theory.
DOI: https://doi.org/10.1109/DFT.2018.8602985 (published 2018-10-01)
Citations: 0
MATS**: An On-Line Testing Approach for Reconfigurable Embedded Memories
Ludovica Bozzoli, L. Sterpone
Modern Field Programmable Gate Arrays (FPGAs) embed dedicated blocks for memories (BRAMs), digital signal processing (DSPs), and hardwired microprocessors alongside the reconfigurable logic array. This trend, coupled with Error Correction Code (ECC) mechanisms and Dynamic Partial Reconfiguration (DPR), makes these devices ideal candidates for mission-critical applications where high reliability is a strict requirement. Efficient in-field testing has therefore become a major concern. Unfortunately, typical on-line memory testing approaches are not fully optimized for the reconfigurable scenario; a suitable fault model should be considered in order to enhance the fault coverage and reduce test redundancy. In this work, we propose the MATS** algorithm, which reduces the execution time and optimizes the fault coverage with respect to the most popular March tests for embedded memories. Furthermore, MATS** is highly suitable for execution, even partial, in the brief time slots available within the device mission. Experimental results show that our approach is around 30% faster than state-of-the-art solutions while achieving the optimal fault coverage.
DOI: 10.1109/DFT.2018.8602934 (https://doi.org/10.1109/DFT.2018.8602934). Publication date: 2018-10-01. Venue: 2018 IEEE International Symposium on Defect and Fault Tolerance in VLSI and Nanotechnology Systems (DFT).
Citations: 3
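Note on March tests: the MATS** abstract compares against March tests without describing one. As background only, the sketch below runs the classic MATS+ test, {⇕(w0); ⇑(r0,w1); ⇓(r1,w0)}, against a simulated memory. It is not the MATS** algorithm, whose details are not given in the abstract. The `FaultyMemory` class, its `stuck_at` parameter, and the `mats_plus` function are hypothetical names introduced for illustration.

```python
from typing import Dict, List, Optional, Tuple


class FaultyMemory:
    """Hypothetical 1-bit-wide memory model with optional stuck-at cells."""

    def __init__(self, size: int, stuck_at: Optional[Dict[int, int]] = None):
        self.size = size
        self.cells = [0] * size
        # Maps an address to the value (0 or 1) that cell is stuck at.
        self.stuck_at = stuck_at or {}

    def write(self, addr: int, value: int) -> None:
        # A stuck-at cell ignores the written value.
        self.cells[addr] = self.stuck_at.get(addr, value)

    def read(self, addr: int) -> int:
        return self.stuck_at.get(addr, self.cells[addr])


def mats_plus(mem: FaultyMemory) -> List[Tuple[int, int, int]]:
    """Run MATS+ and return (address, expected, observed) for each mismatch."""
    failures: List[Tuple[int, int, int]] = []

    # Element 1, any address order: write 0 everywhere.
    for addr in range(mem.size):
        mem.write(addr, 0)

    # Element 2, ascending addresses: read 0, then write 1.
    for addr in range(mem.size):
        observed = mem.read(addr)
        if observed != 0:
            failures.append((addr, 0, observed))
        mem.write(addr, 1)

    # Element 3, descending addresses: read 1, then write 0.
    for addr in reversed(range(mem.size)):
        observed = mem.read(addr)
        if observed != 1:
            failures.append((addr, 1, observed))
        mem.write(addr, 0)

    return failures


if __name__ == "__main__":
    print(mats_plus(FaultyMemory(8)))
    print(mats_plus(FaultyMemory(8, stuck_at={3: 1})))
```

MATS+ performs five operations per cell (5n in total), so each of its three elements could in principle be scheduled in a separate idle slot. Whether MATS** partitions its work this way is not stated in the abstract.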