The rise of the utilization wall limits the number of transistors that can be powered on in a single chip and results in a large region of dark silicon. While this phenomenon has led to disruptive innovation in computation, little work has been done on Network-on-Chip (NoC) design. The NoC not only directly influences overall multi-core performance, but also consumes a significant portion of the total chip power. In this paper, we first reveal the challenges and opportunities of designing a power-efficient NoC in the dark silicon era. We then propose NoC-Sprinting: based on workload characteristics, it explores fine-grained sprinting that allows a chip to flexibly activate dark cores for instantaneous throughput improvement. In addition, it investigates topological/routing support and thermal-aware floorplanning for the sprinting process. Moreover, it builds an efficient network power-management scheme that can mitigate the dark silicon problem. Experiments on performance, power, and thermal analysis show that NoC-Sprinting can provide tremendous speedup and increase sprinting duration while significantly reducing chip power.
{"title":"NoC-sprinting: Interconnect for fine-grained sprinting in the dark silicon era","authors":"J. Zhan, Yuan Xie, Guangyu Sun","doi":"10.1145/2593069.2593165","DOIUrl":"https://doi.org/10.1145/2593069.2593165","journal":{"name":"2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC)"},"publicationDate":"2014-06-01"}
It has been shown that wide Single Instruction Multiple Data architectures (wide-SIMDs) can achieve high energy efficiency, especially in domains such as image and vision processing. In these and various other application domains, reduction is a frequently encountered operation, in which multiple input elements are combined into a single element by an associative operation, e.g. addition or multiplication. Many applications require reduction, such as partial histogram merging, matrix multiplication, and min/max-finding. Wide-SIMDs contain a large number of processing elements (PEs), which are in general connected by a minimal form of interconnect for scalability reasons. To efficiently support reduction operations on wide-SIMDs with such a minimal interconnect, we introduce two novel reduction algorithms that do not rely on complex communication networks or any dedicated hardware. The proposed approaches are compared with both dedicated hardware and other software solutions in terms of performance, area, and energy consumption. A practical case study demonstrates that the proposed software approach offers much better generality and flexibility at no additional hardware cost. Compared to a dedicated hardware adder tree, the proposed software approach saves 6.8% area with a performance penalty of only 6.5%.
{"title":"Reduction operator for wide-SIMDs reconsidered","authors":"Luc Waeijen, Dongrui She, H. Corporaal, Yifan He","doi":"10.1145/2593069.2593198","DOIUrl":"https://doi.org/10.1145/2593069.2593198","journal":{"name":"2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC)"},"publicationDate":"2014-06-01"}
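To make the reduction problem in the abstract above concrete, here is a minimal Python sketch of a generic tree reduction on a row of PEs that can only read from their immediate neighbour. It is not one of the paper's two algorithms, which the abstract does not describe; the function names and the edge handling (the last PE re-reads its own value) are assumptions made for illustration.

```python
def shift_left(values):
    """One neighbour-communication step: PE i reads the value held by PE i+1.

    The last PE has no right neighbour, so it keeps its own value
    (an assumption of this sketch; it never affects the result in PE 0).
    """
    return values[1:] + values[-1:]


def reduce_neighbour_only(values, op=lambda a, b: a + b):
    """Tree reduction using only neighbour shifts.

    Returns (result held by PE 0, number of shift steps used).
    The number of PEs must be a power of two.
    """
    n = len(values)
    assert n > 0 and n & (n - 1) == 0, "PE count must be a power of two"
    acc = list(values)
    shifts = 0
    stride = 1
    while stride < n:
        # Bring the partial result of PE i+stride to PE i, one hop at a time.
        moved = list(acc)
        for _ in range(stride):
            moved = shift_left(moved)
            shifts += 1
        # All PEs apply the operation in lock-step (SIMD style).
        acc = [op(a, b) for a, b in zip(acc, moved)]
        stride *= 2
    return acc[0], shifts


print(reduce_neighbour_only([1, 2, 3, 4, 5, 6, 7, 8]))  # (36, 7)
```

With N PEs this sketch needs log2(N) combine steps but N - 1 shift steps, which illustrates why the communication cost dominates on a minimal interconnect.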
Liang Shi, Kaijie Wu, Mengying Zhao, C. Xue, E. Sha
NAND flash memory has been widely applied in embedded systems, personal computer systems, and data centers. However, as flash memory develops through technology scaling and density improvement, its endurance has become a bottleneck. In this work, building on the relationship between data retention time and flash wear, we propose a retention trimming approach that trims data retention time based on the time intervals between data updates in order to reduce flash wear, which in turn improves endurance. Extensive experimental results show that the proposed technique achieves a significant reduction in flash wear.
{"title":"Retention trimming for wear reduction of flash memory storage systems","authors":"Liang Shi, Kaijie Wu, Mengying Zhao, C. Xue, E. Sha","doi":"10.1145/2593069.2593203","DOIUrl":"https://doi.org/10.1145/2593069.2593203","journal":{"name":"2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC)"},"publicationDate":"2014-06-01"}
Chun-Xun Lin, Chih-Hung Liu, I-Che Chen, D. T. Lee, Tsung-Yi Ho
Rapid growth in capacity makes flow-based microfluidic biochips a promising candidate for biochemical analysis because they can integrate more complex functions. However, as the number of components grows, the total length of flow channels between components must increase exponentially. Recent empirical studies show that long flow channels are vulnerable to blocking and leakage defects. Thus, it is desirable to minimize the total length of flow channels for robustness. Also, for timing-sensitive biochemical assays, an increase in the length of the longest flow channel delays assay completion and leads to fluid variation, thereby affecting the correctness of the outcome. The increasing number of components on the chip, including pre-placed components, makes the flow channel routing problem even more complicated. In this paper, we propose an efficient obstacle-avoiding rectilinear Steiner minimum tree algorithm for the flow channel routing problem in flow-based microfluidic biochips. Based on the concept of Kruskal's algorithm and formulating these considerations as a bi-criteria function, our algorithm is capable of simultaneously minimizing the total length and the longest length of the flow channels.
{"title":"An efficient bi-criteria flow channel routing algorithm for flow-based microfluidic biochips","authors":"Chun-Xun Lin, Chih-Hung Liu, I-Che Chen, D. T. Lee, Tsung-Yi Ho","doi":"10.1145/2593069.2593084","DOIUrl":"https://doi.org/10.1145/2593069.2593084","journal":{"name":"2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC)"},"publicationDate":"2014-06-01"}
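As a rough illustration of the two quantities the abstract above trades off, the following Python sketch runs plain Kruskal's algorithm over terminals with rectilinear (Manhattan) distances and then reports the total tree length and the longest path from a chosen source terminal. It is a simplification and not the paper's algorithm: it ignores obstacles, adds no Steiner points, and does not combine the two criteria into one cost. Reading "longest length" as the longest source-to-terminal path is an assumption, as are all function names.

```python
from itertools import combinations


def manhattan(p, q):
    """Rectilinear distance between two grid points."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def kruskal_rectilinear(points):
    """Spanning tree over the terminals via Kruskal's algorithm (no obstacles)."""
    parent = list(range(len(points)))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = sorted(
        (manhattan(points[i], points[j]), i, j)
        for i, j in combinations(range(len(points)), 2)
    )
    tree = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two components, so keep it
            parent[ri] = rj
            tree.append((i, j, w))
    return tree


def total_and_longest(points, tree, source=0):
    """Return (total edge length, longest source-to-terminal path length)."""
    adj = {i: [] for i in range(len(points))}
    for i, j, w in tree:
        adj[i].append((j, w))
        adj[j].append((i, w))
    dist = {source: 0}
    stack = [source]
    while stack:  # depth-first walk; paths in a tree are unique
        u = stack.pop()
        for v, w in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + w
                stack.append(v)
    return sum(w for _, _, w in tree), max(dist.values())


terminals = [(0, 0), (2, 0), (2, 3), (5, 3)]
print(total_and_longest(terminals, kruskal_rectilinear(terminals)))  # (8, 8)
```

A chain-shaped tree like the one in this example minimizes total length but maximizes the longest path, which is the tension a bi-criteria formulation has to resolve.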
Ifigeneia Apostolopoulou, Konstantis Daloukas, N. Evmorfopoulos, G. Stamoulis
The inverse of the inductance matrix (reluctance matrix) is amenable to sparsification to a much greater extent than the inductance matrix itself. However, the inversion and subsequent truncation of a large dense inductance matrix to obtain the sparse inverse is very time-consuming, and previously proposed window-based techniques cannot provide adequate accuracy. In this paper we propose a method for selective inversion of the inductance matrix to a prescribed sparsity ratio, which is also amenable to parallelization on modern architectures. Experimental results demonstrate its potential to provide efficient and accurate approximation of the reluctance matrix for simulation of large-scale RLC circuits.
{"title":"Selective inversion of inductance matrix for large-scale sparse RLC simulation","authors":"Ifigeneia Apostolopoulou, Konstantis Daloukas, N. Evmorfopoulos, G. Stamoulis","doi":"10.1145/2593069.2593213","DOIUrl":"https://doi.org/10.1145/2593069.2593213","journal":{"name":"2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC)"},"publicationDate":"2014-06-01"}
Automated emergency braking (AEB) systems are more important than ever in modern vehicles for assisting drivers in emergency driving situations. Most of them require fusion techniques for vehicle detection (camera and radar, or a stereo-vision system), which involve complicated algorithms and additional cost. This has made AEB systems less attractive to the market. This paper presents an automobile detection algorithm for AEB systems that uses a single camera. The algorithm consists of three main steps: background subtraction, thresholding, and inverted U-shape back wheel detection. Simulation in the MATLAB environment yields a detection rate of 87.25% and an accuracy of 78% for a 1080×1920 pixel input image, and a detection rate of 88.25% and an accuracy of 73.5% for a 480×640 pixel input image. The processing times achieved are 0.156 s and 0.0297 s, respectively.
{"title":"An automobile detection algorithm development for automated emergency braking system","authors":"L. Xia, Tran Duc Chung, K. A. A. Kassim","doi":"10.1145/2593069.2593083","DOIUrl":"https://doi.org/10.1145/2593069.2593083","journal":{"name":"2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC)"},"publicationDate":"2014-06-01"}
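The first two steps named in the abstract above, background subtraction and thresholding, can be sketched in a few lines of Python on grayscale images stored as nested lists. This is only an illustration under stated assumptions: the paper's implementation is in MATLAB and is not shown in the abstract, the threshold level of 50 is an arbitrary value chosen here, and the third step (inverted U-shape back wheel detection) is not attempted.

```python
def subtract_background(frame, background):
    """Per-pixel absolute difference between a frame and a background image."""
    return [
        [abs(f - b) for f, b in zip(frame_row, bg_row)]
        for frame_row, bg_row in zip(frame, background)
    ]


def threshold(image, level):
    """Binary mask: 1 where the pixel value exceeds `level`, else 0."""
    return [[1 if px > level else 0 for px in row] for row in image]


# Hypothetical 2x3 grayscale images (values 0..255).
background = [[10, 10, 10],
              [10, 10, 10]]
frame = [[12, 200, 10],
         [9, 180, 11]]

mask = threshold(subtract_background(frame, background), level=50)
print(mask)  # [[0, 1, 0], [0, 1, 0]]
```

Small differences caused by noise fall below the threshold, so only the column where the frame clearly departs from the background survives in the mask.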
Mladen Slijepcevic, Leonidas Kosmidis, J. Abella, E. Quiñones, F. Cazorla
Shared caches in multicores challenge Worst-Case Execution Time (WCET) estimation due to inter-task interference. Hardware and software cache partitioning address this issue, although they complicate data sharing among tasks as well as Operating System (OS) task scheduling and migration. In the context of Probabilistic Timing Analysis (PTA), time-randomised caches are used. We propose a new hardware mechanism to control inter-task interference in shared time-randomised caches without the need for any hardware or software partitioning. Our proposed mechanism effectively bounds inter-task interference by limiting the cache eviction frequency of each task, while providing tighter WCET estimates than cache partitioning algorithms. In a 4-core multicore processor setup, our proposal improves on cache partitioning by 56% in terms of guaranteed performance and 16% in terms of average performance.
{"title":"Time-analysable non-partitioned shared caches for real-time multicore systems","authors":"Mladen Slijepcevic, Leonidas Kosmidis, J. Abella, E. Quiñones, F. Cazorla","doi":"10.1145/2593069.2593235","DOIUrl":"https://doi.org/10.1145/2593069.2593235","journal":{"name":"2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC)"},"publicationDate":"2014-06-01"}
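The idea of bounding interference by limiting how often each task may evict can be illustrated with a simple per-task rate limiter. The abstract above does not describe the actual hardware mechanism, so the sketch below is one conceivable software model only: the class name, the `min_interval` parameter, and the rule "at most one eviction per task every `min_interval` cycles" are all assumptions.

```python
class EvictionLimiter:
    """Allow each task at most one eviction every `min_interval` cycles."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self.last_eviction = {}  # task id -> cycle of its last allowed eviction

    def may_evict(self, task, cycle):
        """Return True and record the eviction if the task is allowed to evict now."""
        last = self.last_eviction.get(task)
        if last is not None and cycle - last < self.min_interval:
            return False  # too soon: the request has to wait
        self.last_eviction[task] = cycle
        return True

    def max_evictions(self, window):
        """Upper bound on one task's evictions in any window of `window` cycles."""
        return -(-window // self.min_interval)  # ceiling division


limiter = EvictionLimiter(min_interval=10)
print(limiter.may_evict("A", 0))   # True
print(limiter.may_evict("A", 5))   # False
print(limiter.may_evict("B", 5))   # True
print(limiter.max_evictions(25))   # 3
```

Under this model, the number of evictions a task can inflict on others in a given window has a fixed upper bound that does not depend on what the task actually does, which is the kind of property a timing analysis can rely on.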
Dean Sullivan, J. Biggers, Guidong Zhu, Shaojie Zhang, Yier Jin
To address the concern that a complete detection scheme for effective hardware Trojan identification is lacking, we have designed an RTL security metric to evaluate the quality of IP cores (with the same or similar functionality) and to counter Trojan attacks at the pre-fabrication stages of the IP design flow. The proposed security metric is built on two criteria, from which a quantitative security value can be assigned to the target circuit: 1) distribution of controllability; 2) existence of rare events. The proposed metric, called FIGHT, is an automated tool with which malicious modifications to ICs and/or vulnerabilities of the IP core can be identified by monitoring both internal node controllability and the corresponding control value distribution, plotted as a histogram. Experiments on an RS232 module demonstrate our dual security criteria and show that the security of the IP module degrades upon hardware Trojan insertion.
{"title":"FIGHT-metric: Functional identification of gate-level hardware trustworthiness","authors":"Dean Sullivan, J. Biggers, Guidong Zhu, Shaojie Zhang, Yier Jin","doi":"10.1145/2593069.2596681","DOIUrl":"https://doi.org/10.1145/2593069.2596681","journal":{"name":"2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC)"},"publicationDate":"2014-06-01"}
The current Integrated Circuit (IC) development process raises security concerns about hardware Trojans, which are maliciously inserted to alter functional behavior or leak sensitive information. Most hardware Trojan detection techniques rely on a golden (trusted) IC against which to compare a suspected one. Hence they cannot be applied to designs using third-party Intellectual Property (IP) cores for which no golden IP is available. Moreover, due to the stealthy nature of hardware Trojans, no technique can guarantee a Trojan-free design after manufacturing test. As a result, Trojan detection and recovery at run time, acting as the last line of defense, is necessary, especially for mission-critical applications. In this paper, we propose design rules to assist run-time Trojan detection and fast recovery by exploiting the diversity of untrusted third-party IP cores. With these design rules, we present an optimization approach that minimizes the implementation cost in terms of the number of different IP cores used.
{"title":"High-level synthesis for run-time hardware Trojan detection and recovery","authors":"Xiaotong Cui, K. Ma, Liang Shi, Kaijie Wu","doi":"10.1145/2593069.2593150","DOIUrl":"https://doi.org/10.1145/2593069.2593150","journal":{"name":"2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC)"},"publicationDate":"2014-06-01"}
Faramarz Khosravi, Felix Reimann, M. Glaß, J. Teich
In recent years, reliability has become a major issue and objective during the design of embedded systems. Here, different techniques to increase reliability, such as hardware-/software-based redundancy or component hardening, are applied systematically during Design Space Exploration (DSE), aiming to achieve the highest reliability at the lowest possible cost. Existing approaches typically provide only reliability measures, e.g. failure rate or Mean-Time-To-Failure (MTTF), to the optimization engine, which gives the search little guidance on which parts of the implementation to change. As a remedy, this work proposes an efficient approach that (a) determines the importance of resources with respect to the system's reliability and (b) employs this knowledge as part of a local search to guide the optimization engine toward the components/design decisions to investigate. First, we propose a novel approach to derive Importance Measures (IMs) using a structural evaluation of Success Trees (STs). Since ST-based reliability analysis is already used for MTTF calculation, our approach comes at almost no overhead. Second, we enrich the global DSE with a local search. Here, we propose strategies guided by the IMs that directly change and enhance the implementation. In our experimental setup, the available measures to enhance reliability are the selection of hardening levels during resource allocation and software-based redundancy during task binding; as an example, the proposed local search considers the selected hardening levels. The results show that the proposed method outperforms a state-of-the-art approach regarding optimization quality, particularly in the search for highly reliable yet affordable implementations, at negligible runtime overhead.
{"title":"Multi-objective local-search optimization using reliability importance measuring","authors":"Faramarz Khosravi, Felix Reimann, M. Glaß, J. Teich","doi":"10.1145/2593069.2593164","DOIUrl":"https://doi.org/10.1145/2593069.2593164","journal":{"name":"2014 51st ACM/EDAC/IEEE Design Automation Conference (DAC)"},"publicationDate":"2014-06-01"}
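To show what an importance measure for a resource can look like, the Python sketch below computes the classical Birnbaum importance, i.e. the system reliability with a component assumed working minus the system reliability with it assumed failed. This is a stand-in chosen for illustration: the abstract above does not say which IMs the paper derives, and the paper evaluates Success Trees structurally, whereas this sketch simply enumerates all component states. The example system, its component names, and the reliability values are hypothetical.

```python
from itertools import product


def system_reliability(structure, rel, fixed=None):
    """Probability that `structure` evaluates to True.

    `structure` maps a dict {component: bool} to True (system works) or False.
    `rel` maps each component to its probability of working.
    `fixed` optionally pins some components to a known state.
    """
    fixed = fixed or {}
    free = [name for name in rel if name not in fixed]
    total = 0.0
    for states in product([True, False], repeat=len(free)):
        state = dict(zip(free, states), **fixed)
        prob = 1.0
        for name, up in zip(free, states):
            prob *= rel[name] if up else 1.0 - rel[name]
        if structure(state):
            total += prob
    return total


def birnbaum(structure, rel, name):
    """Birnbaum importance: R(system | name works) - R(system | name failed)."""
    return (system_reliability(structure, rel, {name: True})
            - system_reliability(structure, rel, {name: False}))


# Hypothetical system: one CPU in series with two redundant buses.
rel = {"cpu": 0.9, "bus_a": 0.8, "bus_b": 0.8}
works = lambda s: s["cpu"] and (s["bus_a"] or s["bus_b"])

for component in rel:
    print(component, round(birnbaum(works, rel, component), 4))
```

In this example the CPU is far more important than either bus, so a local search guided by such a measure would consider hardening the CPU first.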