Programmable Microfluidic Devices (PMDs) have revolutionized the traditional biochemical experiment flow. Test algorithms for PMDs have recently been proposed. Test patterns can be generated algorithmically. But an algorithm for fault localization once some faults have been identified is not yet available. When testing a PMD, once a test pattern fails it is unknown where the stuck valve is located. The stuck valve can be any one valve out of many valves forming the test pattern. In this paper, we propose an effective algorithm for the localization of stuck-at-0 faults and stuck-at-1 faults in a PMD. The stuck valve is localized either exactly or within a very small set of candidate valves. Once the locations of faulty valves are known, it becomes possible to continue to use the PMD by resynthesizing the application.
{"title":"Fault Localization in Programmable Microfluidic Devices","authors":"Alessandro Bernardini, Chunfeng Liu, Bing Li, Ulf Schlichtmann","doi":"10.23919/DATE.2019.8715023","DOIUrl":"https://doi.org/10.23919/DATE.2019.8715023","url":null,"abstract":"Programmable Microfluidic Devices (PMDs) have revolutionized the traditional biochemical experiment flow. Test algorithms for PMDs have recently been proposed. Test patterns can be generated algorithmically. But an algorithm for fault localization once some faults have been identified is not yet available. When testing a PMD, once a test pattern fails it is unknown where the stuck valve is located. The stuck valve can be any one valve out of many valves forming the test pattern. In this paper, we propose an effective algorithm for the localization of stuck-at-0 faults and stuck-at-1 faults in a PMD. The stuck valve is localized either exactly or within a very small set of candidate valves. Once the locations of faulty valves are known, it becomes possible to continue to use the PMD by resynthesizing the application.","PeriodicalId":445778,"journal":{"name":"2019 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115931658","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2019-03-25DOI: 10.23919/DATE.2019.8715049
Hongwei Qin, D. Feng, Wei Tong, Jingning Liu, Yutong Zhao
By exposing physical channels to host software, Open-Channel SSD shows great potential in future high performance storage systems. However, the existing scheme fails to achieve acceptable performance under heavy workloads. The main reasons reside not only in its single-buffer architecture, more importantly, but also in its line-based physical address management. Besides, the lock of address mapping table is also a performance burden under heavy workloads. We propose QBLK, an open source driver which tries to better exploit the parallelism of Open-Channel SSDs. Particularly, QBLK adopts four key techniques, namely (1) Multi-queue based buffering, (2) Per-channel based address management, (3) Lock-free address mapping, and (4) Fine-grained draining. Experimental results show that QBLK achieves up to 97.4% bandwidth improvement compared with the state-of-the-art PBLK scheme.
{"title":"QBLK: Towards Fully Exploiting the Parallelism of Open-Channel SSDs","authors":"Hongwei Qin, D. Feng, Wei Tong, Jingning Liu, Yutong Zhao","doi":"10.23919/DATE.2019.8715049","DOIUrl":"https://doi.org/10.23919/DATE.2019.8715049","url":null,"abstract":"By exposing physical channels to host software, Open-Channel SSD shows great potential in future high performance storage systems. However, the existing scheme fails to achieve acceptable performance under heavy workloads. The main reasons reside not only in its single-buffer architecture, more importantly, but also in its line-based physical address management. Besides, the lock of address mapping table is also a performance burden under heavy workloads. We propose QBLK, an open source driver which tries to better exploit the parallelism of Open-Channel SSDs. Particularly, QBLK adopts four key techniques, namely (1) Multi-queue based buffering, (2) Per-channel based address management, (3) Lock-free address mapping, and (4) Fine-grained draining. Experimental results show that QBLK achieves up to 97.4% bandwidth improvement compared with the state-of-the-art PBLK scheme.","PeriodicalId":445778,"journal":{"name":"2019 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131960763","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2019-03-25DOI: 10.23919/DATE.2019.8715150
Haohao Liao, C. Gebotys
Recently electromagnetic fault injection (EMFI) techniques have been found to have significant implications on the security of embedded devices. Unfortunately there is still a lack of understanding of EM fault models and countermeasures for embedded processors. For the first time, this paper proposes an extended fault model based on the concept of critical charge and a new EMFI backside methodology based on over-clocking. Results show that exact timing of EM pulses can provide reliable repeatable instruction replacement faults for specific programs. An attack on AES is demonstrated showing that the EM fault injection requires on average less than 222 EM pulses and 5.3 plaintexts to retrieve the full AES key. This research is critical for ensuring embedded processors and their instruction set architectures are secure and resistant to fault injection attacks.
{"title":"Methodology for EM Fault Injection: Charge-based Fault Model","authors":"Haohao Liao, C. Gebotys","doi":"10.23919/DATE.2019.8715150","DOIUrl":"https://doi.org/10.23919/DATE.2019.8715150","url":null,"abstract":"Recently electromagnetic fault injection (EMFI) techniques have been found to have significant implications on the security of embedded devices. Unfortunately there is still a lack of understanding of EM fault models and countermeasures for embedded processors. For the first time, this paper proposes an extended fault model based on the concept of critical charge and a new EMFI backside methodology based on over-clocking. Results show that exact timing of EM pulses can provide reliable repeatable instruction replacement faults for specific programs. An attack on AES is demonstrated showing that the EM fault injection requires on average less than 222 EM pulses and 5.3 plaintexts to retrieve the full AES key. This research is critical for ensuring embedded processors and their instruction set architectures are secure and resistant to fault injection attacks.","PeriodicalId":445778,"journal":{"name":"2019 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133957591","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2019-03-25DOI: 10.23919/DATE.2019.8714956
Hameeza Ahmed, P. C. Santos, J. P. C. Lima, Rafael Fão de Moura, M. Alves, A. C. S. Beck, L. Carro
Although not a new technique, due to the advent of 3D-stacked technologies, the integration of large memories and logic circuitry able to compute large amount of data has revived the Processing-in-Memory (PIM) techniques. PIM is a technique to increase performance while reducing energy consumption when dealing with large amounts of data. Despite several designs of PIM are available in the literature, their effective implementation still burdens the programmer. Also, various PIM instances are required to take advantage of the internal 3D-stacked memories, which further increases the challenges faced by the programmers. In this way, this work presents the Processing-In-Memory cOmpiler (PRIMO). Our compiler is able to efficiently exploit large vector units on a PIM architecture, directly from the original code. PRIMO is able to automatically select suitable PIM operations, allowing its automatic offloading. Moreover, PRIMO concerns about several PIM instances, selecting the most suitable instance while reduces internal communication between different PIM units. The compilation results of different benchmarks depict how PRIMO is able to exploit large vectors, while achieving a near-optimal performance when compared to the ideal execution for the case study PIM. PRIMO allows a speedup of 38× for specific kernels, while on average achieves 11.8 × for a set of benchmarks from PolyBench Suite.
{"title":"A Compiler for Automatic Selection of Suitable Processing-in-Memory Instructions","authors":"Hameeza Ahmed, P. C. Santos, J. P. C. Lima, Rafael Fão de Moura, M. Alves, A. C. S. Beck, L. Carro","doi":"10.23919/DATE.2019.8714956","DOIUrl":"https://doi.org/10.23919/DATE.2019.8714956","url":null,"abstract":"Although not a new technique, due to the advent of 3D-stacked technologies, the integration of large memories and logic circuitry able to compute large amount of data has revived the Processing-in-Memory (PIM) techniques. PIM is a technique to increase performance while reducing energy consumption when dealing with large amounts of data. Despite several designs of PIM are available in the literature, their effective implementation still burdens the programmer. Also, various PIM instances are required to take advantage of the internal 3D-stacked memories, which further increases the challenges faced by the programmers. In this way, this work presents the Processing-In-Memory cOmpiler (PRIMO). Our compiler is able to efficiently exploit large vector units on a PIM architecture, directly from the original code. PRIMO is able to automatically select suitable PIM operations, allowing its automatic offloading. Moreover, PRIMO concerns about several PIM instances, selecting the most suitable instance while reduces internal communication between different PIM units. The compilation results of different benchmarks depict how PRIMO is able to exploit large vectors, while achieving a near-optimal performance when compared to the ideal execution for the case study PIM. PRIMO allows a speedup of 38× for specific kernels, while on average achieves 11.8 × for a set of benchmarks from PolyBench Suite.","PeriodicalId":445778,"journal":{"name":"2019 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131781511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2019-03-25DOI: 10.23919/DATE.2019.8714918
Sheriff Sadiqbatcha, Hengyang Zhao, H. Amrouch, J. Henkel, S. Tan
Accurate thermal models suitable for system level dynamic thermal, power and reliability regulation and management are vital for many commercial multi-core processors. However, developing such accurate thermal models and identifying the related thermal-power relevant spatial locations for commercial processors is a challenging task due to the lack of information and available tools. Existing tools such as HotSpot-like thermal models may suffer from inaccuracy or inefficiency for online applications, primarily because most rely on parameters that cannot be precisely quantified, such as power-traces, while others are numerical methods not suitable for runtime use. In this work, we propose a novel approach to automatically detecting the major heat-sources on a commercial multi-core microprocessor using an infrared thermal imaging setup. Our approach involves a number of steps including 2D discrete cosine transformation filter for noise reduction on the measured thermal maps, and Laplacian transformation followed by K-mean clustering for heat-source identification. Since the identified heat-sources are the thermally vulnerable areas of the die, we propose a novel approach to deriving a thermal model capable of predicting their temperatures during runtime. We apply Long-Short-Term-Memory (LSTM) networks to build a dynamic thermal model which uses system-level variables such as chip frequency, voltage and instruction count as inputs. The model is trained and tested exclusively using measured thermal data from a commercial multi-core processor. Experimental results show that the proposed thermal model achieves very high accuracy (root-mean-square-error: 2.04°C to 2.57° C) in predicting the temperature of all the identified heat-sources on the chip.
{"title":"Hot Spot Identification and System Parameterized Thermal Modeling for Multi-Core Processors Through Infrared Thermal Imaging","authors":"Sheriff Sadiqbatcha, Hengyang Zhao, H. Amrouch, J. Henkel, S. Tan","doi":"10.23919/DATE.2019.8714918","DOIUrl":"https://doi.org/10.23919/DATE.2019.8714918","url":null,"abstract":"Accurate thermal models suitable for system level dynamic thermal, power and reliability regulation and management are vital for many commercial multi-core processors. However, developing such accurate thermal models and identifying the related thermal-power relevant spatial locations for commercial processors is a challenging task due to the lack of information and available tools. Existing tools such as HotSpot-like thermal models may suffer from inaccuracy or inefficiency for online applications, primarily because most rely on parameters that cannot be precisely quantified, such as power-traces, while others are numerical methods not suitable for runtime use. In this work, we propose a novel approach to automatically detecting the major heat-sources on a commercial multi-core microprocessor using an infrared thermal imaging setup. Our approach involves a number of steps including 2D discrete cosine transformation filter for noise reduction on the measured thermal maps, and Laplacian transformation followed by K-mean clustering for heat-source identification. Since the identified heat-sources are the thermally vulnerable areas of the die, we propose a novel approach to deriving a thermal model capable of predicting their temperatures during runtime. We apply Long-Short-Term-Memory (LSTM) networks to build a dynamic thermal model which uses system-level variables such as chip frequency, voltage and instruction count as inputs. The model is trained and tested exclusively using measured thermal data from a commercial multi-core processor. Experimental results show that the proposed thermal model achieves very high accuracy (root-mean-square-error: 2.04°C to 2.57° C) in predicting the temperature of all the identified heat-sources on the chip.","PeriodicalId":445778,"journal":{"name":"2019 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"208 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133672668","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2019-03-25DOI: 10.23919/DATE.2019.8714809
L. Aksoy, M. Altun
In recent years the realization of a logic function on two-dimensional arrays of four-terminal switches, called switching lattices, has attracted considerable interest. Exact and approximate methods have been proposed for the problem of synthesizing Boolean functions on switching lattices with minimum size, called lattice synthesis (LS) problem. However, the exact method can only handle relatively small instances and the approximate methods may find solutions that are far from the optimum. This paper introduces an approximate algorithm, called JANUS, that formalizes the problem of realizing a logic function on a given lattice, called lattice mapping (LM) problem, as a satisfiability problem and explores the search space of the LS problem in a dichotomic search manner, solving LM problems for possible lattice candidates. This paper also presents three methods to improve the initial upper bound and an efficient way to realize multiple logic functions on a single lattice. Experimental results show that JANUS can find solutions very close to the minimum in a reasonable time and obtain better results than the existing approximate methods. The solutions of JANUS can also be better than those of the exact method, which cannot be determined to be optimal due to the given time limit, where the maximum gain on the number of switches reaches up to 25%.
{"title":"A Satisfiability-Based Approximate Algorithm for Logic Synthesis Using Switching Lattices","authors":"L. Aksoy, M. Altun","doi":"10.23919/DATE.2019.8714809","DOIUrl":"https://doi.org/10.23919/DATE.2019.8714809","url":null,"abstract":"In recent years the realization of a logic function on two-dimensional arrays of four-terminal switches, called switching lattices, has attracted considerable interest. Exact and approximate methods have been proposed for the problem of synthesizing Boolean functions on switching lattices with minimum size, called lattice synthesis (LS) problem. However, the exact method can only handle relatively small instances and the approximate methods may find solutions that are far from the optimum. This paper introduces an approximate algorithm, called JANUS, that formalizes the problem of realizing a logic function on a given lattice, called lattice mapping (LM) problem, as a satisfiability problem and explores the search space of the LS problem in a dichotomic search manner, solving LM problems for possible lattice candidates. This paper also presents three methods to improve the initial upper bound and an efficient way to realize multiple logic functions on a single lattice. Experimental results show that JANUS can find solutions very close to the minimum in a reasonable time and obtain better results than the existing approximate methods. The solutions of JANUS can also be better than those of the exact method, which cannot be determined to be optimal due to the given time limit, where the maximum gain on the number of switches reaches up to 25%.","PeriodicalId":445778,"journal":{"name":"2019 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115920488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2019-03-25DOI: 10.23919/DATE.2019.8714939
V. Tenace, R. G. Rizzo, Debjyoti Bhattacharjee, A. Chattopadhyay, A. Calimera
A Memristor is a two-terminal device that can serve as a non-volatile memory element with built-in logic capabilities. Arranged in a crossbar structure, memristive arrays allow to represent complex Boolean logic functions that adhere to the logic-in-memory paradigm, where data and logic gates are glued together on the same piece of hardware. Needless to say, novel and ad-hoc CAD solutions are required to achieve practical and feasible hardware implementations. Existing techniques aim at optimal mapping strategies that account for Boolean logic functions described by means of 2-input NOR and NOT gates, thus overlooking the optimization capabilities that a smart and dedicated technology-aware logic synthesis can provide. In this paper, we introduce a novel library-free supergate-aided (SAID) logic synthesis approach with a dedicated mapping strategy tailored on MAGIC crossbars. Supergates are obtained with a Look-Up Table (LUT)-based synthesis that splits a complex logic network into smaller Boolean functions. Those functions are then mapped on the crossbar array as to minimize latency. The proposed SAID flow allows to (i) maximize supergate-level parallelism, thus reducing the total number of computing cycles, and (ii) relax mapping constraints, allowing an easy and fast mapping of Boolean functions on memristive crossbars. Experimental results obtained on several benchmarks from ISCAS’85 and IWLS’93 suites demonstrate that our solution is capable to outperform other state-of-the-art techniques in terms of speedup (3.89× in the best case), at the expense of a very low area overhead.
{"title":"SAID: A Supergate-Aided Logic Synthesis Flow for Memristive Crossbars","authors":"V. Tenace, R. G. Rizzo, Debjyoti Bhattacharjee, A. Chattopadhyay, A. Calimera","doi":"10.23919/DATE.2019.8714939","DOIUrl":"https://doi.org/10.23919/DATE.2019.8714939","url":null,"abstract":"A Memristor is a two-terminal device that can serve as a non-volatile memory element with built-in logic capabilities. Arranged in a crossbar structure, memristive arrays allow to represent complex Boolean logic functions that adhere to the logic-in-memory paradigm, where data and logic gates are glued together on the same piece of hardware. Needless to say, novel and ad-hoc CAD solutions are required to achieve practical and feasible hardware implementations. Existing techniques aim at optimal mapping strategies that account for Boolean logic functions described by means of 2-input NOR and NOT gates, thus overlooking the optimization capabilities that a smart and dedicated technology-aware logic synthesis can provide. In this paper, we introduce a novel library-free supergate-aided (SAID) logic synthesis approach with a dedicated mapping strategy tailored on MAGIC crossbars. Supergates are obtained with a Look-Up Table (LUT)-based synthesis that splits a complex logic network into smaller Boolean functions. Those functions are then mapped on the crossbar array as to minimize latency. The proposed SAID flow allows to (i) maximize supergate-level parallelism, thus reducing the total number of computing cycles, and (ii) relax mapping constraints, allowing an easy and fast mapping of Boolean functions on memristive crossbars. Experimental results obtained on several benchmarks from ISCAS’85 and IWLS’93 suites demonstrate that our solution is capable to outperform other state-of-the-art techniques in terms of speedup (3.89× in the best case), at the expense of a very low area overhead.","PeriodicalId":445778,"journal":{"name":"2019 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"81 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114373616","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2019-03-25DOI: 10.23919/DATE.2019.8715258
B. Mudassar, Priyabrata Saha, Yun Long, M. Amir, Evan Gebhardt, Taesik Na, J. Ko, M. Wolf, S. Mukhopadhyay
The cameras today are designed to capture signals with highest possible accuracy to most faithfully represent what it sees. However, many mission-critical autonomous applications ranging from traffic monitoring to disaster recovery to defense requires quality of information, where useful information depends on the tasks and is defined using complex features, rather than only changes in captured signal. Such applications require cameras that capture useful information from a scene with highest quality while meeting system constraints such as power, performance, and bandwidth. This paper will discuss the feasibility of a camera that learns how to capture task-dependent information with highest quality, paving the pathway to design a camera with brain. 3D integration of digital pixel sensors with massively parallel computing platform for machine learning creates a hardware architecture for such a camera. The paper will discuss embedded machine learning algorithms that can run on such platform to enhance quality of useful information by real-time control of the sensor parameters. We conclude by identifying critical challenges as well as opportunities for hardware and algorithmic innovations to enable machine learning in the feedback loop of a 3D image sensor based camera.
{"title":"A Camera with Brain – Embedding Machine Learning in 3D Sensors","authors":"B. Mudassar, Priyabrata Saha, Yun Long, M. Amir, Evan Gebhardt, Taesik Na, J. Ko, M. Wolf, S. Mukhopadhyay","doi":"10.23919/DATE.2019.8715258","DOIUrl":"https://doi.org/10.23919/DATE.2019.8715258","url":null,"abstract":"The cameras today are designed to capture signals with highest possible accuracy to most faithfully represent what it sees. However, many mission-critical autonomous applications ranging from traffic monitoring to disaster recovery to defense requires quality of information, where useful information depends on the tasks and is defined using complex features, rather than only changes in captured signal. Such applications require cameras that capture useful information from a scene with highest quality while meeting system constraints such as power, performance, and bandwidth. This paper will discuss the feasibility of a camera that learns how to capture task-dependent information with highest quality, paving the pathway to design a camera with brain. 3D integration of digital pixel sensors with massively parallel computing platform for machine learning creates a hardware architecture for such a camera. The paper will discuss embedded machine learning algorithms that can run on such platform to enhance quality of useful information by real-time control of the sensor parameters. We conclude by identifying critical challenges as well as opportunities for hardware and algorithmic innovations to enable machine learning in the feedback loop of a 3D image sensor based camera.","PeriodicalId":445778,"journal":{"name":"2019 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114539488","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2019-03-25DOI: 10.23919/DATE.2019.8714930
Manolis Katsaragakis, Dimosthenis Masouros, Vasileios Tsoutsouras, Farzad Samie, L. Bauer, J. Henkel, D. Soudris
Resource management is a key technique for efficiently operating devices in Internet of Things (IoT). In this paper, we propose DMRM, a new algorithm based on economic and pricing models for dynamic resource management of IoT networks under CPU, memory, bandwidth and latency constraints. We use a supply and demand model, smart data pricing and perceived valued pricing, implementing a marketplace where IoT devices and Gateways buy and sell computing and communication resources necessary for task execution. Our new market-based algorithm is compared to relevant approaches showing that it not only reaches near-optimal results, but also, its scalable, distributed nature leads to three orders of magnitude lower execution requirements compared to centralized approaches.
{"title":"DMRM: Distributed Market-Based Resource Management of Edge Computing Systems","authors":"Manolis Katsaragakis, Dimosthenis Masouros, Vasileios Tsoutsouras, Farzad Samie, L. Bauer, J. Henkel, D. Soudris","doi":"10.23919/DATE.2019.8714930","DOIUrl":"https://doi.org/10.23919/DATE.2019.8714930","url":null,"abstract":"Resource management is a key technique for efficiently operating devices in Internet of Things (IoT). In this paper, we propose DMRM, a new algorithm based on economic and pricing models for dynamic resource management of IoT networks under CPU, memory, bandwidth and latency constraints. We use a supply and demand model, smart data pricing and perceived valued pricing, implementing a marketplace where IoT devices and Gateways buy and sell computing and communication resources necessary for task execution. Our new market-based algorithm is compared to relevant approaches showing that it not only reaches near-optimal results, but also, its scalable, distributed nature leads to three orders of magnitude lower execution requirements compared to centralized approaches.","PeriodicalId":445778,"journal":{"name":"2019 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121953678","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2019-03-25DOI: 10.23919/DATE.2019.8714879
Dennis D. Weller, Michael Hefenbrock, M. Golanbari, M. Beigl, M. Tahoori
Due to aggressive technology downscaling, process and runtime variations have a strong impact on the correct functionality in the field as well as manufacturing yield. The assessment of the yield and failure rate is extremely crucial for design optimization. The common practice is to use Monte Carlo simulations in order to account for device variations and estimate failure rate. However, Monte Carlo methods are infeasible for estimating rare events such as high sigma failure rates, and hence, various importance sampling methods have been proposed. In this paper, we present an efficient importance sampling approach based on Bayesian optimization. Its advantages include constant complexity independent of the dimensions of design space, the potential to find the global extrema, and higher trustworthiness of the estimated failure rate. We evaluated the approach on a 6T SRAM cell based on a 28nm FDSOI process. The results show significant speedup and more than two orders of magnitude better accuracy in failure rate estimation, compared to the best state-of-the-art technique.
{"title":"Bayesian Optimized Importance Sampling for High Sigma Failure Rate Estimation","authors":"Dennis D. Weller, Michael Hefenbrock, M. Golanbari, M. Beigl, M. Tahoori","doi":"10.23919/DATE.2019.8714879","DOIUrl":"https://doi.org/10.23919/DATE.2019.8714879","url":null,"abstract":"Due to aggressive technology downscaling, process and runtime variations have a strong impact on the correct functionality in the field as well as manufacturing yield. The assessment of the yield and failure rate is extremely crucial for design optimization. The common practice is to use Monte Carlo simulations in order to account for device variations and estimate failure rate. However, Monte Carlo methods are infeasible for estimating rare events such as high sigma failure rates, and hence, various importance sampling methods have been proposed. In this paper, we present an efficient importance sampling approach based on Bayesian optimization. Its advantages include constant complexity independent of the dimensions of design space, the potential to find the global extrema, and higher trustworthiness of the estimated failure rate. We evaluated the approach on a 6T SRAM cell based on a 28nm FDSOI process. The results show significant speedup and more than two orders of magnitude better accuracy in failure rate estimation, compared to the best state-of-the-art technique.","PeriodicalId":445778,"journal":{"name":"2019 Design, Automation & Test in Europe Conference & Exhibition (DATE)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-03-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128175636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}