Pub Date: 2020-03-06 DOI: 10.1007/s10617-020-09234-6
Adeel Israr, Mohammad Kaleem, S. Nazir, Hamid Turab Mirza, S. Huss
{"title":"Nested genetic algorithm for highly reliable and efficient embedded system design","authors":"Adeel Israr, Mohammad Kaleem, S. Nazir, Hamid Turab Mirza, S. Huss","doi":"10.1007/s10617-020-09234-6","DOIUrl":"https://doi.org/10.1007/s10617-020-09234-6","url":null,"abstract":"","PeriodicalId":50594,"journal":{"name":"Design Automation for Embedded Systems","volume":"24 1","pages":"185 - 221"},"PeriodicalIF":1.4,"publicationDate":"2020-03-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s10617-020-09234-6","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"52170466","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2020-02-06 DOI: 10.1007/s10617-020-09233-7
Asmeen Kashif, Mohammad A. S. Khalid
Rising data rates and input/output density in integrated circuits are challenging traditional off-chip copper interconnect solutions, demanding a compatible high-speed serial interface capable of sustaining multi-gigabit data rates. Designers typically choose copper interconnects for chip-to-chip connections in a Multi-FPGA System (MFS). However, copper-based interconnects cannot scale with the data rate and become increasingly lossy as frequency rises. The performance of an MFS can be enhanced if the off-chip electrical interconnects are replaced by short-range optical interconnects. Additionally, the choice of MFS inter-chip communication strategy also affects system performance. We propose a latency-optimized MFS with a serial optical interface and two different inter-chip communication strategies. The proposed architectures were experimentally evaluated using six real-world benchmark circuits and provided an average system frequency gain of nearly 22% compared to a conventional MFS.
{"title":"Experimental evaluation and comparison of latency-optimized optical and conventional multi-FPGA systems","authors":"Asmeen Kashif, Mohammad A. S. Khalid","doi":"10.1007/s10617-020-09233-7","DOIUrl":"https://doi.org/10.1007/s10617-020-09233-7","url":null,"abstract":"Rising data rates and input/output density in integrated circuits are challenging traditional off-chip copper interconnect solutions, demanding a compatible high-speed serial interface capable of sustaining multi-gigabit data rates. Designers typically choose copper interconnects for chip-to-chip connections in a Multi-FPGA System (MFS). However, copper-based interconnects cannot scale with the data rate and become increasingly lossy as frequency rises. The performance of an MFS can be enhanced if the off-chip electrical interconnects are replaced by short-range optical interconnects. Additionally, the choice of MFS inter-chip communication strategy also affects system performance. We propose a latency-optimized MFS with a serial optical interface and two different inter-chip communication strategies. The proposed architectures were experimentally evaluated using six real-world benchmark circuits and provided an average system frequency gain of nearly 22% compared to a conventional MFS.","PeriodicalId":50594,"journal":{"name":"Design Automation for Embedded Systems","volume":"37 2-3","pages":""},"PeriodicalIF":1.4,"publicationDate":"2020-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138524219","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-12-05 DOI: 10.1007/s10617-019-09232-3
Lalatendu Behera, Purandar Bhaduri
Real-time safety-critical systems are getting more complicated due to the introduction of mixed-criticality systems. The increasing use of mixed-criticality systems has motivated the real-time systems research community to investigate various non-functional aspects of these systems. Energy consumption minimization is one such aspect which is just beginning to be explored. In this paper, we propose a time-triggered dynamic voltage and frequency scaling (DVFS) algorithm for uniprocessor mixed-criticality systems. We show that our algorithm outperforms the predominant existing algorithm which uses DVFS for mixed-criticality systems with respect to minimization of energy consumption. In addition, ours is the first energy-efficient time-triggered algorithm for mixed-criticality systems. We prove an optimality result for the proposed algorithm with respect to energy consumption. Then we extend our algorithm for tasks with dependency constraints.
{"title":"An energy-efficient time-triggered scheduling algorithm for mixed-criticality systems","authors":"Lalatendu Behera, Purandar Bhaduri","doi":"10.1007/s10617-019-09232-3","DOIUrl":"https://doi.org/10.1007/s10617-019-09232-3","url":null,"abstract":"Real-time safety-critical systems are getting more complicated due to the introduction of mixed-criticality systems. The increasing use of mixed-criticality systems has motivated the real-time systems research community to investigate various non-functional aspects of these systems. Energy consumption minimization is one such aspect which is just beginning to be explored. In this paper, we propose a time-triggered dynamic voltage and frequency scaling (DVFS) algorithm for uniprocessor mixed-criticality systems. We show that our algorithm outperforms the predominant existing algorithm which uses DVFS for mixed-criticality systems with respect to minimization of energy consumption. In addition, ours is the first energy-efficient time-triggered algorithm for mixed-criticality systems. We prove an optimality result for the proposed algorithm with respect to energy consumption. Then we extend our algorithm for tasks with dependency constraints.","PeriodicalId":50594,"journal":{"name":"Design Automation for Embedded Systems","volume":"141 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2019-12-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138524209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-11-19 DOI: 10.1007/s10617-019-09227-0
Miran Hasanagic, T. Fabbri, P. Larsen, V. Bandur, P. Tran-Jørgensen, J. Ouy
{"title":"Code generation for distributed embedded systems with VDM-RT","authors":"Miran Hasanagic, T. Fabbri, P. Larsen, V. Bandur, P. Tran-Jørgensen, J. Ouy","doi":"10.1007/s10617-019-09227-0","DOIUrl":"https://doi.org/10.1007/s10617-019-09227-0","url":null,"abstract":"","PeriodicalId":50594,"journal":{"name":"Design Automation for Embedded Systems","volume":"23 1","pages":"153 - 177"},"PeriodicalIF":1.4,"publicationDate":"2019-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s10617-019-09227-0","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47392330","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-11-13 DOI: 10.1007/s10617-019-09230-5
Diogo M. F. Izidio, Antonyus P. A. Ferreira, H. R. Medeiros, Edna Barros
{"title":"An embedded automatic license plate recognition system using deep learning","authors":"Diogo M. F. Izidio, Antonyus P. A. Ferreira, H. R. Medeiros, Edna Barros","doi":"10.1007/s10617-019-09230-5","DOIUrl":"https://doi.org/10.1007/s10617-019-09230-5","url":null,"abstract":"","PeriodicalId":50594,"journal":{"name":"Design Automation for Embedded Systems","volume":"24 1","pages":"23 - 43"},"PeriodicalIF":1.4,"publicationDate":"2019-11-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s10617-019-09230-5","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46086358","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-11-09 DOI: 10.1007/s10617-019-09228-z
Michael G. Jordan, Marcelo Brandalero, Guilherme M. Malfatti, Geraldo F. Oliveira, Arthur F. Lorenzon, Bruno C. da Silva, Luigi Carro, Mateus B. Rutzig, Antonio Carlos S. Beck
Given the saturation of single-threaded performance improvements in General-Purpose Processors, novel architectural techniques are required to meet emerging demands. In this paper, we propose a generic acceleration framework for approximate algorithms that replaces function execution with table look-ups in dedicated memories. A strategy based on the K-Means Clustering algorithm is used to learn mappings from arbitrary function inputs to frequently occurring outputs at compile-time. At run-time, these learned values are fetched from dedicated look-up tables and the best result is selected using the Nearest-Centroid Classifier, which is implemented in hardware. The proposed approach improves over the state-of-the-art neural acceleration solution, with nearly 3× better performance, energy reductions from 18.72% up to 90.99%, and 17% area savings under similar levels of quality, thus opening new opportunities for performance harvesting in approximate accelerators.
{"title":"Data clustering for efficient approximate computing","authors":"Michael G. Jordan, Marcelo Brandalero, Guilherme M. Malfatti, Geraldo F. Oliveira, Arthur F. Lorenzon, Bruno C. da Silva, Luigi Carro, Mateus B. Rutzig, Antonio Carlos S. Beck","doi":"10.1007/s10617-019-09228-z","DOIUrl":"https://doi.org/10.1007/s10617-019-09228-z","url":null,"abstract":"Given the saturation of single-threaded performance improvements in General-Purpose Processors, novel architectural techniques are required to meet emerging demands. In this paper, we propose a generic acceleration framework for approximate algorithms that replaces function execution with table look-ups in dedicated memories. A strategy based on the K-Means Clustering algorithm is used to learn mappings from arbitrary function inputs to frequently occurring outputs at compile-time. At run-time, these learned values are fetched from dedicated look-up tables and the best result is selected using the Nearest-Centroid Classifier, which is implemented in hardware. The proposed approach improves over the state-of-the-art neural acceleration solution, with nearly 3× better performance, energy reductions from 18.72% up to 90.99%, and 17% area savings under similar levels of quality, thus opening new opportunities for performance harvesting in approximate accelerators.","PeriodicalId":50594,"journal":{"name":"Design Automation for Embedded Systems","volume":"28 1","pages":""},"PeriodicalIF":1.4,"publicationDate":"2019-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138524211","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
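The look-up idea in the "Data clustering for efficient approximate computing" abstract can be illustrated in software. The following is a minimal sketch, not the authors' framework: it assumes one-dimensional inputs, stores the function value at each centroid as the table entry, and uses a deterministic evenly spaced initialization. The names `kmeans_1d`, `build_table`, and `approx_call` are hypothetical, and the paper's hardware classifier is replaced by a plain linear search.

```python
import math


def kmeans_1d(samples, k, rounds=20):
    """Cluster scalar samples with Lloyd's algorithm (assumed 1-D for brevity)."""
    ordered = sorted(samples)
    n = len(ordered)
    # Assumption: evenly spaced initial centroids keep the sketch deterministic.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    for _ in range(rounds):
        buckets = [[] for _ in range(k)]
        for s in ordered:
            nearest = min(range(k), key=lambda i: abs(s - centroids[i]))
            buckets[nearest].append(s)
        # Move each centroid to the mean of its bucket; keep it if the bucket is empty.
        centroids = [
            sum(b) / len(b) if b else centroids[i] for i, b in enumerate(buckets)
        ]
    return centroids


def build_table(func, samples, k):
    """'Compile-time' step: precompute one output per learned centroid."""
    return [(c, func(c)) for c in kmeans_1d(samples, k)]


def approx_call(table, x):
    """'Run-time' step: nearest-centroid classification, then a table read."""
    _, output = min(table, key=lambda entry: abs(x - entry[0]))
    return output


# Usage: approximate sin(x) on [0, 3.14] with a 32-entry table.
samples = [i / 100 for i in range(315)]
table = build_table(math.sin, samples, 32)
approx = approx_call(table, 1.0)
```

The table size `k` trades accuracy for memory: more centroids shrink the distance from any input to its nearest centroid, and therefore the approximation error.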
Pub Date: 2019-11-08 DOI: 10.1007/s10617-019-09229-y
Muhammad Waseem Anwar, M. Rashid, F. Azam, M. Kashif, Wasi Haider Butt
{"title":"A model-driven framework for design and verification of embedded systems through SystemVerilog","authors":"Muhammad Waseem Anwar, M. Rashid, F. Azam, M. Kashif, Wasi Haider Butt","doi":"10.1007/s10617-019-09229-y","DOIUrl":"https://doi.org/10.1007/s10617-019-09229-y","url":null,"abstract":"","PeriodicalId":50594,"journal":{"name":"Design Automation for Embedded Systems","volume":"23 1","pages":"179 - 223"},"PeriodicalIF":1.4,"publicationDate":"2019-11-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://sci-hub-pdf.com/10.1007/s10617-019-09229-y","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42922845","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2019-11-05 DOI: 10.1007/s10617-019-09225-2
C. R. S. Hanuman, J. Kamala, A. R. Aruna
Industrial and scientific data-intensive applications increasingly demand higher-precision arithmetic with reduced computation time. In this paper, we design a high-precision, fully pipelined 32-bit floating-point (FP) divider using the Newton–Raphson (NR) algorithm, realized with an Urdhva–Tiryakbhyam (UT) multiplier, for System on Chip applications. The divider design is based on the Newton–Raphson (multiplicative) method and supports all IEEE rounding modes with a latency of 15 cycles. The iterative NR computations are performed using an FP multiplier and an FP adder. The key module of the FP multiplier, which calculates the mantissa, is the UT multiplier, an ancient Vedic multiplication technique that has been in use for centuries for fast multiplication. We implemented two UT multipliers: one using carry look-ahead adders and another using carry-save adders. The results show that the proposed architectures have 12% better precision and 24% higher throughput than existing algorithms, at the cost of high on-chip power. The inputs to the divider are represented in the IEEE-754 standard format. The design uses Xilinx Vivado software and is implemented on a Virtex7 FPGA.
{"title":"Implementation of high precision/low latency FP divider using Urdhva–Tiryakbhyam multiplier for SoC applications","authors":"C. R. S. Hanuman, J. Kamala, A. R. Aruna","doi":"10.1007/s10617-019-09225-2","DOIUrl":"https://doi.org/10.1007/s10617-019-09225-2","url":null,"abstract":"Industrial and scientific data-intensive applications increasingly demand higher-precision arithmetic with reduced computation time. In this paper, we design a high-precision, fully pipelined 32-bit floating-point (FP) divider using the Newton–Raphson (NR) algorithm, realized with an Urdhva–Tiryakbhyam (UT) multiplier, for System on Chip applications. The divider design is based on the Newton–Raphson (multiplicative) method and supports all IEEE rounding modes with a latency of 15 cycles. The iterative NR computations are performed using an FP multiplier and an FP adder. The key module of the FP multiplier, which calculates the mantissa, is the UT multiplier, an ancient Vedic multiplication technique that has been in use for centuries for fast multiplication. We implemented two UT multipliers: one using carry look-ahead adders and another using carry-save adders. The results show that the proposed architectures have 12% better precision and 24% higher throughput than existing algorithms, at the cost of high on-chip power. The inputs to the divider are represented in the IEEE-754 standard format. The design uses Xilinx Vivado software and is implemented on a Virtex7 FPGA.","PeriodicalId":50594,"journal":{"name":"Design Automation for Embedded Systems","volume":"62 2","pages":""},"PeriodicalIF":1.4,"publicationDate":"2019-11-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138524186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
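The multiplicative Newton–Raphson method named in the FP divider abstract computes a reciprocal using only multiplications and subtractions, which is why an FP multiplier and an FP adder suffice. The sketch below shows the textbook iteration in double-precision software, not the paper's 32-bit pipelined hardware. The function name `nr_divide`, the iteration count, and the linear initial estimate are assumptions chosen for illustration, and IEEE rounding modes are not modelled.

```python
import math


def nr_divide(a: float, b: float, iterations: int = 4) -> float:
    """Approximate a / b with Newton-Raphson reciprocal refinement."""
    if b == 0.0:
        raise ZeroDivisionError("division by zero")
    # Normalize the divisor: |b| = m * 2**e with m in [0.5, 1).
    m, e = math.frexp(abs(b))
    # Assumed starting point: a standard linear estimate of 1/m on [0.5, 1].
    x = 48.0 / 17.0 - (32.0 / 17.0) * m
    for _ in range(iterations):
        # Each step needs two multiplications and one subtraction,
        # and roughly doubles the number of correct bits.
        x = x * (2.0 - m * x)
    # Undo the normalization and restore the sign of the divisor.
    reciprocal = math.copysign(math.ldexp(x, -e), b)
    return a * reciprocal


# Usage: compare against the built-in division.
quotient = nr_divide(1.0, 3.0)
```

Because the error squares on every step, a hardware design can fix the iteration count in advance, which suits a fully pipelined implementation with a constant latency.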