Pub Date : 2018-12-01 DOI: 10.1109/ICM.2018.8704101
Esraa Ragab, M. A. E. Ghany, K. Hofmann
Memories are essential components of any computer system, and their performance directly affects system speed and efficiency. Demand for faster, cheaper and higher-capacity memories increases every day; however, meeting this demand comes at the cost of complexity and other drawbacks. This paper introduces a multi-port DDR2 SDRAM controller that supports an AMBA AXI interface at each port. The design is responsible for memory initialization and automatic generation of refresh sequences. A round-robin arbitration algorithm is adopted in the design. The proposed design is successfully synthesized on the xc7z020clg484-1 (ZedBoard) with a maximum operating frequency of 212 MHz, which improves the design speed by around 30%. The area of the design has also been improved by around 40%.
{"title":"DDR2 Memory Controller for Multi-core Systems with AMBA AXI Interface","authors":"Esraa Ragab, M. A. E. Ghany, K. Hofmann","doi":"10.1109/ICM.2018.8704101","DOIUrl":"https://doi.org/10.1109/ICM.2018.8704101","url":null,"PeriodicalId":305356,"journal":{"name":"2018 30th International Conference on Microelectronics (ICM)","volume":"43 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133871084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
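The round-robin arbitration named in the abstract above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the class name `RoundRobinArbiter`, the four-port configuration and the one-grant-per-cycle policy are hypothetical and are not taken from the paper, which describes a hardware design.

```python
from typing import List, Optional


class RoundRobinArbiter:
    """Behavioral model: grants one requesting port per call and rotates priority."""

    def __init__(self, num_ports: int) -> None:
        self.num_ports = num_ports
        # Start so that port 0 has the highest priority on the first call.
        self.last_granted = num_ports - 1

    def grant(self, requests: List[bool]) -> Optional[int]:
        """Return the granted port index, or None if no port is requesting."""
        # Search starts just after the most recently granted port.
        for offset in range(1, self.num_ports + 1):
            port = (self.last_granted + offset) % self.num_ports
            if requests[port]:
                self.last_granted = port
                return port
        return None


# Hypothetical scenario: ports 0, 2 and 3 request continuously, port 1 is idle.
arbiter = RoundRobinArbiter(4)
grants = [arbiter.grant([True, False, True, True]) for _ in range(4)]
print(grants)
```

Because the search always starts after the last granted port, no requesting port can be starved by a busier neighbor, which is the usual reason for choosing this policy in a multi-port controller.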
Pub Date : 2018-12-01 DOI: 10.1109/ICM.2018.8704034
Mahmoud Salah, B. El-Shweky, Karim ElKholy, A. Helmy, Y. Ismail, K. Salah
Effective compression guarantees that IoT will continue spreading to cover various devices. High Efficiency Video Coding (HEVC), also known as H.265, is the most recent video coding standard; it introduces new coding methods compared to older popular standards. Such a change has a great impact on both encoder complexity and speed, as it provides a bitrate reduction of up to 50% compared to H.264/AVC while keeping the same video quality. Unfortunately, HEVC is computationally intensive, relatively slow and power consuming. Therefore, a hardware implementation is desirable to overcome these drawbacks. This work introduces an efficient implementation of the HEVC encoder/decoder algorithm targeting low-power solutions in order to be compatible with IoT devices. Verilog was chosen as the HDL to carry the implemented design over to hardware. In addition, an analytical comparison of the implemented hardware and the related work is presented. The comparison is based on encoding performance metrics such as power consumption and compression speed.
{"title":"HEVC Implementation for IoT Applications","authors":"Mahmoud Salah, B. El-Shweky, Karim ElKholy, A. Helmy, Y. Ismail, K. Salah","doi":"10.1109/ICM.2018.8704034","DOIUrl":"https://doi.org/10.1109/ICM.2018.8704034","url":null,"PeriodicalId":305356,"journal":{"name":"2018 30th International Conference on Microelectronics (ICM)","volume":"60 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116586405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-12-01 DOI: 10.1109/ICM.2018.8704025
Sahar Sharaf, H. Mostafa
Connected and autonomous cars present a major challenge for securing vehicles against outside or inside attacks which may affect the safety of the driver. This paper compares seven lightweight authenticated encryption algorithms and classifies automotive embedded systems, to help in choosing a suitable algorithm for each application according to its system safety and security requirements. The authenticated encryption algorithms studied in this paper are POET, Deoxys, AEZ, MORUS, ACORN, AEGIS, and AES-GCM.
{"title":"A study of Authentication Encryption Algorithms (POET, Deoxys, AEZ, MORUS, ACORN, AEGIS, AES-GCM) For Automotive Security","authors":"Sahar Sharaf, H. Mostafa","doi":"10.1109/ICM.2018.8704025","DOIUrl":"https://doi.org/10.1109/ICM.2018.8704025","url":null,"PeriodicalId":305356,"journal":{"name":"2018 30th International Conference on Microelectronics (ICM)","volume":"56 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122884468","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-12-01 DOI: 10.1109/ICM.2018.8704091
M. Tolba, W. Sayed, A. Radwan, S. Abd-El-Hafiz, A. Soliman
This paper proposes a speech encryption and decryption system, its hardware architecture design and its FPGA implementation. The system utilizes a Nosé-Hoover chaotic generator and/or dynamic shift and bit permutation. The effect of the different blocks in the proposed encryption scheme is studied, and the security of the system is validated through perceptual and statistical tests. The complete encryption scheme is simulated using Xilinx ISE 14.5 and realized on a Xilinx Kintex 7 FPGA, with the experimental results displayed on an oscilloscope. The efficiency is also validated by comparing hardware resource utilization, maximum frequency and throughput against previous works.
{"title":"Hardware Speech Encryption Using a Chaotic Generator, Dynamic Shift and Bit Permutation","authors":"M. Tolba, W. Sayed, A. Radwan, S. Abd-El-Hafiz, A. Soliman","doi":"10.1109/ICM.2018.8704091","DOIUrl":"https://doi.org/10.1109/ICM.2018.8704091","url":null,"PeriodicalId":305356,"journal":{"name":"2018 30th International Conference on Microelectronics (ICM)","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131658651","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-12-01 DOI: 10.1109/ICM.2018.8704053
A. Zakaria, A. Hussein
A future mobile trend is that traffic generated by smartphones will dominate even more than it does today. Smartphone traffic is expected to increase tenfold and total mobile traffic for all devices eightfold, with more than 90 percent of mobile data traffic coming from smartphones. For these reasons, mobile users will need to be actively moved onto fifth-generation (5G) mobile networks, which build upon today's 4G mobile network technology and promise higher connection speeds with lower latency. In 5G, whatever the technology used, a user association technique is needed to determine whether a user is associated with a particular base station (BS) before data transmission starts. User association plays an indispensable role in enhancing the load balancing, the spectrum efficiency, and the energy efficiency of networks. The challenge is to make the appropriate association that achieves the minimum required data rate for each user with acceptable complexity. In this paper, the pragmatic user association is formulated as an optimization problem, which is solved using the Nash Bargaining Solution (NBS). Simulation results show that the proposed algorithm can enable network operators to support fair resource allocation and ensure that users can be served equitably by both macro cells and pico cells. The paper also provides a low-complexity algorithm that reaches a near-optimal solution with a high performance guarantee.
{"title":"Cell Association for Multi Band 5G Cellular HetNets based on NBS","authors":"A. Zakaria, A. Hussein","doi":"10.1109/ICM.2018.8704053","DOIUrl":"https://doi.org/10.1109/ICM.2018.8704053","url":null,"PeriodicalId":305356,"journal":{"name":"2018 30th International Conference on Microelectronics (ICM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128984917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
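To give a feel for how a Nash Bargaining Solution trades off fairness against minimum-rate guarantees, the sketch below solves a deliberately simplified toy problem. It is not the formulation of the paper above: it assumes a single base station that splits one resource among users, with user i receiving rate `share * peak_rate` and requiring at least `min_rate`. Under those assumptions, maximizing the product of the rate surpluses has a closed form: each user gets its minimum share plus an equal split of whatever is left. The function names and the numbers are hypothetical.

```python
from math import prod
from typing import List


def nbs_shares(peak_rates: List[float], min_rates: List[float]) -> List[float]:
    """Resource shares maximizing prod(share_i * peak_i - min_i) with sum(shares) == 1."""
    # Share each user needs just to reach its minimum rate (the disagreement point).
    min_shares = [m / c for m, c in zip(min_rates, peak_rates)]
    leftover = 1.0 - sum(min_shares)
    if leftover <= 0:
        raise ValueError("minimum rates are not jointly feasible")
    # Closed form for this toy model: split the leftover resource equally.
    return [s + leftover / len(peak_rates) for s in min_shares]


def nash_product(shares: List[float], peak_rates: List[float], min_rates: List[float]) -> float:
    """Product of each user's rate surplus over its minimum rate."""
    return prod(x * c - m for x, c, m in zip(shares, peak_rates, min_rates))


# Hypothetical two-user example (rates in arbitrary units).
peak = [10.0, 20.0]
minimum = [2.0, 4.0]
shares = nbs_shares(peak, minimum)
print(shares, nash_product(shares, peak, minimum))
```

Shifting resource from either user to the other lowers the Nash product, which is what makes the bargaining point a natural notion of fairness here.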
Pub Date : 2018-12-01 DOI: 10.1109/ICM.2018.8704078
Heba Abunahla, B. Mohammad, A. Alazzam, M. A. Jaoude, M. Al-Qutayri, S. Al-Sarawi
Low cost, high sensitivity and stability are the key features that steer the research and development of glucose sensing. In this paper we report on a low-cost metal-insulator-metal (MIM) based sensor that responds to different glucose concentrations. The unique planar Pt/CuO/Pt structure provides a large electrochemically active surface area, which aids glucose oxidation. The device shows sensitivity to glucose concentrations spanning the low-to-high range typically found in human blood. The fabricated sensor is able to detect the glucose concentration at neutral pH (i.e. pH = 7). This eliminates the dilution step needed by most existing nonenzymatic glucose sensors to achieve the alkaline medium that is essential for performing redox reactions in the absence of glucose oxidase. Unlike the available devices, the fabricated MIM structure involves two metal electrodes in the interaction to improve the sensitivity of the device. This contribution provides new insights into the design and fabrication of low-cost biomedical sensors.
{"title":"Nonenzymatic Glucose Sensor Using MIM Pt/CuO/Pt","authors":"Heba Abunahla, B. Mohammad, A. Alazzam, M. A. Jaoude, M. Al-Qutayri, S. Al-Sarawi","doi":"10.1109/ICM.2018.8704078","DOIUrl":"https://doi.org/10.1109/ICM.2018.8704078","url":null,"PeriodicalId":305356,"journal":{"name":"2018 30th International Conference on Microelectronics (ICM)","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116642245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2018-12-01 DOI: 10.1109/ICM.2018.8704083
Abdel-Malik M. Sabreen, Adel A. Samir, L. A. ElMahdy, Mima H. Ibrahim, M. H. Tawfik, Omneia O. ElShaer, H. Mostafa
Seizure detection for epileptic patients can be done using Support Vector Machines (SVMs). SVMs are a well-established method for classifying seizure and nonseizure points. One of the SVM trainers is Gilbert’s Algorithm. This paper elaborates on the role of Gilbert’s Algorithm in training an SVM to perform seizure detection. An FPGA is used to accelerate the SVM training because of its reconfigurability. The results obtained are highlighted and discussed, along with the power and resources used.
{"title":"Seizure Detection Using Gilbert’s Algorithm","authors":"Abdel-Malik M. Sabreen, Adel A. Samir, L. A. ElMahdy, Mima H. Ibrahim, M. H. Tawfik, Omneia O. ElShaer, H. Mostafa","doi":"10.1109/ICM.2018.8704083","DOIUrl":"https://doi.org/10.1109/ICM.2018.8704083","url":null,"PeriodicalId":305356,"journal":{"name":"2018 30th International Conference on Microelectronics (ICM)","volume":"55 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126517996","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
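Gilbert's algorithm, as commonly described, finds the point of a convex hull closest to the origin. One way it can train a hard-margin SVM is to run it on the pairwise differences between the two classes: the minimum-norm point is then the shortest vector between the two class hulls and gives the separating direction. The sketch below is a plain-software illustration of that idea, not the FPGA design of the paper above; the function names and the tiny 2-D data set are hypothetical, and it assumes linearly separable classes.

```python
from typing import List, Tuple

Vector = Tuple[float, ...]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def gilbert_min_norm(points: List[Vector], tol: float = 1e-9, max_iter: int = 1000) -> Vector:
    """Approximate the minimum-norm point in the convex hull of `points`."""
    w = tuple(points[0])
    for _ in range(max_iter):
        # Support point: the hull vertex that lies furthest in the direction of -w.
        s = min(points, key=lambda p: dot(w, p))
        gap = dot(w, w) - dot(w, s)
        if gap <= tol:  # no vertex improves on w, so w is (near) optimal
            break
        # Exact line search on the segment from w to s, clipped so we stay on it.
        d = tuple(wi - si for wi, si in zip(w, s))
        t = min(1.0, dot(w, d) / dot(d, d))
        w = tuple(wi - t * di for wi, di in zip(w, d))
    return w


# Hypothetical, linearly separable toy classes.
class_a = [(3.0, 0.0), (1.0, 1.0), (1.0, -1.0)]
class_b = [(-3.0, 0.0), (-1.0, 1.0), (-1.0, -1.0)]
differences = [tuple(x - y for x, y in zip(a, b)) for a in class_a for b in class_b]

w = gilbert_min_norm(differences)
distance_between_hulls = dot(w, w) ** 0.5
print(w, distance_between_hulls)
```

Each iteration needs only dot products and one division, which is part of why the method is often considered a reasonable candidate for hardware acceleration.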
Pub Date : 2018-12-01 DOI: 10.1109/ICM.2018.8704067
Leïla Khanfir, Jaouhar Mouine
Comparator hysteresis adjustment has enabled new application fields, including peak detectors and spectrum analyzers. However, hysteresis programming techniques have mainly been developed for static comparators. Hence, when high-speed operation and reduced silicon area are desired, such techniques should also be developed for dynamic comparators. This paper presents a new hysteresis programming technique for dynamic comparators based on digital programming of the clock delay. For this purpose, and to ensure optimal circuit performance, a new delay circuit has been designed. To validate the design, a dynamic comparator with 4-bit hysteresis programming has been implemented and simulated using a commercially available 0.18 μm CMOS process. The comparator hysteresis can be adjusted from 200 μV to 17 mV. The whole circuit consumes 1.1 pJ at 500 MHz, with less than 65 μW of static power.
{"title":"Programmable Clock Delay for Hysteresis Adjustment in Dynamic Comparators","authors":"Leïla Khanfir, Jaouhar Mouine","doi":"10.1109/ICM.2018.8704067","DOIUrl":"https://doi.org/10.1109/ICM.2018.8704067","url":null,"PeriodicalId":305356,"journal":{"name":"2018 30th International Conference on Microelectronics (ICM)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126909135","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
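The effect of the hysteresis width on a comparator's output can be seen in a simple behavioral model. The sketch below is only an idealized illustration: it assumes a symmetric hysteresis window around 0 V and says nothing about how the paper above realizes the window through clock-delay programming. The function name and the sample values are hypothetical; the two widths are the end points of the adjustment range quoted in the abstract.

```python
from typing import List


def comparator_with_hysteresis(samples: List[float], hysteresis: float,
                               initial: bool = False) -> List[bool]:
    """Idealized comparator: the output flips only when the input leaves the window."""
    half = hysteresis / 2.0
    out = initial
    outputs = []
    for v in samples:
        if not out and v > half:
            out = True       # rising threshold crossed
        elif out and v < -half:
            out = False      # falling threshold crossed
        outputs.append(out)
    return outputs


# Hypothetical differential input samples in volts.
samples = [0.0, 0.005, 0.010, 0.004, -0.004, -0.010, 0.002]

wide = comparator_with_hysteresis(samples, hysteresis=17e-3)     # 17 mV window
narrow = comparator_with_hysteresis(samples, hysteresis=200e-6)  # 200 uV window
print(wide)
print(narrow)
```

With the wide window the output ignores the small excursions at 5 mV, -4 mV and 2 mV, while the narrow window follows every sign change of the input.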
Pub Date : 2018-12-01 DOI: 10.1109/ICM.2018.8704035
A. Chakir, M. Tabaa, F. Moutaouakkil, H. Medromi, Karim Alami
The use of renewable energies has become important because of their beneficial effects on the environment. Among them, photovoltaic (PV) energy takes a considerable share of use thanks to the wide availability of solar potential, but its main obstacle is the extraction of maximum power. In fact, a photovoltaic panel supplying a load does not necessarily operate at the point corresponding to the peak of the system's power-voltage (P-V) characteristic. Fixing this problem requires power electronics, in the form of a DC/DC converter controlled by a maximum power point tracking (MPPT) algorithm. In the literature, several algorithms have been proposed that differ in the number of inputs used, in response time and in accuracy. In this work, we tested a photovoltaic system feeding a traditional Moroccan house with four MPPT methods of four degrees of complexity, namely open-circuit voltage fraction, perturb and observe, incremental conductance, and an MPPT algorithm using fuzzy logic, together with a boost chopper that keeps the DC bus voltage constant. The results are analyzed, and they show that perturb and observe is a good choice for low-cost applications, while fuzzy logic is suitable when higher accuracy is required.
{"title":"Compartive study of MPPT methods for PV systems : Case of Moroccan house","authors":"A. Chakir, M. Tabaa, F. Moutaouakkil, H. Medromi, Karim Alami","doi":"10.1109/ICM.2018.8704035","DOIUrl":"https://doi.org/10.1109/ICM.2018.8704035","url":null,"PeriodicalId":305356,"journal":{"name":"2018 30th International Conference on Microelectronics (ICM)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114900480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
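Perturb and observe, the lowest-cost of the tracking methods compared above, can be sketched in a few lines: nudge the operating voltage, and reverse direction whenever the measured power drops. The example below is a minimal illustration under stated assumptions. The panel model `pv_power` and its parameters (short-circuit current, open-circuit voltage, curve-shape constant) are hypothetical and are not taken from the paper; the fixed step size is likewise an illustrative choice.

```python
import math
from typing import Callable, List


def pv_power(v: float, i_sc: float = 8.0, v_oc: float = 40.0, a: float = 3.0) -> float:
    """Toy single-peak P-V curve: a current source with an exponential knee near v_oc."""
    current = i_sc * (1.0 - math.exp((v - v_oc) / a))
    return v * max(current, 0.0)


def perturb_and_observe(power_fn: Callable[[float], float], v_start: float,
                        step: float, iterations: int) -> List[float]:
    """Return the sequence of operating voltages visited by a fixed-step P&O tracker."""
    v = v_start
    p_prev = power_fn(v)
    direction = 1.0
    history = [v]
    for _ in range(iterations):
        v += direction * step          # perturb the operating point
        p = power_fn(v)                # observe the new power
        if p < p_prev:
            direction = -direction     # power dropped: reverse the perturbation
        p_prev = p
        history.append(v)
    return history


history = perturb_and_observe(pv_power, v_start=20.0, step=0.5, iterations=200)
print(history[-1], pv_power(history[-1]))
```

Once the tracker reaches the peak it keeps oscillating around it by roughly one step, which illustrates the usual trade-off of a fixed-step tracker: a larger step converges faster but wastes more power in steady state.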
Pub Date : 2018-07-02 DOI: 10.1109/ICM.2018.8704093
P. Meloni, Daniela Loi, Gianfranco Deriu, A. Pimentel, Dolly Sapra, Maura Pintor, B. Biggio, Oscar Ripolles, David Solans, Francesco Conti, L. Benini, T. Stefanov, S. Minakova, Bernhard Moser, Natalia Shepeleva, M. Masin, F. Palumbo, N. Fragoulis, Ilias Theodorakopoulos
The use of Deep Learning (DL) algorithms is growing in many application domains. Despite the rapid growth of algorithm size and complexity, performing DL inference at the edge is becoming a clear trend to cope with low-latency, privacy and bandwidth constraints. Nevertheless, traditional implementation on low-energy computing nodes often requires experience-based manual intervention and trial-and-error iterations to reach a functional and effective solution. This work presents computer-aided design (CAD) support for the effective implementation of DL algorithms on embedded systems, aiming at automating different design steps and reducing cost. The proposed tool flow can consider architecture- and hardware-related variables at very early stages of the development process, from pre-training hyperparameter optimization and algorithm configuration to deployment, and can adequately address security, power efficiency and adaptivity requirements. This paper also presents some preliminary results obtained by the first implementation of the optimization techniques supported by the tool flow.
{"title":"Architecture-aware design and implementation of CNN algorithms for embedded inference: the ALOHA project","authors":"P. Meloni, Daniela Loi, Gianfranco Deriu, A. Pimentel, Dolly Sapra, Maura Pintor, B. Biggio, Oscar Ripolles, David Solans, Francesco Conti, L. Benini, T. Stefanov, S. Minakova, Bernhard Moser, Natalia Shepeleva, M. Masin, F. Palumbo, N. Fragoulis, Ilias Theodorakopoulos","doi":"10.1109/ICM.2018.8704093","DOIUrl":"https://doi.org/10.1109/ICM.2018.8704093","url":null,"PeriodicalId":305356,"journal":{"name":"2018 30th International Conference on Microelectronics (ICM)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2018-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121519785","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}