
Latest publications — 2020 International SoC Design Conference (ISOCC)

Extraction of ROM Data from Bitstream in Xilinx FPGA
Pub Date : 2020-10-21 DOI: 10.1109/ISOCC50952.2020.9333036
Soyeon Choi, Jieun Yeo, Hoyoung Yoo
Recently, many studies have investigated efficient reverse engineering methods to restore Programmable Logic Points (PLPs) and Programmable Interconnect Points (PIPs) in SRAM-based Field Programmable Gate Arrays (FPGAs). However, the restoration of Programmable Content Points (PCPs), such as memory data, has rarely been studied. In this paper, we propose an efficient reverse engineering method to recover Read Only Memory (ROM) data, which is essential for the implementation of modern digital circuits. First, we analyze the FPGA hardware resources mapped to the Xilinx ROM primitive library, and then the proposed reverse engineering process is explained using the mapping relation between ROM data and hardware resources. As an example, the XC3S50 FPGA of the Xilinx Spartan-3 family is utilized, and the process of restoring the SBOX of AES (Advanced Encryption Standard) is provided as a practical application.
Cited by: 0
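The core mapping the abstract relies on — a ROM primitive realized as a LUT whose data word is stored in the LUT's INIT attribute — can be sketched in Python. The INIT value used below is a made-up example, not data from the paper, and locating INIT bits inside a real Spartan-3 bitstream additionally requires the device-specific frame layout:

```python
def lut_init_to_rom(init_value, depth=16):
    """Recover the contents of a ROM16X1-style primitive from its LUT
    INIT value: reading address a returns bit a of INIT.

    This illustrates the ROM-data/hardware-resource mapping the paper
    exploits; it is a sketch, not the authors' full recovery flow.
    """
    return [(init_value >> addr) & 1 for addr in range(depth)]

# Hypothetical 16-bit INIT word 0xA5C3 -> 16 one-bit ROM entries.
rom = lut_init_to_rom(0xA5C3)
assert rom[:4] == [1, 1, 0, 0]  # low nibble 0x3 = 0011, LSB first
```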
Sequential Compression Using Efficient LUT Correlation for Display Defect Compensation
Pub Date : 2020-10-21 DOI: 10.1109/ISOCC50952.2020.9332953
Jooyong Choi, Suk-ju Kang, Minjoo Lee, Jun-Young Park, Jiwoong Lee
In this paper, we propose a novel sequential compression method for the coefficient look-up table (LUT) used to compensate for display panel defects. First, the regression LUT coefficient data for defect compensation is rearranged according to the display ratio. It is split and compressed into sub-blocks to exploit the correlation of the rearranged data. Then, to further increase the compression ratio, the initially compressed data is compressed again through differential-coding-based compression. In the performance evaluation, the proposed method incurs the same loss as the existing compression method while increasing the compression ratio by up to 46.9%.
Cited by: 0
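The second compression stage described above, differential coding, can be sketched as follows; the coefficient values are hypothetical, and the paper's exact sub-block layout and first-stage coding are not reproduced:

```python
def delta_encode(values):
    """Differential coding: keep the first value, then store successive
    differences. Neighboring LUT coefficients are strongly correlated,
    so the deltas cluster near zero and compress better than raw data.
    """
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]

def delta_decode(deltas):
    """Invert delta_encode with a running prefix sum."""
    out, acc = [], 0
    for d in deltas:
        acc += d
        out.append(acc)
    return out

coeffs = [100, 102, 101, 105, 104]      # hypothetical coefficient row
encoded = delta_encode(coeffs)
assert encoded == [100, 2, -1, 4, -1]   # small deltas around zero
assert delta_decode(encoded) == coeffs  # lossless round trip
```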
A Design of 5.8GHz Tunable Band Noise Cancelling CMOS LNA for DSRC Communications
Pub Date : 2020-10-21 DOI: 10.1109/ISOCC50952.2020.9332934
Dong Won Lee, Kangyoon Lee
This article presents a 5.8GHz noise-cancelling CMOS LNA for DSRC communication. The LNA is designed with a differential output using a balun architecture and a resistive-feedback noise-cancelling technique. A tunable load capacitor bank achieves wideband input matching and gain selection. The LNA is implemented in 130nm CMOS technology and achieves a simulated gain of 24.2dB, a P1dB of -13.46dB, and a noise figure (NF) of 2.74dB at the center frequency. The power consumption is 10.51mW from a 1.2V supply. The chip area is 509 × 559 µm².
Cited by: 0
A Reconfigurable Approximate Floating-Point Multiplier with kNN
Pub Date : 2020-10-21 DOI: 10.1109/ISOCC50952.2020.9332978
Younggyun Cho, Mi Lu
Due to the high demand for computing, available resources are always scarce. Approximate computing is a key technique for lowering hardware complexity and improving energy efficiency and performance. However, it is a challenge to properly design approximate multipliers since input data are unseen to users. This challenge can be overcome by Machine Learning (ML) classifiers, which can predict detailed features of upcoming input data. Previous approximate multipliers were designed using simple adders guided by ML classifiers, but with a simple adder-based approximate multiplier the level of approximation cannot be changed at runtime. To overcome this drawback, this paper proposes using an accumulator and reconfigurable adders instead of simple adders. In addition, a rounding technique is applied to the approximate floating-point multipliers for further improvement. Our experimental results show that when the error tolerance of the target application is less than 5%, the proposed approximate multiplier can save area by 70.98%; when the error tolerance is less than 3%, a rounding-enhanced simple adder-based approximate multiplier can save area by 65.9%, and a reconfigurable adder-based approximate multiplier with rounding can reduce the average delay and energy by 54.95% and 46.67%, respectively, compared to an exact multiplier.
Cited by: 2
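The runtime-adjustable approximation level the abstract describes can be modeled in software by truncating each operand's mantissa to a selectable width; in the paper a kNN classifier would choose that width per input. The sketch below models the idea with IEEE-754 doubles and is not the authors' circuit:

```python
import struct

def approx_fp_mul(a, b, keep_bits):
    """Approximate floating-point multiply: truncate each operand's
    mantissa to keep_bits high bits before multiplying. keep_bits
    stands in for the runtime-selectable approximation level.
    """
    def truncate(x, bits):
        # Zero out the low (52 - bits) mantissa bits of an IEEE-754 double.
        u = struct.unpack('<Q', struct.pack('<d', x))[0]
        mask = ~((1 << (52 - bits)) - 1) & 0xFFFFFFFFFFFFFFFF
        return struct.unpack('<d', struct.pack('<Q', u & mask))[0]
    return truncate(a, keep_bits) * truncate(b, keep_bits)

# Values whose mantissas fit in the kept bits are multiplied exactly.
assert approx_fp_mul(2.0, 3.0, 8) == 6.0
# Otherwise the relative error is bounded by roughly 2**(1 - keep_bits).
exact = 3.14159 * 2.71828
assert abs(approx_fp_mul(3.14159, 2.71828, 8) - exact) / exact < 0.01
```

Raising `keep_bits` trades error for (in hardware) a wider multiplier, which is the reconfigurability the paper targets.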
Fast Prototyping of a Deep Neural Network on an FPGA
Pub Date : 2020-10-21 DOI: 10.1109/ISOCC50952.2020.9333030
Wonjong Kim, Hyegang Jun
This paper describes a prototyping methodology for implementing deep neural network (DNN) models in hardware. From a DNN model developed in the C or C++ programming language, we develop a hardware architecture using a SoC virtual platform and verify the functionality on an FPGA board. This demonstrates the viability of using FPGAs to accelerate specific applications written in a high-level language. Using the High-Level Synthesis tools provided by Xilinx [3], it is shown to be possible to implement an FPGA design that runs the inference calculations required by the MobileNetV2 [1] deep neural network. HDL code could be synthesized directly from the original C++ code developed for the software implementation of MobileNetV2 with minimal alterations, dramatically reducing the complexity of the project. Consequently, when the design was implemented on an FPGA, upwards of a 5 times increase in speed was realized compared to a similar processor (ARM7).
Cited by: 1
Effective Software Scheme of the Space Vector Modulation Using One-Chip Micro-Controller
Pub Date : 2020-10-21 DOI: 10.1109/ISOCC50952.2020.9333010
W. Choi
An effective software scheme has been proposed for implementing the space vector modulation (SVM) of the three-phase split-output inverter (SOI). The conventional SVM algorithm needs complex computations such as square root and arctangent. The SVM proposed in this paper directly determines the conduction times of the power switches of the three-phase SOI without requiring such complex computations. A one-chip micro-controller has been utilized to implement the proposed algorithm. Experimental results on a 1.0 kW prototype inverter system have verified the effectiveness of the proposed approach.
Cited by: 0
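A standard way to avoid the arctangent in SVM — consistent in spirit with, though not necessarily identical to, the scheme in the paper — is to find the reference vector's sector from sign tests alone; the conduction-time formulas themselves are not reproduced here:

```python
import math

def svm_sector(v_alpha, v_beta):
    """Locate the SVM sector (1..6) of the reference vector using only
    sign tests -- no runtime arctangent or square root of a variable.
    This is the widely used sign-code method.
    """
    # Project the alpha/beta vector onto the three phase axes.
    # sqrt(3) is a precomputed constant, not a runtime square root.
    a = v_beta
    b = (math.sqrt(3.0) * v_alpha - v_beta) / 2.0
    c = (-math.sqrt(3.0) * v_alpha - v_beta) / 2.0
    n = (a > 0) + 2 * (b > 0) + 4 * (c > 0)
    # Map the 3-bit sign code onto the sector number.
    return {3: 1, 1: 2, 5: 3, 4: 4, 6: 5, 2: 6}[n]

# A reference vector at 30 degrees lies in sector 1 (0..60 degrees).
assert svm_sector(math.cos(math.radians(30)), math.sin(math.radians(30))) == 1
```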
Compact CNN Training Accelerator with Variable Floating-Point Datapath
Pub Date : 2020-10-21 DOI: 10.1109/ISOCC50952.2020.9332986
Jiun Hong, TaeGeon Lee, Saad Arslan, Hyungwon Kim
This paper presents a compact architecture for a CNN training accelerator targeted at mobile devices. The accuracy of the CNN structure was verified in Python, and several data types were compared to find an optimized one. In addition, floating-point operations are used in the computation of the CNN structure, and to implement them we have created and verified floating-point addition, subtraction, and multiplication circuits. The CNN architecture was verified using Python, the floating-point operations were verified using Vivado, and the area was verified in TSMC 180nm technology.
Cited by: 1
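What a hardware floating-point multiplier datapath does can be sketched bit-exactly in software: split each operand into sign, exponent, and mantissa fields, multiply the mantissas, add the exponents, and normalize. The half-precision sketch below omits subnormals, infinities, NaNs, and rounding to stay short — a real datapath (and the authors' circuits) must handle them — and is an illustration, not the authors' design:

```python
def fp16_mul(a_bits, b_bits):
    """Multiply two IEEE-754 half-precision numbers given as raw 16-bit
    words: sign XOR, mantissa multiply, exponent add, normalization.
    Normalized finite inputs only; result mantissa is truncated.
    """
    def unpack(x):
        return (x >> 15) & 1, (x >> 10) & 0x1F, x & 0x3FF
    sa, ea, ma = unpack(a_bits)
    sb, eb, mb = unpack(b_bits)
    sign = sa ^ sb
    # Restore the implicit leading 1, multiply the 11-bit mantissas.
    prod = (ma | 0x400) * (mb | 0x400)
    exp = ea + eb - 15                      # remove one bias of 15
    if prod & (1 << 21):                    # product in [2, 4): normalize
        prod >>= 1
        exp += 1
    mant = (prod >> 10) & 0x3FF             # truncate (no rounding)
    return (sign << 15) | ((exp & 0x1F) << 10) | mant

assert fp16_mul(0x4000, 0x4200) == 0x4600  # 2.0 * 3.0 == 6.0
assert fp16_mul(0x3E00, 0x3E00) == 0x4080  # 1.5 * 1.5 == 2.25
```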
50 MHz 3-Level Buck Converter with added Boost Converter
Pub Date : 2020-10-21 DOI: 10.1109/ISOCC50952.2020.9333088
S. SunitaM., D. MayurG., Preet Bedi, Nagesh Verma, Shashidhar Tantry
This paper presents a 50MHz, 3.5V-input, 0.6–3.12V-output, 3-level buck converter that includes a calibration circuit to ensure a constant voltage across the flying capacitor. Additionally, a boost converter is designed to increase the maximum output voltage level to 5.8V. All circuits and their blocks are designed and simulated in 45nm CMOS technology on Cadence Virtuoso. Peak efficiency is observed to be 90.3% for the 3-level buck converter and 93.6% for the modified boost converter.
Cited by: 0
A Two-stage Training Mechanism for the CNN with Trainable Activation Function
Pub Date : 2020-10-21 DOI: 10.1109/ISOCC50952.2020.9333116
K. Chen, Jing-Wen Liang
Activation function design is critical in the convolutional neural network (CNN) because it affects the learning speed and the precision of classification. In a hardware implementation, using a traditional activation function may cause large hardware area overhead due to complicated calculations such as exponentials. To reduce the hardware overhead, Taylor series expansion is a popular way to approximate the traditional activation function. However, this approach introduces approximation errors, which reduce the accuracy of the CNN model. Therefore, a trainable activation function and a two-stage training mechanism are proposed in this paper to compensate for the accuracy loss due to the Taylor series expansion. After the trainable activation function is initialized, its coefficients in each neural network layer are adjusted along with the training process. Compared with the conventional approach, the proposed trainable activation function requires fewer Taylor expansion terms while improving classification accuracy by 2.24% to 53.96%. Therefore, a CNN with trainable activation functions can achieve better classification accuracy at less area cost.
Cited by: 1
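The idea of a polynomial (Taylor-style) activation with trainable per-layer coefficients can be sketched as follows. The initialization from the tanh Taylor series is an illustrative assumption — the abstract does not state which activation is approximated:

```python
import math

def taylor_activation(x, coeffs):
    """Polynomial activation: sum_i coeffs[i] * x**i.

    Fixing coeffs to a truncated Taylor series approximates the target
    activation; making them trainable per layer lets training absorb
    the truncation error, which is the mechanism the paper describes.
    """
    return sum(c * x ** i for i, c in enumerate(coeffs))

# Initialization from the tanh series (illustrative assumption):
# tanh(x) = x - x^3/3 + 2x^5/15 - ...
tanh_init = [0.0, 1.0, 0.0, -1.0 / 3.0, 0.0, 2.0 / 15.0]

# Near zero the truncated series tracks tanh to better than 1e-3.
worst = max(abs(taylor_activation(k / 10, tanh_init) - math.tanh(k / 10))
            for k in range(-5, 6))
assert worst < 1e-3
```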
Implementation of a Round Robin Processing Element for Deep Learning Accelerator
Pub Date : 2020-10-21 DOI: 10.1109/ISOCC50952.2020.9333012
Eunchong Lee, Yongseok Lee, Sang-Seol Lee, Byoung-Ho Choi
Deep learning acceleration hardware performance is greatly affected by the Processing Elements (PEs). To apply deep learning accelerators to mobile devices, an optimized PE must be designed as an ASIC. To improve PE performance, we focused on methods for minimizing external memory access and on parallelization. As a result, a deep learning accelerator architecture consisting of 512 PEs operating in parallel is proposed, and the results of an FPGA implementation are presented.
Cited by: 1