
2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS): latest publications

Neuromorphic networks on the SpiNNaker platform
G. Haessig, F. Galluppi, Xavier Lagorce, R. Benosman
This paper describes spike-based neural networks for optical flow and stereo estimation from Dynamic Vision Sensor data. These methods combine the Asynchronous Time-based Image Sensor with the SpiNNaker platform. The sensor generates spikes with sub-millisecond resolution in response to scene illumination changes. These spikes are processed by a spiking neural network running on SpiNNaker with a 1 millisecond resolution to accurately determine the order and time difference of spikes from neighboring pixels, and thereby infer velocity, direction, or depth. The spiking neural networks are variants of the Barlow-Levick method for optical flow estimation and of the Marr & Poggio algorithm for stereo matching.
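As a rough illustration of the time-difference principle behind such a Barlow-Levick-style network, the Python sketch below estimates local velocity from the spike timestamps of neighboring pixels. The function name, the pixel pitch, and the simple two-pixel rule are illustrative assumptions, not the SpiNNaker implementation; only the idea of quantizing spike times to the 1 ms network resolution and dividing distance by the resulting time difference is taken from the abstract.

```python
import numpy as np

def barlow_levick_velocity(t_spike, pixel_pitch_um=18.0, dt_ms=1.0):
    """Estimate horizontal velocity from per-pixel spike times (ms).

    t_spike: 1-D array of most recent spike timestamps for a row of pixels,
             np.nan where a pixel has not fired. Times are quantized to the
             assumed 1 ms network resolution before the difference is taken.
    Returns velocities (um/ms) for each neighboring pixel pair; positive
    means motion toward increasing pixel index.
    """
    t_q = np.round(t_spike / dt_ms) * dt_ms          # emulate 1 ms resolution
    dt = t_q[1:] - t_q[:-1]                          # time difference of neighbors
    with np.errstate(divide="ignore", invalid="ignore"):
        v = np.where(np.abs(dt) > 0, pixel_pitch_um / dt, np.nan)
    return v

# Example: an edge sweeping left-to-right, one spike per pixel every 4 ms.
times = np.array([0.0, 4.0, 8.0, 12.0, 16.0])
print(barlow_levick_velocity(times))                 # ~4.5 um/ms toward the right
```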
DOI: 10.1109/AICAS.2019.8771512 (published 2019-03-18)
Citations: 7
Robust Learning and Recognition of Visual Patterns in Neuromorphic Electronic Agents
Dongchen Liang, Raphaela Kreiser, Carsten Nielsen, Ning Qiao, Yulia Sandamirskaya, G. Indiveri
Mixed-signal analog/digital neuromorphic circuits are characterized by ultra-low power consumption, real-time processing abilities, and low-latency response times. These features make them promising for robotic applications that require fast and power-efficient computing. However, the variance inherent in analog circuits makes it challenging to develop neural processing architectures able to perform complex computations robustly. In this paper, we present a spiking neural network architecture with spike-based learning that enables robust learning and recognition of visual patterns on a noisy silicon neural substrate and in noisy environments. The architecture is used to perform pattern recognition and inference after a training phase with computers and neuromorphic hardware in the loop. We validate the proposed system in a closed-loop hardware setup composed of neuromorphic vision sensors and processors, and we present experimental results that quantify its real-time, robust perception and action behavior.
DOI: 10.1109/AICAS.2019.8771580 (published 2019-03-18)
Citations: 2
Low Precision Electroencephalogram for Seizure Detection with Convolutional Neural Network
N. D. Truong, O. Kavehei
Electroencephalogram (EEG) neural activity recording has been widely used for diagnosing and monitoring epileptic patients. Ambulatory epileptic monitoring devices that can detect or even predict seizures play an important role for patients with intractable epilepsy. Though many EEG-based seizure detection algorithms with high accuracy have been proposed in the literature, their hardware implementations are constrained by power consumption. Many commercial non-research EEG monitoring systems sample multiple electrodes at a relatively high rate and transmit the data either via a wire or wirelessly to an external signal processing unit. In this work, we study how reduced sampling precision affects the performance of our machine learning signal processing in seizure detection. To answer this question, we reduce the number of bits (precision) of the analog-to-digital converter (ADC) used in an EEG recorder. The outcome shows that reducing ADC precision down to 6 bits does not significantly degrade the performance of our convolutional neural network in detecting seizure onsets. As an indication of the performance, we achieve an area under the curve (AUC) of more than 92% and above 96% on the Freiburg Hospital and the Boston Children’s Hospital-MIT seizure datasets, respectively. A reduction in ADC precision not only contributes to lower energy consumption, particularly if the data has to be transmitted, but also improves computational efficiency in terms of memory requirements and circuit area.
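To illustrate the kind of precision reduction studied here, the sketch below re-quantizes an EEG trace to a given bit width so a downstream model can be evaluated on the lower-precision data. The full-scale range, function name, and toy signal are assumptions for illustration, not the authors' recorder pipeline.

```python
import numpy as np

def requantize(eeg_uv, n_bits, full_scale_uv=500.0):
    """Map an EEG trace (microvolts) onto the levels of an n_bits ADC.

    Clips to +/- full_scale_uv, quantizes to 2**n_bits uniform levels, and
    returns the reconstructed (de-quantized) signal.
    """
    levels = 2 ** n_bits
    step = 2 * full_scale_uv / levels
    clipped = np.clip(eeg_uv, -full_scale_uv, full_scale_uv - step)
    codes = np.round((clipped + full_scale_uv) / step)
    return codes * step - full_scale_uv

# Example: compare a high-precision trace with its 6-bit version.
t = np.linspace(0, 1, 256)
eeg = 120 * np.sin(2 * np.pi * 8 * t) + 10 * np.random.randn(t.size)
eeg_6bit = requantize(eeg, n_bits=6)
print("RMS quantization error (uV):", np.sqrt(np.mean((eeg - eeg_6bit) ** 2)))
```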
DOI: 10.1109/AICAS.2019.8771569 (published 2019-03-18)
Citations: 5
Deep Multi-Scale Residual Learning-based Blocking Artifacts Reduction for Compressed Images
Min-Hui Lin, C. Yeh, Chu-Han Lin, Chih-Hsiang Huang, Li-Wei Kang
Blocking artifacts, characterized by visually noticeable changes in pixel values along block boundaries, are a general problem in block-based image/video compression systems. Various post-processing techniques have been proposed to reduce blocking artifacts, but most of them introduce excessive blurring or ringing effects. This paper presents a deep learning-based compression artifact reduction (or deblocking) framework relying on multi-scale residual learning. Recent popular approaches usually train deep models with a per-pixel loss function and explicit image priors to directly produce deblocked images. Instead, we formulate the problem as learning the residuals (or the artifacts) between the original and the corresponding compressed images. In our deep model, each input image is first down-scaled, which naturally reduces blocking artifacts. A learned super-resolution (SR) convolutional neural network (CNN) is then used to up-sample the down-scaled version. Finally, the up-scaled version (with fewer artifacts) and the original input are fed into the learned artifact-prediction CNN to obtain the estimated blocking artifacts. As a result, the blocking artifacts can be successfully removed by subtracting the predicted artifacts from the input image while preserving most original visual details.
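A minimal PyTorch sketch of this down-scale / SR-up-sample / residual-prediction pipeline is given below. The tiny single-layer networks, the class names, and the scale factor are placeholders, not the architecture reported in the paper; only the data flow (down-scale, learned up-sample, predict artifacts from both versions, subtract from the input) follows the abstract.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinySRNet(nn.Module):
    """Placeholder for the learned super-resolution CNN."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 3, kernel_size=3, padding=1)

    def forward(self, x, scale=2):
        x = F.interpolate(x, scale_factor=scale, mode="bilinear", align_corners=False)
        return self.conv(x)

class TinyArtifactNet(nn.Module):
    """Placeholder for the artifact-prediction CNN (takes up-scaled + original)."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(6, 3, kernel_size=3, padding=1)

    def forward(self, upscaled, compressed):
        return self.conv(torch.cat([upscaled, compressed], dim=1))

def deblock(compressed, sr_net, art_net, scale=2):
    # 1) down-scale, which already suppresses block boundaries
    low = F.interpolate(compressed, scale_factor=1.0 / scale,
                        mode="bilinear", align_corners=False)
    # 2) up-sample back with the learned SR network
    upscaled = sr_net(low, scale=scale)
    # 3) predict the blocking artifacts and subtract them from the input
    artifacts = art_net(upscaled, compressed)
    return compressed - artifacts

compressed = torch.rand(1, 3, 64, 64)
print(deblock(compressed, TinySRNet(), TinyArtifactNet()).shape)  # torch.Size([1, 3, 64, 64])
```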
DOI: 10.1109/AICAS.2019.8771613 (published 2019-03-18)
Citations: 7
A Flexible and High-Performance Self-Organizing Feature Map Training Acceleration Circuit and Its Applications
Yuheng Sun, T. Chiueh
The self-organizing feature map (SOFM) is a type of artificial neural network based on an unsupervised learning algorithm. In this work, we present a circuit for accelerating SOFM training, which forms the foundation of an effective, efficient, and flexible SOFM training platform for different network geometries, including array, rectangular, and binary tree. FPGA validation was also conducted to examine the speedup of this circuit compared with software training. In addition, we applied our design to three applications: chromaticity diagram learning, MNIST handwritten numeral auto-labeling, and image vector quantization. All three experiments show that the proposed circuit architecture provides a high-performance and cost-effective solution for SOFM training.
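For reference, one SOFM training iteration (the computation such an accelerator speeds up) can be written in a few lines of Python. The Gaussian neighborhood, learning rate, and 4x4 rectangular grid below are common textbook choices for the standard SOFM algorithm, not details taken from the circuit.

```python
import numpy as np

def sofm_train_step(weights, grid, x, lr=0.1, sigma=1.0):
    """One SOFM update: find the best-matching unit, then pull its
    grid neighbors toward the input sample.

    weights: (n_units, dim) codebook vectors
    grid:    (n_units, 2) coordinates of each unit on the map
    x:       (dim,) input sample
    """
    bmu = np.argmin(np.linalg.norm(weights - x, axis=1))      # winner unit
    grid_dist2 = np.sum((grid - grid[bmu]) ** 2, axis=1)      # map-space distance
    h = np.exp(-grid_dist2 / (2 * sigma ** 2))                # neighborhood kernel
    weights += lr * h[:, None] * (x - weights)                # pull toward x
    return weights

# Example: a 4x4 rectangular map learning 3-D color vectors.
rng = np.random.default_rng(0)
grid = np.array([(i, j) for i in range(4) for j in range(4)], dtype=float)
weights = rng.random((16, 3))
for sample in rng.random((200, 3)):
    weights = sofm_train_step(weights, grid, sample)
print(weights.round(2))
```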
DOI: 10.1109/AICAS.2019.8771556 (published 2019-03-18)
Citations: 1
Complexity Reduction on HEVC Intra Mode Decision with modified LeNet-5
Hai-Che Ting, H. Fang, Jia-Shung Wang
The HEVC (H.265) standard was finalized in April 2013 and is currently the prevalent video coding standard. One key contributor to its performance gain over H.264 is intra prediction, which extends a large number of prediction directions over various sizes of prediction units (PUs), at the cost of very high computational complexity. Since HEVC emerged, several fast intra prediction and coding unit (CU) size decision algorithms have been developed for practical applications. These two components account for around 60% to 70% of the encoding time in all-intra HEVC encoding. In this paper, a novel CNN-based solution is proposed and evaluated. The main idea is to select a smallest set of adequate intra directions using our modified LeNet-5 CNN model, thus reducing the computational complexity of the subsequent rate-distortion optimization to a tolerable level. Besides, two filters are employed: the edge strength extractor in [4] and the early-terminated CU partition in [7], to skip most of the unlikely directions and to decrease the number of CUs, respectively. The experimental results demonstrate that the proposed method reduces computation by up to 66.59% with only a slight increase in bit-rate (1.1% on average) and a small reduction in picture quality (0.109% on average in PSNR).
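The candidate-pruning idea can be sketched as follows: score all intra directions with a classifier, keep only the top few, and run rate-distortion optimization on that reduced set. The hypothetical `mode_scores` callable and the placeholder RD-cost function below are illustrative stand-ins, not the paper's LeNet-5 variant or the HEVC reference encoder.

```python
import numpy as np

def prune_intra_modes(pu_block, mode_scores, rd_cost, num_candidates=4):
    """Keep only the classifier's top-scoring intra directions, then run
    rate-distortion optimization on that reduced candidate set.

    pu_block:     luma samples of one prediction unit (2-D array)
    mode_scores:  callable(block) -> length-35 score array (HEVC has 35 intra modes)
    rd_cost:      callable(block, mode) -> rate-distortion cost
    """
    scores = mode_scores(pu_block)
    candidates = np.argsort(scores)[-num_candidates:]          # top-k directions
    best_mode = min(candidates, key=lambda m: rd_cost(pu_block, m))
    return best_mode, candidates

# Toy stand-ins: random scores and an RD cost that prefers mode 26 (vertical).
rng = np.random.default_rng(1)
block = rng.integers(0, 256, size=(8, 8))
fake_scores = lambda b: rng.random(35)
fake_rd = lambda b, m: abs(int(m) - 26)
print(prune_intra_modes(block, fake_scores, fake_rd))
```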
DOI: 10.1109/AICAS.2019.8771586 (published 2019-03-18)
Citations: 6
Fast Convolution Algorithm for Convolutional Neural Networks
Tae-Sun Kim, Ji-Hoon Bae, M. Sunwoo
Recent advances in computing power, made possible by the development of faster general-purpose graphics processing units (GPGPUs), have increased the complexity of convolutional neural network (CNN) models. However, because of the limited applicability of existing GPGPUs, CNN accelerators are becoming more important. Current accelerators focus on improvements in memory scheduling and architectures; thus, the number of multiply-accumulate (MAC) operations is not reduced. In this study, a new convolution layer operation algorithm is proposed using a coarse-to-fine method instead of hardware or architecture approaches. This algorithm is shown to reduce the MAC operations by 33%, while the Top-1 accuracy decreases by only 3% and the Top-5 by only 1%.
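As a point of reference for the 33% figure, a standard convolution layer needs in_ch * k * k multiply-accumulates per output element. The small helper below computes that baseline for an arbitrary example layer; the coarse-to-fine algorithm itself is not reproduced here, since the abstract does not detail it.

```python
def conv_mac_count(in_ch, out_ch, out_h, out_w, k):
    """MAC operations of a standard (dense) convolution layer:
    every output element needs in_ch * k * k multiply-accumulates."""
    return out_ch * out_h * out_w * in_ch * k * k

# Example: a VGG-like 3x3 layer, 64 -> 128 channels on a 56x56 output map.
baseline = conv_mac_count(in_ch=64, out_ch=128, out_h=56, out_w=56, k=3)
print(f"baseline MACs: {baseline:,}")
print(f"after a 33% reduction: {int(baseline * (1 - 0.33)):,}")
```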
DOI: 10.1109/AICAS.2019.8771531 (published 2019-03-18)
Citations: 3
Memristor Emulators for an Adaptive DPE Algorithm: Comparative Study
Hussein Assaf, Y. Savaria, M. Sawan
Vector-matrix multiplication (VMM) is a complex operation requiring large computational power to complete one iteration. Resistive computing, including memristors, is one solution for speeding up VMM by reducing the multiplication process to a few steps regardless of matrix size. In this paper, we propose an Adaptive Dot Product Engine (ADPE) algorithm based on memristors for enhancing resistive computing in VMM. In preliminary results, the algorithm showed 5% error with one on-line training step for a one-layer memristor crossbar array circuit. However, memristors require new fabrication technologies, and the design and validation of systems using these devices remain challenging. A comparison of various available circuits emulating a memristor suitable for the ADPE is presented, and the models are compared based on chip size, circuit elements used, and operating frequency.
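The underlying crossbar operation is Ohm's law plus Kirchhoff's current law: with input voltages on the rows and conductances at the cross-points, each column current is a dot product. The numpy sketch below shows this mapping with an optional conductance fluctuation term to mimic device variability; the value ranges are illustrative assumptions, not measurements from the emulators compared in the paper.

```python
import numpy as np

def crossbar_vmm(voltages, conductances, fluctuation_std=0.0, rng=None):
    """Analog vector-matrix multiply on a memristor crossbar.

    voltages:     (rows,) input voltages applied to the word lines
    conductances: (rows, cols) programmed memristor conductances (siemens)
    Each bit-line current is sum_i V_i * G_ij, optionally perturbed by a
    relative conductance fluctuation to mimic device variability.
    """
    rng = rng or np.random.default_rng()
    g = conductances * (1 + fluctuation_std * rng.standard_normal(conductances.shape))
    return voltages @ g            # column currents (amperes)

rng = np.random.default_rng(0)
v = np.array([0.2, 0.1, 0.3])                       # volts
g = rng.uniform(1e-6, 1e-4, size=(3, 4))            # siemens
print("ideal:              ", crossbar_vmm(v, g))
print("with 5% fluctuation:", crossbar_vmm(v, g, fluctuation_std=0.05, rng=rng))
```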
DOI: 10.1109/AICAS.2019.8771594 (published 2019-03-18)
Citations: 2
On-chip Learning of Multilayer Perceptron Based on Memristors with Limited Multilevel States
Yuhang Zhang, Guanghui He, K. Tang, Guoxing Wang
The cross-point memristor array is viewed as a promising candidate for neuromorphic computing due to its non-volatile storage and parallel computing features. However, the programming threshold and the resistance fluctuation among different multilevel states restrict the capacity of weight representation and thus the numerical precision. This poses great challenges for on-chip learning. This work evaluates the deterioration of learning accuracy of a multilayer perceptron due to limited multilevel states and proposes a stochastic "skip-and-update" algorithm to facilitate on-chip learning with low-precision memristors.
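The abstract does not spell out the update rule. One common way to train with coarse weight levels is to apply a sub-threshold update stochastically, with probability proportional to its magnitude, so small gradients are not silently lost; the sketch below implements that generic idea under those assumptions and should not be read as the authors' exact skip-and-update algorithm.

```python
import numpy as np

def stochastic_coarse_update(weights, grad, lr, step, rng):
    """Apply SGD updates to weights that can only change in multiples of `step`
    (the memristor programming granularity).

    Updates larger than one step are applied as whole programmable steps;
    the sub-threshold remainder is applied as a single +/- step with
    probability |remainder| / step and skipped otherwise, so its expected
    value is preserved.
    """
    update = -lr * grad
    n_steps = np.trunc(update / step)                      # whole programmable steps
    frac = update / step - n_steps                         # sub-threshold remainder
    apply = rng.random(weights.shape) < np.abs(frac)       # stochastic skip-or-update
    n_steps += np.sign(frac) * apply
    return weights + n_steps * step

rng = np.random.default_rng(0)
w = np.zeros(5)
g = np.array([0.9, -0.9, 0.05, -0.05, 0.0])
print(stochastic_coarse_update(w, g, lr=0.1, step=0.02, rng=rng))
```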
DOI: 10.1109/AICAS.2019.8771513 (published 2019-03-18)
Citations: 5
Artificial Intelligence of Things Wearable System for Cardiac Disease Detection
Yu-Jin Lin, Chen-Wei Chuang, Chun-Yueh Yen, Sheng-Hsin Huang, Peng-Wei Huang, Ju-Yi Chen, Shuenn-Yuh Lee
This study proposes an artificial intelligence of things (AIoT) system for electrocardiogram (ECG) analysis and cardiac disease detection. The system includes front-end IoT-based hardware, a user interface in a smart device application (APP), a cloud database, and an AI platform for cardiac disease detection. The front-end IoT-based hardware, a wearable ECG patch that includes an analog front-end circuit and a Bluetooth module, detects ECG signals. The APP on smart devices can not only display users’ real-time ECG signals but also label unusual signals instantly, achieving real-time disease detection. These ECG signals are uploaded to the cloud database, which stores each user’s ECG signals and forms a big-data corpus for the AI algorithm to detect cardiac disease. The algorithm proposed in this study is based on a convolutional neural network, and its average accuracy is 94.96%. The ECG dataset used in this study was collected from patients at Tainan Hospital, Ministry of Health and Welfare, and signal verification was also performed by a cardiologist.
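For context, an ECG classifier of this kind is typically a small 1-D CNN over fixed-length single-lead windows. The PyTorch toy below only illustrates the shape of such a model; the layer sizes, window length, and two-class output are arbitrary choices, not the network described in the paper.

```python
import torch
import torch.nn as nn

class ToyECGNet(nn.Module):
    """Tiny 1-D CNN over a fixed-length single-lead ECG window."""
    def __init__(self, window_len=360, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=7, padding=3), nn.ReLU(), nn.MaxPool1d(4),
            nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool1d(4),
        )
        self.classifier = nn.Linear(32 * (window_len // 16), n_classes)

    def forward(self, x):                 # x: (batch, 1, window_len)
        z = self.features(x)
        return self.classifier(z.flatten(1))

# One 1-second window sampled at 360 Hz, batch of 8.
model = ToyECGNet()
print(model(torch.randn(8, 1, 360)).shape)   # torch.Size([8, 2])
```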
DOI: 10.1109/AICAS.2019.8771630 (published 2019-03-18)
Citations: 16