Pub Date : 2019-03-18 DOI: 10.1109/AICAS.2019.8771472
Iulia-Alexandra Lungu, Shih-Chii Liu, T. Delbrück
This paper describes a hand symbol recognition system that can quickly be trained to incrementally learn new symbols using about 100 times less data and time than conventional training. It is driven by frames from a Dynamic Vision Sensor (DVS) event camera. Conventional cameras produce highly redundant output, especially at high frame rates. Dynamic vision sensors instead output sparse, asynchronous brightness-change events that occur when an object or the camera is moving. Images consisting of a fixed number of DVS events drive recognition and incremental learning of new hand symbols in the context of a RoShamBo (rock-paper-scissors) demonstration. Conventional training on the original RoShamBo dataset requires about 12.5 h of compute time on a desktop GPU using the 2.5 million images in the base dataset. Novel symbols that a user shows to the system for a few tens of seconds can be learned on the fly using the iCaRL incremental learning algorithm with 3 minutes of training time on a desktop GPU, while preserving recognition accuracy on previously trained symbols. Our system runs a residual network with 32 layers and, after 4 incremental training stages, maintains an overall accuracy of 88.4% after 100 epochs, or 77% after 5 epochs. Each stage adds 2 novel symbols to the 4 base symbols. The paper also reports an inexpensive robot hand used for live demonstrations of the base RoShamBo game.
Title : Fast event-driven incremental learning of hand symbols
Venue : 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS)
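The abstract above mentions images built from a fixed number of DVS events. A minimal sketch of that idea follows, assuming each event is an `(x, y, polarity)` tuple and that a frame is a simple per-pixel event count; the function name `frames_from_events`, the event layout, and the choice to ignore polarity are illustrative assumptions, not details taken from the paper.

```python
from typing import Iterable, Iterator, List, Tuple

# Assumed event layout: (x, y, polarity), with polarity in {+1, -1}.
Event = Tuple[int, int, int]


def frames_from_events(
    events: Iterable[Event], width: int, height: int, events_per_frame: int
) -> Iterator[List[List[int]]]:
    """Yield 2D count histograms, each built from exactly events_per_frame events."""
    frame = [[0] * width for _ in range(height)]
    count = 0
    for x, y, _polarity in events:
        frame[y][x] += 1  # polarity is ignored in this sketch
        count += 1
        if count == events_per_frame:
            yield frame
            frame = [[0] * width for _ in range(height)]
            count = 0
    # A trailing partial frame (fewer than events_per_frame events) is dropped.


demo_events = [(0, 0, 1), (1, 0, -1), (1, 0, 1), (1, 1, 1), (0, 1, -1)]
demo_frames = list(frames_from_events(demo_events, width=2, height=2, events_per_frame=2))
```

Because each frame closes after a fixed event count rather than a fixed time interval, the frame rate follows scene activity: fast motion yields many frames per second, and a static scene yields none.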
Pub Date : 2019-03-18 DOI: 10.1109/AICAS.2019.8771558
Jose Granados, Haoming Chu, Z. Zou, Lirong Zheng
Internet of Things (IoT) applications for healthcare are among the most studied topics in the research landscape, owing to the promise of more efficient resource allocation for hospitals and their use as a companion tool for health professionals. Yet the requirements for low power, low latency, and knowledge extraction from the large amount of physiological data generated remain a challenge for the research community. In this work, we examine the balance between power consumption, performance, and latency among the edge, gateway, fog, and cloud layers of an IoT medical platform featuring inference by Deep Learning models. We set up an IoT architecture that acquires multichannel electrocardiogram (ECG) signals and classifies them into normal or abnormal states, where an abnormal state could represent a clinically relevant condition. The architecture combines custom embedded devices with contemporary open-source machine learning packages such as TensorFlow. Different hardware platforms are tested to find the best compromise in terms of convenience, latency, power consumption, and performance. Our experiments indicate that the real-time requirements are fulfilled; however, energy expenditure needs to be reduced by incorporating low-power SoCs with integrated neuromorphic blocks.
Title : Towards Workload-Balanced, Live Deep Learning Analytics for Confidentiality-Aware IoT Medical Platforms
Venue : 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS)
Pub Date : 2019-03-18 DOI: 10.1109/AICAS.2019.8771482
Xiaoyu Feng, Jinshan Yue, Qingwei Guo, Huazhong Yang, Yongpan Liu
Emerging artificial intelligence brings new opportunities for embedded machine health monitoring systems. However, previous work mainly focuses on algorithm improvement and ignores software-hardware co-design. This paper proposes a CNN-RNN algorithm for remaining useful life (RUL) prediction, with hardware optimization for practical deployment. The CNN-RNN algorithm combines the feature extraction ability of CNNs with the sequential processing ability of RNNs, and shows a 23%–53% improvement on the CMAPSS dataset. The algorithm also takes hardware implementation overhead into account, and an FPGA-based accelerator is developed. The accelerator adopts a kernel-optimized design to exploit data reuse and reduce memory accesses. It enables real-time response and 5.89 GOPs/W energy efficiency with small size and cost overhead. The FPGA implementation shows a 15× CNN speedup and a 9× overall speedup compared with the embedded Cortex-A9 processor.
Title : Accelerating CNN-RNN Based Machine Health Monitoring on FPGA
Venue : 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS)
Pub Date : 2019-03-18 DOI: 10.1109/AICAS.2019.8771469
Yi-Heng Wu, Heng Lee, Yu Sheng Lin, Shao-Yi Chien
In recent years, deep convolutional neural networks (CNNs) have achieved ground-breaking success in many computer vision research fields. Because of their large model size and heavy computation, CNNs cannot be executed efficiently on small devices such as mobile phones. Although several hardware accelerator architectures have been developed, most of them can efficiently address only one of the two major layer types in a CNN: convolutional (CONV) and fully connected (FC) layers. In this paper, based on algorithm-architecture co-exploration, our architecture targets executing both layer types with high efficiency. Vector quantization is first selected to compress the parameters, reduce the computation, and unify the behavior of CONV and FC layers. To fully exploit the gain from vector quantization, we then propose an accelerator architecture for quantized CNNs. Different DRAM access schemes are employed to reduce DRAM access. We also design a high-throughput processing element architecture to accelerate quantized layers. Compared to previous CNN accelerators, the proposed architecture achieves 1.2–5x less DRAM access and 1.5–5x higher throughput for both CONV and FC layers.
Title : Accelerator Design for Vector Quantized Convolutional Neural Network
Venue : 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS)
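To illustrate how vector quantization can both compress parameters and reduce computation, here is a minimal sketch of a fully connected layer whose weights are stored as codebook indices. It assumes a product-quantization style scheme in which the input is split into equal sub-vectors and each output neuron references one codeword per sub-vector; this specific scheme and the names `quantized_fc`, `codebooks`, and `indices` are assumptions for illustration and may differ from the quantization used in the paper.

```python
from typing import List, Sequence


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def quantized_fc(
    x: Sequence[float],
    codebooks: List[List[List[float]]],
    indices: List[List[int]],
) -> List[float]:
    """Evaluate an FC layer whose weight rows are concatenations of codewords.

    codebooks[s][k] is codeword k (length d) for input sub-vector s.
    indices[o][s] is the codeword index used by output neuron o in sub-vector s.
    """
    d = len(codebooks[0][0])
    # One inner product per (sub-vector, codeword), shared by all output neurons.
    tables = [
        [dot(x[s * d:(s + 1) * d], codeword) for codeword in codebook]
        for s, codebook in enumerate(codebooks)
    ]
    # Each output is then a sum of table lookups, with no further multiplications.
    return [sum(tables[s][k] for s, k in enumerate(row)) for row in indices]


demo_x = [1.0, 2.0, 3.0, 4.0]
demo_codebooks = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 1.0], [2.0, 0.0]]]
demo_indices = [[0, 1], [1, 0]]  # dense rows would be [1,0,2,0] and [0,1,1,1]
demo_out = quantized_fc(demo_x, demo_codebooks, demo_indices)
```

In this sketch the multiplications depend only on the number of codewords, not on the number of output neurons, and the stored weights shrink to small integer indices. A convolution expressed as a matrix product over input patches could reuse the same lookup structure, which is one way to read the abstract's claim that quantization unifies the two layer types.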
Pub Date : 2019-03-18 DOI: 10.1109/AICAS.2019.8771617
Sungpill Choi, Kyeongryeol Bong, Donghyeon Han, H. Yoo
An energy-efficient memory-centric convolutional neural network (CNN) processor architecture is proposed for smart devices such as wearable devices or Internet of Things (IoT) devices. To achieve energy-efficient processing, it has two key features. First, 1-D shift convolution PEs with a fully distributed memory architecture achieve 3.1 TOPS/W energy efficiency. Compared with conventional architectures, even though it has 1024 massively parallel MAC units, it achieves high energy efficiency by scaling the voltage down to 0.46 V thanks to its fully locally routed design. Second, a fully configurable 2-D mesh core-to-core interconnection supports input features of various sizes to maximize utilization. The proposed architecture is evaluated on a 16 mm² chip fabricated in a 65 nm CMOS process, and it performs real-time face recognition with only 9.4 mW at 10 MHz and 0.48 V.
Title : CNNP-v2: An Energy Efficient Memory-Centric Convolutional Neural Network Processor Architecture
Venue : 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS)
Pub Date : 2019-03-18 DOI: 10.1109/AICAS.2019.8771610
Vincent Camus, C. Enz, M. Verhelst
The current trend for deep learning has come with an enormous computational need for billions of Multiply-Accumulate (MAC) operations per inference. Fortunately, reduced precision has demonstrated large benefits with low impact on accuracy, paving the way towards processing in mobile devices and IoT nodes. Precision-scalable MAC architectures optimized for neural networks have recently gained interest thanks to their subword parallel or bit-serial capabilities. Yet, it has been hard to make a fair judgment of their relative benefits as they have been implemented with different technologies and performance targets. In this work, run-time configurable MAC units from ISSCC 2017 and 2018 are implemented and compared objectively under diverse precision scenarios. All circuits are synthesized in a 28nm commercial CMOS process with precision ranging from 2 to 8 bits. This work analyzes the impact of scalability and compares the different MAC units in terms of energy, throughput and area, aiming to understand the optimal architectures to reduce computation costs in neural-network processing.
Title : Survey of Precision-Scalable Multiply-Accumulate Units for Neural-Network Processing
Venue : 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS)
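As a rough illustration of the bit-serial capability mentioned in the abstract above, the following sketch accumulates a product one weight bit at a time, so the number of steps scales with the configured weight precision. It is a behavioral model for unsigned operands only; the function name `bit_serial_mac` is hypothetical and the code does not reproduce any of the surveyed ISSCC designs.

```python
def bit_serial_mac(acc: int, activation: int, weight: int, weight_bits: int) -> int:
    """Return acc + activation * weight, consuming one weight bit per step.

    Operands are unsigned; weight must fit in weight_bits bits.
    """
    if not 0 <= weight < (1 << weight_bits):
        raise ValueError("weight does not fit in the configured precision")
    for bit in range(weight_bits):
        if (weight >> bit) & 1:
            acc += activation << bit  # shift-and-add partial product
    return acc


# Dot product of [3, 5] and [2, 7] at 4-bit weight precision.
demo_acc = 0
for a, w in zip([3, 5], [2, 7]):
    demo_acc = bit_serial_mac(demo_acc, a, w, weight_bits=4)
```

Lowering `weight_bits` from 8 to 2 cuts the loop from eight steps to two, which is the kind of run-time trade-off between precision and cost that a precision-scalable unit exposes.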
Pub Date : 2019-03-18 DOI: 10.1109/AICAS.2019.8771614
Huang-Yu Yao, Hsuan-Pei Huang, Yu-Chi Huang, C. Lo
Spiking neural networks (SNNs) are regarded by many as the “third generation network” that will solve computation problems in a more biologically realistic way. In our project, we design a robotic platform controlled by a user-defined SNN in order to develop a next-generation artificial intelligence robot with high flexibility. This paper describes the preliminary progress of the project. We first implement a simple decision network, with which the robot is able to perform a basic but vital foraging and risk-avoiding task. Next, we implement the neural network of the fruit fly central complex in order to endow the robot with spatial orientation memory, a crucial function underlying spatial navigation.
Title : Flyintel – a Platform for Robot Navigation based on a Brain-Inspired Spiking Neural Network
Venue : 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS)
Pub Date : 2019-03-18 DOI: 10.1109/AICAS.2019.8771520
Yuhuang Hu, Hong Ming Chen, T. Delbrück
Slasher is the first open 1/10 scale autonomous driving platform for exploring the use of neuromorphic event cameras for fast driving in unstructured indoor and outdoor environments. Slasher features a DAVIS event-based camera and ROS computer for perception and control. The DAVIS camera provides high dynamic range, sparse output, and sub-millisecond latency output for the quick visual control needed for fast driving. A race controller and Bluetooth remote joystick are used to coordinate different processing pipelines, and a low-cost ultra-wide-band (UWB) positioning system records trajectories. The modular design of Slasher can easily integrate additional features and sensors. In this paper, we show its application in a reflexive Convolutional Neural Network (CNN) steering controller trained by end-to-end learning. We present preliminary experiments in closed-loop indoor and outdoor trail driving.
Title : Slasher: Stadium racer car for event camera end-to-end learning autonomous driving experiments
Venue : 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS)
In this paper, we propose a novel sleep apnea identification system based on a sleep breathing monitoring mattress that uses the ultra-wideband (UWB) physiological sensing technique. Unlike traditional methods, which require wearable devices and electrical equipment connected to the patient, the proposed system uses UWB sensors to detect apnea in a non-conscious and non-contact way. The system is built using a machine learning technique in the offline stage, and detects apnea in the online stage using our apnea detection algorithm. The experimental results show that the proposed system efficiently detects sleep apnea without requiring diagnosis at a hospital.
Title : Novel Sleep Apnea Detection Based on UWB Artificial Intelligence Mattress
Authors : Chiapin Wang, Jen-Hau Chan, Shih-Hau Fang, Ho-Ti Cheng, Yeh-Liang Hsu
Pub Date : 2019-03-18 DOI: 10.1109/AICAS.2019.8771598
Venue : 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS)
Pub Date : 2019-03-18 DOI: 10.1109/AICAS.2019.8771593
Ting-Wei Sun, A. Wu
There has been much previous work on speech emotion recognition using machine learning methods. However, most of it relies on the availability of labelled speech data. In this paper, we propose a novel algorithm that combines a sparse autoencoder with an attention mechanism. The aim is to use the autoencoder to benefit from both labelled and unlabelled data, and to apply the attention mechanism to focus on speech frames that carry strong emotional information while ignoring frames that carry no emotional content. The proposed algorithm is evaluated on three public databases in a cross-language setting. Experimental results show that the proposed algorithm provides significantly more accurate predictions than existing speech emotion recognition algorithms.
Title : Sparse Autoencoder with Attention Mechanism for Speech Emotion Recognition
Venue : 2019 IEEE International Conference on Artificial Intelligence Circuits and Systems (AICAS)
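The idea of attending to emotionally salient speech frames can be sketched as softmax-weighted pooling over per-frame feature vectors. The sketch below assumes that a relevance score per frame is already available (in a real model it would be learned); the function name `attention_pool` and the toy inputs are illustrative assumptions and do not reproduce the paper's network.

```python
import math
from typing import List, Sequence, Tuple


def attention_pool(
    frames: Sequence[Sequence[float]], scores: Sequence[float]
) -> Tuple[List[float], List[float]]:
    """Return (pooled feature vector, attention weights) for a sequence of frames.

    Weights are a softmax over the per-frame scores, so frames with low scores
    contribute little to the pooled utterance-level representation.
    """
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frames[0])
    pooled = [sum(w * f[i] for w, f in zip(weights, frames)) for i in range(dim)]
    return pooled, weights


# With equal scores the pooling reduces to a plain average of the frames.
demo_pooled, demo_weights = attention_pool([[1.0, 2.0], [3.0, 4.0]], [0.0, 0.0])
# A much higher score on the second frame makes it dominate the result.
peaked_pooled, peaked_weights = attention_pool([[1.0, 2.0], [3.0, 4.0]], [0.0, 20.0])
```

The pooled vector would then feed an emotion classifier, so frames that receive near-zero weight are effectively ignored.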