A Hybrid GPU + FPGA System Design for Autonomous Driving Cars
Cong Hao, Junli Gu, Deming Chen, A. Sarwari, Zhijie Jin, Husam Abu-Haimed, Daryl Sew, Yuhong Li, Xinheng Liu, Bryan Wu, Dongdong Fu
DOI: 10.1109/SiPS47522.2019.9020540 · Pub Date: 2019-10-01

Autonomous driving cars need highly complex hardware and software systems, which require high-performance computing platforms to enable a real-time AI-based perception and decision-making pipeline. The industry has been exploring various in-vehicle accelerators such as GPUs, ASICs, and FPGAs, yet autonomous driving platform design is far from mature once system reliability, redundancy, and higher levels of autonomy are taken into account. In this work, we propose a hybrid computing system design that integrates a GPU as the primary computing system and an FPGA as a secondary system. This hybrid architecture has multiple advantages: 1) The FPGA can run constantly as a complementary system with very short latency, helping to detect main-system failures and anomalous behavior and contributing to system functionality verification and reliability. 2) If the primary system fails (mostly from sensor or interconnection errors), the FPGA quickly detects the failure and runs a safe-mode task with a subset of sensors. 3) The FPGA can serve as an independent computing system that runs extra algorithm components to improve overall system autonomy. For example, the FPGA can handle driver-monitoring tasks while the GPU focuses on driving functions; together they can boost the driving function from L2 (which constantly requires the driver's attention) to L3 (which allows the driver's attention to be off the road for up to 10 seconds). This paper defines how such a system works, discusses various use cases and potential design challenges, and shares initial results and insights on how to make such a system deliver maximum value for autonomous driving.

Modified Complementary Joint Sparse Representations: A Novel Post-Filtering to MVDR Beamforming
Yuanyuan Zhu, Jiafei Fu, Xu Xu, Z. Ye
DOI: 10.1109/SiPS47522.2019.9020522 · Pub Date: 2019-10-01
Post-filtering is a popular technique in multichannel speech enhancement systems for further improving speech quality and intelligibility after beamforming. This paper presents a novel post-filter for minimum variance distortionless response (MVDR) beamforming: a single-channel modified complementary joint sparse representations (M-CJSR) method. First, an MVDR beamformer is used to suppress interference and noise. Subsequently, the proposed M-CJSR approach, based on joint dictionary learning, is applied as a single-microphone post-filter to process the beamformer output. Unlike existing post-filtering techniques, which rely on assumptions about the noise field, this algorithm considers a more generalized signal model that includes ambient noise, such as diffuse or white noise, as well as point-source interference. Moreover, the original CJSR method is extended to jointly learn dictionaries not only for the mappings from mixture to speech and to noise, but also for the mapping from mixture to interference. To exploit the complementary advantages of different sparse representations, we design the weighting parameters based on the residual components of the estimated signals. An experimental study consisting of objective evaluations under various conditions verifies the superiority of the proposed algorithm over other state-of-the-art methods.
{"title":"Modified Complementary Joint Sparse Representations: A Novel Post-Filtering to MVDR Beamforming","authors":"Yuanyuan Zhu, Jiafei Fu, Xu Xu, Z. Ye","doi":"10.1109/SiPS47522.2019.9020522","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020522","url":null,"abstract":"Post-filtering is a popular technique for multichannel speech enhancement system, in order to further improve the speech quality and intelligibility after beamforming. This paper presents a novel post-filtering to a minimum variance distortionless response (MVDR) beamforming which is a single-channel modified complementary joint sparse representations (M-CJSR) method. First, MVDR beamformer is used to suppress interference and noise. Subsequently, the proposed M-CJSR approach based on joint dictionary learning is applied as a single microphone post-filter to process the beamformer output. Different from the existing post-filtering techniques which rely on the assumptions about the noise field, this algorithm considers a more generalized signal model including the ambient noise, like diffuse noise or white noise, as well as the point-source interference. Moreover, the original CJSR method is extended to jointly learn dictionaries for not only the mappings from mixture to speech and noise, but also the mapping from mixture to interference. In order to take the complementary advantages of different sparse representations, we design the weighting parameters based on the residual components of the estimated signals. An experimental study which consists of objective evaluations under various conditions verifies the superiority of the proposed algorithm compared to other state-of-the-art methods.","PeriodicalId":256971,"journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128079066","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

SiPS 2019 Conference Committee
DOI: 10.1109/sips47522.2019.9020636 · Pub Date: 2019-10-01

An Efficient Polynomial Multiplier Architecture for the Bootstrapping Algorithm in a Fully Homomorphic Encryption Scheme
Weihang Tan, Gengran Hu, Benjamin Case, S. Gao, Yingjie Lao
DOI: 10.1109/SiPS47522.2019.9020592 · Pub Date: 2019-10-01
The bootstrapping algorithm, the intermediate refreshing procedure for a processed ciphertext, has been the performance bottleneck of various existing Fully Homomorphic Encryption (FHE) schemes. Specifically, the external product of polynomials is the most computationally expensive step of bootstrapping algorithms based on the Ring Learning With Errors (RLWE) problem. In this paper, we design a novel and scalable polynomial multiplier architecture for a bootstrapping algorithm, along with a conflict-free memory management scheme that reduces latency while achieving full utilization of the processing elements (PEs). Each PE is a modified radix-2 butterfly unit from the fast Fourier transform (FFT), which can be reconfigured for both the number theoretic transform (NTT) and the basic modular multiplication of polynomial multiplication in the external product step. Experimental results show that our design yields a 33% lower area-time product than prior designs.
{"title":"An Efficient Polynomial Multiplier Architecture for the Bootstrapping Algorithm in a Fully Homomorphic Encryption Scheme","authors":"Weihang Tan, Aengran Au, Benjamin Aase, S. Aao, Yingjie Lao","doi":"10.1109/SiPS47522.2019.9020592","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020592","url":null,"abstract":"Bootstrapping algorithm, which is the intermediate refreshing procedure of a processed ciphertext, has been the performance bottleneck among various existing Fully Homomorphic Encryption (FHE) schemes. Specifically, the external product of polynomials is the most computationally expensive step of bootstrapping algorithms that are based on the Ring Learning With Error (RLWE) problem. In this paper, we design a novel and scalable polynomial multiplier architecture for a bootstrapping algorithm along with a conflict-free memory management scheme to reduce the latency, while achieving a full utilization of the processing elements (PEs). Each PE is a modified radix-2 butterfly unit from fast Fourier transform (FFT), which can be reconfigured to use in both the number theoretic transform (NTT) and the basic modular multiplication of polynomial multiplication in the external product step. The experimental results show that our design yields 33% less area-time product than prior designs.","PeriodicalId":256971,"journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"72 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115884045","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Improving Reliability of ReRAM-Based DNN Implementation through Novel Weight Distribution
Jingtao Li, Manqing Mao, C. Chakrabarti
DOI: 10.1109/SiPS47522.2019.9020318 · Pub Date: 2019-10-01
Binary deep neural networks implemented in resistive random access memory (ReRAM) for storage efficiency suffer from poor recognition performance in the presence of hardware errors. This paper addresses the problem by deriving a novel weight distribution and representation scheme that mitigates errors due to faulty ReRAM cells with minimal storage overhead. In the proposed scheme, the weight matrix is partitioned into grains, and each weight in a grain is represented by the sum of a multi-bit mean and a 1-bit deviation. The grain size, as well as the mean-to-deviation ratio of the weights in a grain, can be chosen such that the network is resilient to hardware errors. A hybrid processing-in-memory (PIM) architecture is proposed to support this scheme: the mean values are stored in a small SRAM and processed by a CMOS unit, while the deviations are stored and processed by the ReRAM unit. Compared to the baseline binary neural network, which fails in the presence of severe hardware errors, the proposed hybrid scheme suffers only a mild degradation in recognition performance. Simulation results show the proposed scheme achieves 97.84% test accuracy (a 0.84% accuracy drop) on the MNIST dataset and 88.07% test accuracy (a 1.10% accuracy drop) on the CIFAR-10 dataset under 9.04% stuck-at-1 and 1.75% stuck-at-0 faults.
{"title":"Improving Reliability of ReRAM-Based DNN Implementation through Novel Weight Distribution","authors":"Jingtao Li, Manqing Mao, C. Chakrabarti","doi":"10.1109/SiPS47522.2019.9020318","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020318","url":null,"abstract":"Binary deep neural networks, that have been implemented in resistive random access memory (ReRAM) for storage efficiency, suffer from poor recognition performance in the presence of hardware errors. This paper addresses this problem by deriving a novel weight distribution and representation scheme that mitigates errors due to faulty ReRAM cells with minimal storage overhead. In the proposed scheme, the weight matrix is partitioned into grains, and each weight in a grain is represented by the sum of a multi-bit mean and a 1-bit deviation. The grain size as well as the mean to deviation ratio of the weights in a grain can be chosen such that the network is resilient to hardware errors. A hybrid processing-in-memory (PIM) architecture is proposed to support this scheme. The mean values are stored in a small SRAM and processed by a CMOS unit, and the deviations are stored and processed by the ReRAM unit. Compared to the baseline binary neural network which fails in the presence of severe hardware errors, the proposed hybrid scheme has only a mild recognition performance degradation. Simulation results show the proposed scheme achieves 97.84% test accuracy (a 0.84% accuracy drop) on a MNIST dataset, and 88.07% test accuracy (a 1.10% accuracy drop) on a CIFAR-10 dataset under 9.04% stuck-at-1 and 1.75% stuck-at-0 faults.","PeriodicalId":256971,"journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121669337","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

[SiPS 2019 Title Page]
DOI: 10.1109/sips47522.2019.9020313 · Pub Date: 2019-10-01

Pilot-Assisted Methods for Channel Estimation in MIMO-V-OFDM Systems
Wei Zhang, Xuyang Gao, Yibing Shi
DOI: 10.1109/SiPS47522.2019.9020482 · Pub Date: 2019-10-01
Multiple-input multiple-output (MIMO) combined with Orthogonal Frequency Division Multiplexing (OFDM) retains the advantages of both MIMO and OFDM, and Vector OFDM (V-OFDM) is an extension of OFDM that makes data transmission more flexible. For MIMO systems using V-OFDM, we propose different novel schemes to improve channel estimation performance under different degrees of channel sparsity. For non-sparse channels, a 2-D Kriging interpolation scheme is proposed, which significantly improves the performance of the conventional Least Squares (LS) and Minimum Mean Square Error (MMSE) algorithms. When the channel is sparse, the estimation process can be modeled as a sparse recovery problem using compressed sensing (CS) theory. In this case, the measurement matrix is determined by the pilot locations, and a pilot search algorithm based on a random genetic algorithm (RGA) is proposed to minimize the cross-correlation of the measurement matrix. In addition, a variable-threshold sparsity adaptive matching pursuit (VTSAMP) algorithm is designed to obtain more accurate estimates, achieving better Normalized Mean Square Error (NMSE) performance, higher calculation speed, and lower implementation complexity.
{"title":"Pilot-Assisted Methods for Channel Estimation in MIMO-V-OFDM Systems","authors":"Wei Zhang, Xuyang Gao, Yibing Shi","doi":"10.1109/SiPS47522.2019.9020482","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020482","url":null,"abstract":"Multiple-input multiple-output (MIMO) with Orthogonal Frequency Division Multiplexing (OFDM) technology has both the advantages of MIMO and OFDM. Vector Orthogonal Frequency Division Multiplexing (V-OFDM) is an extension of OFDM, which makes data transmission flexible. In MIMO systems using V-OFDM technology, different novel schemes are proposed to improve channel estimation performance for different channel sparsity. The 2-D Kriging interpolation scheme is proposed for the non-sparse channels, which can significantly improve the performance of conventional Least Square (LS) and Minimum Mean Square Error (MMSE) algorithms. When the channel is sparse, the estimation process can be modeled as a sparse recovery problem using compressed sensing (CS) theory. In this paper, the measurement matrix is determined by pilot locations, and a pilot search algorithm based on random genetic algorithm (RGA) is proposed to minimize the cross-correlation value of the measurement matrix. Besides, a variable threshold sparsity adaptive matching pursuit (VTSAMP) algorithm is designed to obtain more accurate estimates, which achieves better Normalized Mean Square Error (NMSE) performance, higher calculation speed, and lower implementation complexity.","PeriodicalId":256971,"journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130069349","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

A Survey of Computation-Driven Data Encoding
Weikang Qian, Runsheng Wang, Yuan Wang, Marc D. Riedel, Ru Huang
DOI: 10.1109/SiPS47522.2019.9020519 · Pub Date: 2019-10-01
Although the metal-oxide-semiconductor field-effect transistor (MOSFET) has been the dominant device in modern very-large-scale integration (VLSI) circuits for more than six decades, with the dawning of the post-Moore era researchers are seeking replacements. A foundation of modern digital computing is the encoding of digital values in a binary radix representation. However, as we enter the post-Moore era, the challenges of increasing power density, signal noise, and device unreliability raise the question of whether this basic way of encoding data is still the best choice, particularly with novel electronic devices. Prior work has shown that binary radix encoding has some disadvantages, and we argue that it is crucial to rethink the necessity of this representation in the post-Moore era. In this paper, we review recent developments in computation-driven data encoding. We begin with stochastic encoding, a representation proposed long ago, discussing both its advantages and its disadvantages. We then review several recent breakthroughs with variations of stochastic encoding that mitigate many of those disadvantages. Finally, we conclude by extrapolating future directions for effective computation-driven data encoding.
{"title":"A Survey of Computation-Driven Data Encoding","authors":"Weikang Qian, Runsheng Wang, Yuan Wang, Marc D. Riedel, Ru Huang","doi":"10.1109/SiPS47522.2019.9020519","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020519","url":null,"abstract":"Although the metal-oxide-semiconductor field-effect transistor (MOSFET) has been the dominant device for modern very-large scale integration (VLSI) circuits for more than six decades, with the dawning of a post-Moore era, researchers are trying to find replacements. A foundation of modern digital computing is the encoding of digital values through a binary radix representation. However, as we enter into the post-Moore era, the challenges of increasing power density, signal noise, and device unreliability raise the question of whether this basic way of encoding data is still the best choice, particularly with novel electronic devices. Prior work has shown that binary radix encoding has some disadvantages. We argue that it is crucial to rethink the necessity of using this representation in the post-Moore era. In this paper, we review some recent development on computation-driven data encoding. We begin with stochastic encoding, a representation proposed a long time ago, discussing both its advantages and disadvantages. Then, we review several recent breakthroughs with variations of stochastic encoding that mitigate many of its disadvantages. Finally, we conclude the paper by extrapolating future directions for effective computation-driven data encoding.","PeriodicalId":256971,"journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"2015 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127674322","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

Towards Algebraic Modeling of GPU Memory Access for Bank Conflict Mitigation
Luca Ferranti, J. Boutellier
DOI: 10.1109/SiPS47522.2019.9020385 · Pub Date: 2019-10-01
Graphics Processing Units (GPUs) are widely used in various fields of scientific computing, such as signal processing. GPUs have a hierarchical memory structure with memory layers that are shared between GPU processing elements. Partly due to this complex memory hierarchy, GPU programming is non-trivial, and several aspects must be taken into account, one being memory access patterns. One of the fastest GPU memory layers, shared memory, is grouped into banks to enable fast, parallel access by the processing elements. Unfortunately, multiple threads of a GPU program may access the same shared memory bank simultaneously, causing a bank conflict. When this happens, program execution slows down, as the conflicting memory accesses must be rescheduled and serialized. Bank conflicts are not handled automatically by the compiler, so the programmer must detect and deal with them before program execution. In this paper, we present an algebraic approach to detecting bank conflicts and prove theoretical results that can be used to predict when bank conflicts happen and how to avoid them. Our experimental results illustrate the resulting savings in computation time.
{"title":"Towards Algebraic Modeling of GPU Memory Access for Bank Conflict Mitigation","authors":"Luca Ferranti, J. Boutellier","doi":"10.1109/SiPS47522.2019.9020385","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020385","url":null,"abstract":"Graphics Processing Units (GPU) have been widely used in various fields of scientific computing, such as in signal processing. GPUs have a hierarchical memory structure with memory layers that are shared between GPU processing elements. Partly due to the complex memory hierarchy, GPU programming is non-trivial, and several aspects must be taken into account, one being memory access patterns. One of the fastest GPU memory layers, shared memory, is grouped into banks to enable fast, parallel access for processing elements. Unfortunately, it may happen that multiple threads of a GPU program may access the same shared memory bank simultaneously causing a bank conflict. If this happens, program execution slows down as memory accesses have to be rescheduled to determine which instruction to execute first. Bank conflicts are not taken into account automatically by the compiler, and hence the programmer must detect and deal with them prior to program execution. In this paper, we present an algebraic approach to detect bank conflicts and prove some theoretical results that can be used to predict when bank conflicts happen and how to avoid them. Also, our experimental results illustrate the savings in computation time.","PeriodicalId":256971,"journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129382316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

AdaBoost-assisted Extreme Learning Machine for Efficient Online Sequential Classification
Yi-Ta Chen, Yu-Chuan Chuang, A. Wu
DOI: 10.1109/SiPS47522.2019.9020609 · Pub Date: 2019-09-16
In this paper, we propose an AdaBoost-assisted extreme learning machine for efficient online sequential classification (AOS-ELM). To achieve better accuracy in online sequential learning scenarios, we employ the cost-sensitive AdaBoost algorithm, which diversifies the weak classifiers, and add a forgetting mechanism, which stabilizes performance during training. As a result, AOS-ELM adapts better to sequentially arriving data than other voting-based methods. Experimental results show that AOS-ELM achieves 94.41% accuracy on the MNIST dataset, matching the theoretical accuracy bound set by the original batch learning algorithm, AdaBoost-ELM. Moreover, with the forgetting mechanism, the standard deviation of accuracy during online sequential learning is reduced by a factor of 8.26.
{"title":"AdaBoost-assisted Extreme Learning Machine for Efficient Online Sequential Classification","authors":"Yi-Ta Chen, Yu-Chuan Chuang, A. Wu","doi":"10.1109/SiPS47522.2019.9020609","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020609","url":null,"abstract":"In this paper, we propose an AdaBoost-assisted extreme learning machine for efficient online sequential classification (AOS-ELM). In order to achieve better accuracy in online sequential learning scenarios, we utilize the cost-sensitive algorithm-AdaBoost, which diversifying the weak classifiers, and adding the forgetting mechanism, which stabilizing the performance during the training procedure. Hence, AOS-ELM adapts better to sequentially arrived data compared with other voting based methods. The experiment results show AOS-ELM can achieve 94.41% accuracy on MNIST dataset, which is the theoretical accuracy bound performed by original batch learning algorithm, AdaBoost-ELM. Moreover, with the forgetting mechanism, the standard deviation of accuracy during the online sequential learning process is reduced to 8.26x.","PeriodicalId":256971,"journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"67 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2019-09-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115714826","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}