Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020406
Xianxian Zhang, T. Kristjansson, Philip Hilmes
The Audio Front-End (AFE) is a key component in mitigating acoustic environmental challenges for far-field automatic speech recognition (ASR) on the Amazon Echo family of products. A critical component of the AFE is the Beam Selector, which identifies the beam that points to the target user. In this paper, we propose a new SIR beam selector that uses subband-based signal-to-interference ratios (SIRs) to learn the locations of the audio sources and thereby further improve beam selection accuracy for multi-microphone AFE systems. We analyze the performance of the SIR beam selector in comparison with a classic beam selector, using datasets collected under various conditions. The method is shown to simultaneously decrease the word error rate (WER) for speech recognition by up to 46.20% and improve barge-in performance, measured by false rejection rate (FRR), by up to 39.18%.
{"title":"SIR Beam Selector for Amazon Echo Devices Audio Front-End","authors":"Xianxian Zhang, T. Kristjansson, Philip Hilmes","doi":"10.1109/SiPS47522.2019.9020406","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020406","journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)"},"publicationDate":"2019-10-01"}
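As a rough illustration of choosing a beam from subband signal-to-interference ratios, the sketch below assumes that per-beam, per-subband signal and interference power estimates are already available. The averaging rule and the names `subband_sir_db` and `select_beam` are hypothetical and are not taken from the paper, which also learns source locations from the SIRs.

```python
import math

def subband_sir_db(signal_power, interference_power, eps=1e-12):
    """Per-subband SIR in dB for one beam (eps guards against division by zero)."""
    return [10.0 * math.log10((s + eps) / (i + eps))
            for s, i in zip(signal_power, interference_power)]

def select_beam(signal_powers, interference_powers):
    """Return the index of the beam with the highest mean subband SIR.

    signal_powers[b][k] and interference_powers[b][k] are assumed power
    estimates for beam b in subband k (an assumption of this sketch).
    """
    mean_sirs = []
    for s_bands, i_bands in zip(signal_powers, interference_powers):
        sirs = subband_sir_db(s_bands, i_bands)
        mean_sirs.append(sum(sirs) / len(sirs))
    return max(range(len(mean_sirs)), key=mean_sirs.__getitem__)

# Three beams, two subbands each; beam 1 has the strongest signal.
beam = select_beam([[1.0, 1.0], [4.0, 4.0], [2.0, 2.0]],
                   [[1.0, 1.0], [1.0, 1.0], [1.0, 1.0]])
```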
Stochastic computing (SC), which processes data in the form of random bit streams, has been used in neural networks because simple logic gates can perform complex arithmetic and because of its inherent high error tolerance. However, SC-based neural network accelerators suffer from high latency, random fluctuations, and the large hardware cost of pseudo-random number generators (PRNGs), which diminishes the advantages of stochastic computing. In this paper, we address these problems with a novel technique for generating bit streams in parallel, which needs only one clock cycle for conversion and significantly reduces the hardware cost. Based on this parallel bitstream generator, we further present two convolutional neural network (CNN) accelerator architectures, one using digital circuits and one using analog circuits, both showing great potential for low-power applications.
{"title":"Parallel Convolutional Neural Network (CNN) Accelerators Based on Stochastic Computing","authors":"Yawen Zhang, Xinyue Zhang, Jiahao Song, Yuan Wang, Ru Huang, Runsheng Wang","doi":"10.1109/SiPS47522.2019.9020615","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020615","journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)"},"publicationDate":"2019-10-01"}
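To make the phrase "simple logic gates performing complex arithmetic" concrete, the sketch below shows conventional serial stochastic computing with unipolar encoding: a value p in [0, 1] becomes a bit stream whose bits are 1 with probability p, and a bitwise AND of two independent streams estimates the product. This is the textbook scheme, not the parallel bitstream generator proposed in the paper; the function names and the stream length are assumptions.

```python
import random

def to_bitstream(p, length, rng):
    """Encode p in [0, 1] as a random bit stream with P(bit = 1) = p."""
    return [1 if rng.random() < p else 0 for _ in range(length)]

def sc_multiply(a, b, length=50_000, seed=0):
    """Estimate a * b by ANDing two independent unipolar bit streams."""
    rng = random.Random(seed)
    stream_a = to_bitstream(a, length, rng)
    stream_b = to_bitstream(b, length, rng)
    product = [x & y for x, y in zip(stream_a, stream_b)]  # one AND gate per bit
    return sum(product) / length  # fraction of ones approximates a * b

estimate = sc_multiply(0.5, 0.6)  # close to 0.3, up to random fluctuation
```

The estimate only converges as the stream grows, which is the latency and fluctuation problem the abstract refers to.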
Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020384
Jing Tian, Jun Lin, Zhongfeng Wang
The supersingular isogeny key encapsulation (SIKE) protocol offers promising public and secret key sizes compared with other post-quantum candidates. However, its heavy computation forms a bottleneck and limits its practical application. Modular multiplication, one of the most computationally demanding operations in the underlying arithmetic, accounts for a large part of the computation in the protocol. In this paper, we propose an improved unconventional-radix finite-field multiplication (IFFM) algorithm that reduces the computational complexity by about 20% compared to previous algorithms. We then devise a new high-speed modular multiplier architecture based on the IFFM. Because of its completely feedforward scheme, the proposed architecture can be extensively pipelined to achieve a very high clock speed, a significant advantage over conventional designs. FPGA implementation results show that the proposed multiplier achieves about 67 times higher throughput than state-of-the-art designs and more than 12 times better area efficiency than previous works. We therefore expect these results to contribute greatly to the practicality of the protocol.
{"title":"Ultra-Fast Modular Multiplication Implementation for Isogeny-Based Post-Quantum Cryptography","authors":"Jing Tian, Jun Lin, Zhongfeng Wang","doi":"10.1109/SiPS47522.2019.9020384","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020384","journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)"},"publicationDate":"2019-10-01"}
Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020613
Yonghong Bai, Zhiyuan Yan
In key generation methods based on physical unclonable functions (PUFs), bias in the PUF outputs leaks secret information. This paper proposes a secure and robust key generation method based on PUFs and polar codes. First, the PUF-based key generation process is modeled as a wiretap channel. Then, two secure polar coding schemes are designed for the wiretap channel to improve the robustness of key generation and to reduce the secrecy leakage caused by the bias of PUF outputs. To construct the secure polar coding schemes, density evolution is used to evaluate the error probability of the synthesized channels, which in turn is used to approximate both the error probability and the secrecy leakage of the system. To reduce the complexity of polar code construction, a channel-independent polar construction method assists density evolution in selecting the parameters of the secure polar coding schemes. Finally, we compare our key generation design with other works and find that it requires fewer PUF bits to generate a key of the same length with failure probability $\le 10^{-6}$.
{"title":"A Secure and Robust Key Generation Method Using Physical Unclonable Functions and Polar Codes","authors":"Yonghong Bai, Zhiyuan Yan","doi":"10.1109/SiPS47522.2019.9020613","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020613","journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)"},"publicationDate":"2019-10-01"}
Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020317
Wonyong Sung, Lukas Lee, Jinhwan Park
Real-time speech recognition on mobile and embedded devices is an important application of neural networks. Acoustic modeling is the fundamental part of speech recognition and is usually implemented with long short-term memory (LSTM)-based recurrent neural networks (RNNs). However, single-thread execution of an LSTM RNN is extremely slow on most embedded devices because the algorithm needs to fetch a large number of parameters from DRAM to compute each output sample. We explore several acoustic modeling algorithms that can be executed very efficiently on embedded devices. These algorithms reduce the overhead of memory accesses through multi-timestep parallelization, which computes multiple output samples at a time while reading the parameters from DRAM only once. The algorithms considered are quasi-RNNs (QRNNs), Gated ConvNets, and diagonalized LSTMs. In addition, we explore networks that add a one-dimensional (1-D) convolution at each layer of these algorithms, which yields a very large performance increase for QRNNs and Gated ConvNets. The experiments were conducted using connectionist temporal classification (CTC)-based end-to-end speech recognition on the WSJ corpus. Compared to LSTM RNN-based modeling, we not only significantly increase the execution speed but also obtain much higher accuracy. This work is therefore applicable not only to embedded implementations but also to server-based ones.
{"title":"Exploration of On-device End-to-End Acoustic Modeling with Neural Networks","authors":"Wonyong Sung, Lukas Lee, Jinhwan Park","doi":"10.1109/SiPS47522.2019.9020317","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020317","journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)"},"publicationDate":"2019-10-01"}
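The multi-timestep parallelization mentioned in the abstract can be illustrated with the f-pooling recurrence of a QRNN layer. In a QRNN the candidate values z_t and forget gates f_t depend only on the inputs, so they can be computed for many timesteps in one pass over the weights; only the cheap elementwise recurrence below remains sequential. This is a minimal sketch: the gate values are assumed to be precomputed, and the function name is hypothetical.

```python
def f_pooling(z, f, h0=None):
    """QRNN f-pooling: h_t = f_t * h_{t-1} + (1 - f_t) * z_t, elementwise.

    z and f are lists of T vectors (lists of floats); each f_t lies in (0, 1).
    Both are assumed to be precomputed for all T timesteps in one batch.
    """
    h = list(h0) if h0 is not None else [0.0] * len(z[0])
    outputs = []
    for z_t, f_t in zip(z, f):
        # No weight matrix is touched here, so no DRAM parameter fetch per step.
        h = [ft * hp + (1.0 - ft) * zt for ft, hp, zt in zip(f_t, h, z_t)]
        outputs.append(h)
    return outputs

states = f_pooling(z=[[1.0], [1.0]], f=[[0.5], [0.5]])
# states == [[0.5], [0.75]]
```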
Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020323
Yuanyong Luo, H. Pan, Q. Shen, Zhongfeng Wang
In the VLSI design domain, the Clustered Look-Ahead (CLA) technique is a promising method for further pipelining or accelerating IIR digital filters in the coming era of 5G networks for mobile devices. However, considerable effort is needed to obtain a stable CLA pipelined architecture. This paper therefore proposes a CLA Formula to aid fast architecture design for CLA pipelined IIR digital filters. In comparison experiments on obtaining stable architectures with 6 to 96 pipeline stages, the proposed CLA Formula aided method saves designers more than half of the software coding time and reduces program execution time by a factor of almost 168 to 30243 relative to the symbolic method with substitution.
{"title":"CLA Formula Aided Fast Architecture Design for Clustered Look-Ahead Pipelined IIR Digital Filter","authors":"Yuanyong Luo, H. Pan, Q. Shen, Zhongfeng Wang","doi":"10.1109/SiPS47522.2019.9020323","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020323","journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)"},"publicationDate":"2019-10-01"}
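For readers unfamiliar with clustered look-ahead, the sketch below shows the textbook transformation rather than the CLA Formula proposed in the paper. The denominator D(z) of the filter is multiplied by an augmenting polynomial P(z) of degree M-1, chosen as the first M terms of the impulse response of 1/D(z), so that the coefficients of z^-1 through z^-(M-1) in the new denominator vanish and M pipeline stages fit in the feedback loop. The function name is hypothetical, and the sketch does not check stability, which is the hard part the abstract refers to.

```python
def cla_augment(den, m):
    """Clustered look-ahead augmentation for an IIR denominator.

    den = [1, d1, ..., dN] holds the coefficients of D(z) in powers of z^-1
    (den[0] is assumed to be 1). Returns (p, new_den), where p is the
    augmenting polynomial and new_den = D(z) * P(z).
    """
    n = len(den) - 1
    p = [1.0]
    for k in range(1, m):
        # Truncated impulse response of 1/D(z): r_k = -sum_i d_i * r_{k-i}
        p.append(-sum(den[i] * p[k - i] for i in range(1, min(k, n) + 1)))
    new_den = [0.0] * (len(den) + len(p) - 1)
    for i, d in enumerate(den):          # polynomial product D(z) * P(z)
        for j, r in enumerate(p):
            new_den[i + j] += d * r
    return p, new_den

# First-order example: D(z) = 1 - 0.5 z^-1 with M = 3 loop stages.
p, new_den = cla_augment([1.0, -0.5], 3)
# p == [1.0, 0.5, 0.25] and new_den == [1.0, 0.0, 0.0, -0.125]
```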
Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020408
Zhengxia Ji, Mengyun Nie, Lingquan Meng, Qingran Wang, Chunguo Li, Kang Song
The increasing use of information sources in unreliable wireless communication is a driving force for studying the energy efficiency and security of networks. To improve overall system performance, in this paper we combine these two directions and investigate the secrecy energy efficiency (SEE) of a network subject to eavesdropping that consists of an energy source, an information source, a destination, and an eavesdropper node, each equipped with a single antenna. The system model is based on the save-then-transmit (ST) protocol. The information source node harvests energy from the received signal power to charge its battery, which is then used to retransmit the received signal to the destination. Under a limited transmit power mode, we derive an expression for SEE, which depends on the energy absorption rate and time. Our analytical results reveal that the secrecy energy efficiency has a maximum. The optimal energy absorption rate is then computed with a Newton iterative algorithm, and we propose an optimal energy source selection method. Finally, simulation results verify the accuracy and efficiency of the proposed algorithm for secrecy energy efficiency maximization.
{"title":"On Secrecy Energy Efficiency of RF Energy Harvesting System","authors":"Zhengxia Ji, Mengyun Nie, Lingquan Meng, Qingran Wang, Chunguo Li, Kang Song","doi":"10.1109/SiPS47522.2019.9020408","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020408","journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)"},"publicationDate":"2019-10-01"}
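The abstract does not give the SEE expression, so the sketch below only illustrates the general idea of locating the maximum of an efficiency-like function with Newton iteration on its derivative. The objective log(1 + x) / (1 + x) is a hypothetical stand-in with a rate-over-power shape, and all function names are assumptions.

```python
import math

def newton_maximize(df, d2f, x0, tol=1e-10, max_iter=50):
    """Find a stationary point of f by Newton iteration on f'(x) = 0."""
    x = x0
    for _ in range(max_iter):
        step = df(x) / d2f(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# Stand-in objective f(x) = log(1 + x) / (1 + x), maximized at x = e - 1.
def df(x):
    return (1.0 - math.log(1.0 + x)) / (1.0 + x) ** 2

def d2f(x):
    return (2.0 * math.log(1.0 + x) - 3.0) / (1.0 + x) ** 3

x_opt = newton_maximize(df, d2f, x0=1.0)  # converges near e - 1
```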
Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020322
Zihui Zhu, Hengrui Gu, Zhengming Zhang, Yongming Huang, Luxi Yang
Semantic segmentation of retinal vessel images is of great value for clinical diagnosis. Because retinal vessel features carry complex information, existing algorithms suffer from problems such as discontinuities in the segmented vessels. To achieve better semantic segmentation results, we propose an encoder-decoder structure combined with dense convolution and depth separable convolution. First, the images are enhanced by extracting the original green channel and applying contrast-limited histogram equalization and sharpening, and data augmentation is then performed to expand the data set. Second, the proposed network is trained on the processed images using a weighted loss function. Finally, the test images are segmented by the trained model. The proposed algorithm is tested on the DRIVE data set, where its average accuracy, sensitivity, and specificity reach 96.83%, 83.71%, and 98.95%, respectively.
{"title":"Semantic Segmentation of Retinal Vessel Images via Dense Convolution and Depth Separable Convolution","authors":"Zihui Zhu, Hengrui Gu, Zhengming Zhang, Yongming Huang, Luxi Yang","doi":"10.1109/SiPS47522.2019.9020322","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020322","journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)"},"publicationDate":"2019-10-01"}
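The three reported figures follow the standard definitions for binary segmentation: accuracy = (TP + TN) / all pixels, sensitivity = TP / (TP + FN), and specificity = TN / (TN + FP). The sketch below computes them from flattened binary masks; the function name and the toy masks are illustrative assumptions, not data from the paper.

```python
def segmentation_metrics(pred, truth):
    """Accuracy, sensitivity, and specificity for flattened binary masks (1 = vessel)."""
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # fraction of vessel pixels found
    specificity = tn / (tn + fp)   # fraction of background pixels kept
    return accuracy, sensitivity, specificity

# Toy example with six pixels: 2 TP, 2 TN, 1 FP, 1 FN.
acc, sens, spec = segmentation_metrics([1, 1, 0, 0, 1, 0], [1, 0, 0, 0, 1, 1])
```

Because vessel pixels are a small minority of a retinal image, accuracy and specificity tend to be high even when thin vessels are missed, which is why sensitivity is reported separately.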
Low-density parity-check (LDPC) codes are used to correct errors that occur during transmission and offer excellent performance. The performance of existing Min-Sum decoders for LDPC codes relies heavily on accurate channel estimation. This paper presents a two-dimensional blind channel decoding algorithm that does not require precise channel estimation. The algorithm converts the original one-dimensional signal into a two-dimensional LDPC signal according to a template. Dictionary learning is introduced for pre-filtering, and deep learning is adopted for further denoising and decoding. The two-dimensional blind decoding algorithm significantly outperforms the traditional belief propagation (BP) decoding algorithm when the channel noise is unknown. Moreover, the combination of dictionary learning and deep learning yields a large improvement in performance and a reduction in data size.
{"title":"A Channel-Blind Decoding for LDPC Based on Deep Learning and Dictionary Learning","authors":"Xu Pang, Chao Yang, Zaichen Zhang, X. You, Chuan Zhang","doi":"10.1109/SiPS47522.2019.9020628","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020628","journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)"},"publicationDate":"2019-10-01"}
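To show where channel knowledge enters a conventional decoder, the sketch below gives two standard building blocks, not the paper's blind algorithm: the channel log-likelihood ratio for BPSK over an AWGN channel, which needs the noise variance, and the Min-Sum check-node update. The mapping of bit 0 to +1 and the function names are assumptions of this sketch.

```python
def channel_llr(y, noise_var):
    """LLR of a BPSK symbol (bit 0 -> +1, bit 1 -> -1) received over AWGN.

    The 1 / noise_var factor is where an inaccurate channel estimate hurts.
    """
    return 2.0 * y / noise_var

def min_sum_check_update(llrs):
    """Min-Sum check-node update: one outgoing message per incoming edge.

    Each output is the product of the signs of the other incoming messages
    times the smallest magnitude among them.
    """
    out = []
    for i in range(len(llrs)):
        others = llrs[:i] + llrs[i + 1:]
        sign = 1.0
        for v in others:
            if v < 0:
                sign = -sign
        out.append(sign * min(abs(v) for v in others))
    return out

messages = min_sum_check_update([2.0, -1.0, 3.0])
# messages == [-1.0, 2.0, -1.0]
```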
Pub Date : 2019-10-01 DOI: 10.1109/SiPS47522.2019.9020365
Siyu Liao, Chunhua Deng, Lingjia Liu, Bo Yuan
Neural networks have been applied to the MIMO detection problem and have achieved state-of-the-art performance. However, it is hard to deploy these large, deep neural network models on resource-constrained platforms. In this paper, we impose a circulant structure on the neural network to obtain a low-complexity model for MIMO detection. The method can train the circulant-structured network from scratch or convert an existing dense neural network model. Experiments show that the approach can halve the model size with a negligible performance drop.
{"title":"Structured Neural Network with Low Complexity for MIMO Detection","authors":"Siyu Liao, Chunhua Deng, Lingjia Liu, Bo Yuan","doi":"10.1109/SiPS47522.2019.9020365","DOIUrl":"https://doi.org/10.1109/SiPS47522.2019.9020365","journal":{"name":"2019 IEEE International Workshop on Signal Processing Systems (SiPS)"},"publicationDate":"2019-10-01"}
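A circulant weight matrix is fully determined by one vector, so an n-by-n block stores n parameters instead of n squared. The sketch below shows a plain matrix-vector product with such a block, using the convention that `c` is the first column, so C[i][j] = c[(i - j) mod n]. How the paper partitions its layers into circulant blocks is not stated in the abstract; the function name and block layout here are assumptions.

```python
def circulant_matvec(c, x):
    """Multiply the circulant matrix with first column c by the vector x.

    Only the n entries of c are stored; the full n x n matrix is never built.
    """
    n = len(c)
    return [sum(c[(i - j) % n] * x[j] for j in range(n)) for i in range(n)]

c = [1.0, 2.0, 3.0]
col0 = circulant_matvec(c, [1.0, 0.0, 0.0])  # first column:  [1.0, 2.0, 3.0]
col1 = circulant_matvec(c, [0.0, 1.0, 0.0])  # second column: [3.0, 1.0, 2.0]
```

The same product is a circular convolution, so it can also be computed with FFTs in O(n log n) time when n is large.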