Pub Date: 2017-10-01 | DOI: 10.1109/SiPS.2017.8110023
Georgios Georgakarakos, Sudeep Kanur, J. Lilius, K. Desnos
Dataflow models of computation were acknowledged early on as an attractive methodology for describing parallel algorithms, and they have therefore become highly relevant for programming in the current multicore processor era. While several frameworks provide tools to create dataflow descriptions of algorithms, generating parallel code for programmable processors is still sub-optimal due to scheduling overheads and the semantic gap that arises when expressing parallelism with conventional programming languages featuring threads. In this paper we propose an optimization of the parallel code generation process that combines the dataflow and task programming models. We develop a task-based code generator for PREESM, a dataflow-based prototyping framework, in order to deploy algorithms described as synchronous dataflow graphs on multicore platforms. An experimental performance comparison of our task-based generated code against typical thread-based code shows that our approach removes significant scheduling and synchronization overheads while maintaining similar, and occasionally improved, application throughput.
{"title":"Task-based execution of synchronous dataflow graphs for scalable multicore computing","authors":"Georgios Georgakarakos, Sudeep Kanur, J. Lilius, K. Desnos","doi":"10.1109/SiPS.2017.8110023","DOIUrl":"https://doi.org/10.1109/SiPS.2017.8110023","url":null,"abstract":"Dataflow models of computation have early on been acknowledged as an attractive methodology to describe parallel algorithms, hence they have become highly relevant for programming in the current multicore processor era. While several frameworks provide tools to create dataflow descriptions of algorithms, generating parallel code for programmable processors is still sub-optimal due to the scheduling overheads and the semantics gap when expressing parallelism with conventional programming languages featuring threads. In this paper we propose an optimization of the parallel code generation process by combining dataflow and task programming models. We develop a task-based code generator for PREESM, a dataflow-based prototyping framework, in order to deploy algorithms described as synchronous dataflow graphs on multicore platforms. Experimental performance comparison of our task generated code against typical thread-based code shows that our approach removes significant scheduling and synchronization overheads while maintaining similar (and occasionally improving) application throughput.","PeriodicalId":251688,"journal":{"name":"2017 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125538206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2017-10-01 | DOI: 10.1109/SiPS.2017.8109980
Benjamin Barrois, O. Sentieys
This paper presents a comparison between custom fixed-point (FxP) and floating-point (FlP) arithmetic, applied to the bidimensional K-means clustering algorithm. After a discussion of the K-means clustering algorithm and of the characteristics of both arithmetics, hardware implementations of FxP and FlP operators are compared in terms of area, delay and energy for different bit widths, using the ApxPerf2.0 framework. Finally, both are compared in the context of K-means clustering. The direct comparison shows a large difference between 8-to-16-bit FxP and FlP operators, with FlP adders consuming 5–12× more energy than FxP adders, and multipliers 2–10× more. However, when applied to the K-means clustering algorithm, the gap between FxP and FlP tightens. Indeed, the accuracy improvements brought by FlP make the computation more accurate and lead to an accuracy equivalent to FxP with fewer iterations of the algorithm, proportionally reducing the overall energy spent. The 8-bit version of the algorithm becomes more profitable with FlP, which is 80% more accurate with only 1.6× more energy. The paper finally discusses the case for custom FlP in low-energy general-purpose computation, thanks to its ease of use and an energy overhead lower than might have been expected.
{"title":"Customizing fixed-point and floating-point arithmetic — A case study in K-means clustering","authors":"Benjamin Barrois, O. Sentieys","doi":"10.1109/SiPS.2017.8109980","DOIUrl":"https://doi.org/10.1109/SiPS.2017.8109980","url":null,"abstract":"This paper presents a comparison between custom fixed-point (FxP) and floating-point (FlP) arithmetic, applied to bidimensional K-means clustering algorithm. After a discussion on the K-means clustering algorithm and arithmetic characteristics, hardware implementations of FxP and FlP arithmetic operators are compared in terms of area, delay and energy, for different bitwidth, using the ApxPerf2.0 framework. Finally, both are compared in the context of K-means clustering. The direct comparison shows the large difference between 8-to-16-bit FxP and FlP operators, FlP adders consuming 5–12 χ more energy than FxP adders, and multipliers 2–10χ more. However, when applied to K-means clustering algorithm, the gap between FxP and FlP tightens. Indeed, the accuracy improvements brought by FlP make the computation more accurate and lead to an accuracy equivalent to FxP with less iterations of the algorithm, proportionally reducing the global energy spent. The 8-bit version of the algorithm becomes more profitable using FlP, which is 80% more accurate with only 1.6 χ more energy. This paper finally discusses the stake of custom FlP for low-energy general-purpose computation, thanks to its ease of use, supported by an energy overhead lower than what could have been expected.","PeriodicalId":251688,"journal":{"name":"2017 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121527467","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2017-10-01 | DOI: 10.1109/SiPS.2017.8110008
Mikko Parviainen, Pasi Pertilä
This article presents a method for obtaining personalized Head-Related Transfer Functions (HRTFs) for creating virtual soundscapes based on a small number of measurements. The best-matching set of HRTFs is selected among the entries of publicly available databases. The proposed method is evaluated using a listening test in which subjects assess audio samples created with the best-matching set of HRTFs against a randomly chosen set of HRTFs from the same location. The listening test indicates that subjects prefer the proposed method over a random set of HRTFs.
{"title":"Obtaining an optimal set of head-related transfer functions with a small amount of measurements","authors":"Mikko Parviainen, Pasi Pertilä","doi":"10.1109/SiPS.2017.8110008","DOIUrl":"https://doi.org/10.1109/SiPS.2017.8110008","url":null,"abstract":"This article presents a method to obtain personalized Head-Related Transfer Functions (HRTFs) for creating virtual soundscapes based on small amount of measurements. The best matching set of HRTFs are selected among the entries from publicly available databases. The proposed method is evaluated using a listening test where subjects assess the audio samples created using the best matching set of HRTFs against a randomly chosen set of HRTFs from the same location. The listening test indicates that subjects prefer the proposed method over random set of HRTFs.","PeriodicalId":251688,"journal":{"name":"2017 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132613771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2017-10-01 | DOI: 10.1109/SiPS.2017.8109979
Zhiwei Zhong, Lulu Ge, Ziyuan Shen, X. You, Chuan Zhang
With the aid of a storage-release mechanism named key-keysmith, an implementation approach based on chemical reaction networks (CRNs) for synchronous sequential logic is proposed. This design approach, which stores logic information in the keysmith and releases it through the key, focuses primarily on the underlying state transitions behind the required logic rather than on its electronic circuit representation. Therefore, it can be uniformly and easily employed to implement any synchronous sequential logic with molecular reactions. Theoretical analysis and numerical simulations demonstrate the robustness and universality of the proposed approach.
Title: CRN-based design methodology for synchronous sequential logic (2017 IEEE International Workshop on Signal Processing Systems (SiPS))
Pub Date: 2017-10-01 | DOI: 10.1109/SiPS.2017.8110011
Yu Gong, Tingting Xu, Bo Liu, Wei-qi Ge, Jinjiang Yang, Jun Yang, Longxing Shi
With the rapidly increasing number of deep learning applications, LSTM-RNNs are widely used. At the same time, their complex data dependences and intensive computation limit the performance of accelerators. In this paper, we first propose a hybrid network expansion model to exploit fine-grained data parallelism. Based on this model, we implement a Reconfigurable Processing Unit (RPU) using Processing-In-Memory (PIM) units. Our work shows that the gates and cells of an LSTM can be partitioned into fundamental operations and then recombined and mapped onto heterogeneous computing components. The experimental results show that, implemented in a 45 nm CMOS process, the proposed RPU with an area of 1.51 mm² and a power of 413 mW achieves 309 GOPS/W in power efficiency, which is 1.7× better than the state-of-the-art reconfigurable architecture.
{"title":"Processing LSTM in memory using hybrid network expansion model","authors":"Yu Gong, Tingting Xu, Bo Liu, Wei-qi Ge, Jinjiang Yang, Jun Yang, Longxing Shi","doi":"10.1109/SiPS.2017.8110011","DOIUrl":"https://doi.org/10.1109/SiPS.2017.8110011","url":null,"abstract":"With the rapidly increasing applications of deep learning, LSTM-RNNs are widely used. Meanwhile, the complex data dependence and intensive computation limit the performance of the accelerators. In this paper, we first proposed a hybrid network expansion model to exploit the finegrained data parallelism. Based on the model, we implemented a Reconfigurable Processing Unit(RPU) using Processing In Memory(PIM) units. Our work shows that the gates and cells in LSTM can be partitioned to fundamental operations and then recombined and mapped into heterogeneous computing components. The experimental results show that, implemented on 45nm CMOS process, the proposed RPU with size of 1.51 mm2 and power of 413 mw achieves 309 GOPS/W in power efficiency, and is 1.7 χ better than state-of-the-art reconfigurable architecture.","PeriodicalId":251688,"journal":{"name":"2017 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"108 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133414533","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2017-10-01 | DOI: 10.1109/SiPS.2017.8110022
B. Gal, Camille Leroux, C. Jégo
Polar codes are a family of error-correcting codes that achieve the symmetric capacity of memoryless channels as the code length N tends to infinity. However, moderate code lengths are required in most wireless digital applications to limit the decoding latency. In some other applications, such as optical communications or quantum key distribution, the latency introduced by very long codes is not an issue. The main challenge is then to design codes with the best error-correction capability, a tractable complexity and a high throughput. In such a context, SC decoding is an interesting solution because its performance improves with N while its computational complexity scales almost linearly. In this paper, we propose to improve the scalability of SC decoders through four architectural optimizations. The resulting SC decoder is implemented on an FPGA device and compares favorably with state-of-the-art scalable SC decoders. Moreover, a 2^22 polar code SC decoder is implemented on a Stratix-5 FPGA. This code length is twice as large as the ones achieved in previous works. To the best of our knowledge, this is the first architecture with which an N = 4 million bit polar code can actually be decoded on a reconfigurable circuit.
{"title":"Successive cancellation decoder for very long polar codes","authors":"B. Gal, Camille Leroux, C. Jégo","doi":"10.1109/SiPS.2017.8110022","DOIUrl":"https://doi.org/10.1109/SiPS.2017.8110022","url":null,"abstract":"Polar codes are a family of error correcting codes that achieves the symmetric capacity of memoryless channels when the code length N tends to infinity. However, moderate code lengths are required in most of wireless digital applications to limit the decoding latency. In some other applications, such as optical communications or quantum key distribution, the latency introduced by very long codes is not an issue. The main challenge is to design codes with the best error correction capability, a tractable complexity and a high throughput. In such a context, SC decoding is an interesting solution because its performance improves with N while the computational complexity scales almost linearly. In this paper, we propose to improve the scalability of SC decoders thanks to four architectural optimizations. The resulting SC decoder is implemented on an FPGA device and favorably compares with state-of-the-art scalable SC decoders. Moreover, a 222 polar code SC decoder is implemented on a Stratix-5 FPGA. This code length is twice larger than the ones achieved in previous works. To the best of our knowledge, this is the first architecture for which a N = 4 million bits polar code can be actually decoded on a reconfigurable circuit.","PeriodicalId":251688,"journal":{"name":"2017 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128100582","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2017-10-01 | DOI: 10.1109/SiPS.2017.8110020
A. Jallouli, Fatma Belghith, M. A. B. Ayed, W. Hamidouche, J. Nezan, N. Masmoudi
Post-HEVC refers to the emerging video coding standard beyond the High Efficiency Video Coding (HEVC) standard. It is more complex in its transformation and prediction steps, but it offers the possibility of coding and compressing 3D and 360° videos. This paper presents several statistical analyses of Post-HEVC encoded videos, in particular analyses of the 1D and 2D transformation types and of the intra and inter prediction types for test videos of different classes and resolutions. The analyses are carried out at the decoder level, where the coding decisions have already been taken by the encoder. Results show that the choice of transformation (type and size) and of prediction type (intra or inter) depends on the nature of the video: motion and texture. This work can be considered a milestone towards proposing intelligent algorithms based on video characteristics to perform fast decisions in the Post-HEVC encoding process.
{"title":"Statistical analysis of Post-HEVC encoded videos","authors":"A. Jallouli, Fatma Belghith, M. A. B. Ayed, W. Hamidouche, J. Nezan, N. Masmoudi","doi":"10.1109/SiPS.2017.8110020","DOIUrl":"https://doi.org/10.1109/SiPS.2017.8110020","url":null,"abstract":"The Post-HEVC is the emerging video coding standard beyond the High Efficiency Video Coding (HEVC) standard. It is more complex in transformation and prediction steps but it offers the opportunity of 3D and 360° videos coding and compression. This paper presents different statistical analyzes of Post-HEVC encoded videos especially analysis on 1D and 2D transformation types and analysis on intra and inter prediction types of some test videos for different classes and resolutions. Analyzes are carried out at the decoder level where the coding decision has already been taken by the encoder. Results show that the choice of transformation (type and size) and the prediction type (intra or inter) depends on the nature of video: motion and texture. This work can be considered as a milestone for proposing intelligent algorithms based on video characteristics to perform fast decision in the Post-HEVC encoding process.","PeriodicalId":251688,"journal":{"name":"2017 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"22 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114180474","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2017-10-01 | DOI: 10.1109/SiPS.2017.8110000
Swati Bhardwaj, Shashank Raghuraman, A. Acharyya
This paper proposes a low-complexity algorithmic modification for hardware acceleration of the n-dimensional (nD) FastICA methodology, based on the Coordinate Rotation Digital Computer (CORDIC), to attain high computation speed. The most complex and time-consuming update stage and the convergence check required for computing the nth weight vector are eliminated in the proposed methodology. Using the Gram-Schmidt orthogonalization stage and the normalization stage to calculate the nth weight vector in an entirely sequential CORDIC-based FastICA procedure results in a significant gain in computation time. The proposed methodology has been functionally verified and validated by applying it to the separation of 6D speech signals. It has been implemented in hardware using Verilog HDL and synthesized using UMC 180 nm technology. The average improvement in computation time obtained with the proposed methodology for 4D to 6D FastICA with 1024 samples, considering the minimum case of two iterations for the nth stage, was found to be 98.79%.
{"title":"Low complexity hardware accelerator for nD FastICA based on coordinate rotation","authors":"Swati Bhardwaj, Shashank Raghuraman, A. Acharyya","doi":"10.1109/SiPS.2017.8110000","DOIUrl":"https://doi.org/10.1109/SiPS.2017.8110000","url":null,"abstract":"This paper proposes a low complex hardware accelerator algorithmic modification for n-dimensional (nD) FastICA methodology based on Coordinate Rotation Digital Computer (CORDIC) to attain high computation speed. The most complex and time consuming update stage and convergence check required for computation of the nth weight vector are eliminated in the proposed methodology. Using the Gram-Schmidt Orthogonalization stage and normalization stage to calculate nth weight vector in an entirely sequential procedure of CORDIC-based FastICA results in a significant gain in terms of the computation time. The proposed methodology has been functionally verified and validated by applying it for separating 6D speech signals. It has been implemented on hardware using Verilog HDL and synthesized using UMC 180nm technology. The average improvement in computation time obtained by using the proposed methodology for 4D to 6D FastICA with 1024 samples, considering the minimum case of two iterations for nth stage, was found to be 98.79 %.","PeriodicalId":251688,"journal":{"name":"2017 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124309218","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2017-10-01 | DOI: 10.1109/SiPS.2017.8109993
Yuri Nishizumi, Go Matsukawa, K. Kajihara, T. Kodama, S. Izumi, H. Kawaguchi, C. Nakanishi, Toshio Goto, Takeo Kato, M. Yoshimoto
This paper describes an FPGA implementation of an object recognition processor for HDTV-resolution 30 fps video using the Sparse FIND feature. Two-stage feature extraction with HOG and Sparse FIND, highly parallel classification with a support vector machine (SVM), and block-parallel processing for RAM access cycle reduction are proposed to perform real-time object recognition with enormous computational complexity. From the implementation of the proposed architecture on the FPGA, it was confirmed that detection using the Sparse FIND feature runs on HDTV images at 47.63 fps on average at 90 MHz. The recognition accuracy degradation with respect to the original Sparse FIND-based object detection algorithm implemented in software is 0.5%, which shows that the FPGA system provides sufficient accuracy for practical use.
{"title":"FPGA implementation of object recognition processor for HDTV resolution video using sparse FIND feature","authors":"Yuri Nishizumi, Go Matsukawa, K. Kajihara, T. Kodama, S. Izumi, H. Kawaguchi, C. Nakanishi, Toshio Goto, Takeo Kato, M. Yoshimoto","doi":"10.1109/SiPS.2017.8109993","DOIUrl":"https://doi.org/10.1109/SiPS.2017.8109993","url":null,"abstract":"This paper describes FPGA implementation of object recognition processor for HDTV resolution 30 fps video using the Sparse FIND feature. Two-stage feature extraction processing by HOG and Sparse FIND, a highly parallel classification in the support vector machine (SVM), and a block-parallel processing for RAM access cycle reduction are proposed to perform a real time object recognition with enormous computational complexity. From implementation of the proposed architecture in the FPGA, it was confirmed that detection using the Sparse FIND feature was performed for HDTV images at 47.63 fps, on average, at 90 MHz. The recognition accuracy degradation from the original Sparse FIND-base object detection algorithm implemented on software was 0.5%, which shows that the FPGA system provides sufficient accuracy for practical use.","PeriodicalId":251688,"journal":{"name":"2017 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116800175","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2017-10-01 | DOI: 10.1109/SiPS.2017.8110016
Meng-Ya Tsai, Ching-Yao Chou, A. Wu
In electrocardiography (ECG) monitoring systems, compressive sensing (CS) is a new technique for reducing the power spent on biosensing and data transmission. Instead of paying the high complexity of reconstructing signals back to the data domain before analyzing them, compressed analysis (CA) exploits the data structure preserved by CS to analyze signals directly in the compressed domain. However, compressively-sensed signals contaminated by interference cause learning performance degradation, while traditional interference removal methods are developed for signals in the data domain and involve reconstruction. In this paper, we propose a new CA framework that uses a pre-trained subspace-based dictionary to project interfered, compressed data onto a subspace with high learnability and low complexity. Through simulations, we show that our technique improves detection accuracy by 5.64% compared with conventional CA and reduces complexity by 99% compared with reconstruction-based analysis.
{"title":"Robust compressed analysis using subspace-based dictionary for ECG telemonitoring systems","authors":"Meng-Ya Tsai, Ching-Yao Chou, A. Wu","doi":"10.1109/SiPS.2017.8110016","DOIUrl":"https://doi.org/10.1109/SiPS.2017.8110016","url":null,"abstract":"To realize Electrocardiography (ECG) signals monitoring systems, compressive sensing (CS) is a new technique to reduce power of biosensors and data transmission. Instead of spending high complexity on reconstructing back to data domain to do signal analysis, compressed analysis (CA) exploits the data structure preserved by CS to directly analyze in the compressed domain. However, compressively-sensed signals contaminated by interference cause learning performance degradation. Meanwhile, traditional interference removal methods are developed for signals in data domain, which involve reconstruction. In this paper, we propose a new CA framework using pre-trained subspace-based dictionary to project interfered and compressed data onto the subspace with high learnability and low complexity. Through simulations, we show that our technique enables 5.64% improvements on accuracy of detection compared with conventional CA, and reduces 99% complexity compared with reconstructed analysis.","PeriodicalId":251688,"journal":{"name":"2017 IEEE International Workshop on Signal Processing Systems (SiPS)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2017-10-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128676923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}