Pub Date: 2018-11-01 | DOI: 10.23919/APSIPA.2018.8659694
A. Kan, Z. Peng, K. Moua, R. Litovsky
Bilateral cochlear implantation is becoming the standard of care for patients with sensorineural hearing loss, with demonstrated improvements over unilateral use in everyday tasks such as sound localization. However, even with bilateral implantation, performance in these tasks is still poorer than that of normal hearing listeners. The gap in performance has often been attributed to the poor encoding of fine structure interaural time differences (ITDs) by clinical processors. However, in theory, the signal processing employed in clinical processors should still encode envelope ITDs with some degree of fidelity. In this work, we quantitatively measured the ability of Cochlear CP910 processors to encode envelope ITDs while running the Advanced Combinational Encoder (ACE) strategy. Results suggest that while the processors support relatively good envelope encoding, the peak-picking approach of the ACE strategy degrades the computation of ITDs by encoding spectral information in different frequency regions in the two ears. Our results may explain the poorer sound localization performance observed in cochlear implant users who use the ACE strategy, but cannot account for the poorer sound localization performance observed in cochlear implant users in general.
Title: A systematic assessment of a cochlear implant processor's ability to encode interaural time differences
Published in: 2018 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)
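The envelope-ITD measurement idea can be illustrated with a toy cross-correlation sketch. This is a minimal numpy illustration, not the authors' measurement procedure: the rectify-and-smooth envelope extractor, the 40 Hz-modulated 1 kHz carrier, and the 0.5 ms imposed delay are all assumptions chosen for the example.

```python
import numpy as np

def envelope(x, win=32):
    """Crude amplitude envelope: full-wave rectification + moving-average smoothing."""
    return np.convolve(np.abs(x), np.ones(win) / win, mode="same")

def envelope_itd(left, right, fs):
    """Envelope ITD in seconds: the lag maximizing the cross-correlation of the
    two ear envelopes (positive when the right ear lags the left)."""
    el = envelope(left)
    er = envelope(right)
    el -= el.mean()
    er -= er.mean()
    xcorr = np.correlate(er, el, mode="full")
    lag = int(np.argmax(xcorr)) - (len(el) - 1)
    return lag / fs

# Synthetic check: a 40 Hz modulated 1 kHz carrier, right ear delayed by 0.5 ms.
fs = 16_000
t = np.arange(0, 0.2, 1 / fs)
sig = 0.5 * (1 + np.sin(2 * np.pi * 40 * t)) * np.sin(2 * np.pi * 1000 * t)
delay = 8                      # samples, i.e. 0.5 ms at 16 kHz
left = sig
right = np.roll(sig, delay)    # the signal is periodic, so roll acts as a pure delay
itd = envelope_itd(left, right, fs)
```

For real processor outputs one would cross-correlate per-channel electrodogram envelopes rather than raw waveforms.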
Pub Date: 2018-11-01 | DOI: 10.23919/APSIPA.2018.8659532
Po-Chiang Lin, Sheng-Lun Huang
NGFI, the Next Generation Fronthaul Interface, is a promising fronthaul interface for the C-RAN (Cloud Radio Access Network). NGFI is used to connect the RCC (Radio Cloud Center) and the RRS (Radio Remote System) in order to avoid the drawbacks of the traditional CPRI (Common Public Radio Interface). In this paper we investigate the NGFI-based C-RAN. We use the OpenAirInterface (OAI) open-source 4G/5G mobile communication software and GPP (general purpose processor) based servers and personal computers to build an OAI C-RAN testbed. We also run performance profiling on the OAI source code on this testbed to understand its behavior. The purpose of this paper is to establish comprehensive performance-profiling methods and results for the OAI C-RAN system, and to use these results to help design and optimize it. Based on the results, we can decide which parts of the system software to optimize to improve system speed and the efficiency of memory usage.
Title: Performance Profiling of Cloud Radio Access Networks using OpenAirInterface
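OAI itself is C code profiled with native tools, but the function-level profiling workflow the paper relies on (find the hot spots, then decide what to optimize) can be sketched with Python's standard `cProfile`/`pstats` modules; the `hot_loop` and `pipeline` functions below are hypothetical stand-ins for compute-heavy baseband routines, not OAI code.

```python
import cProfile
import io
import pstats

def hot_loop():
    """Hypothetical stand-in for a compute-heavy baseband routine."""
    s = 0.0
    for i in range(1, 200_000):
        s += 1.0 / i
    return s

def pipeline():
    """Hypothetical stand-in for one processing pass of the system software."""
    hot_loop()
    return sum(range(1_000))

profiler = cProfile.Profile()
profiler.enable()
pipeline()
profiler.disable()

# Sort by cumulative time so the routine worth optimizing rises to the top.
buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
```

The same triage (sort by cumulative cost, inspect the top entries) is what `perf` or `gprof` provides for a native build.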
Pub Date: 2018-11-01 | DOI: 10.23919/APSIPA.2018.8659629
Xiaobai Chen, Shanlin Xiao, Zhiyi Yu
The convolutional neural network (CNN) is a powerful artificial intelligence algorithm widely used in computer vision due to its state-of-the-art performance. Many accelerators have been proposed to handle the huge computation and communication costs of CNNs. In this paper we propose a reconfigurable process engine that supports different data flows, bit-widths, and parallelism strategies for CNNs. The process engine was implemented on a Xilinx ZC706 FPGA board, with high flexibility to support all popular CNNs and better energy efficiency than other state-of-the-art designs.
Title: A Reconfigurable Process Engine for Flexible Convolutional Neural Network Acceleration
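Configurable bit-width is one of the knobs such a process engine exposes. As a software-level illustration only (the paper's engine is an FPGA design; the symmetric uniform quantizer below is an assumed scheme, not taken from the paper), one can sketch how reduced-precision operands trade accuracy in a direct convolution:

```python
import numpy as np

def quantize(x, bits):
    """Symmetric uniform quantization of an array to a configurable bit-width."""
    scale = np.max(np.abs(x)) / (2 ** (bits - 1) - 1)
    return np.round(x / scale) * scale

def conv2d(x, w):
    """Direct 2-D valid cross-correlation with a single kernel (no padding)."""
    kh, kw = w.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i : i + kh, j : j + kw] * w)
    return out

rng = np.random.default_rng(3)
x = rng.standard_normal((8, 8))
w = rng.standard_normal((3, 3))
full = conv2d(x, w)                                  # full-precision reference
err8 = np.max(np.abs(full - conv2d(quantize(x, 8), quantize(w, 8))))
err4 = np.max(np.abs(full - conv2d(quantize(x, 4), quantize(w, 4))))
```

A reconfigurable engine lets this precision/accuracy trade-off be chosen per layer rather than fixed in hardware.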
Pub Date: 2018-11-01 | DOI: 10.23919/APSIPA.2018.8659709
Kazuhiro Hara, Miwa Katayama, M. Kawakita, T. Fujii, T. Mishina
Effective compression technology is required to reduce the huge amount of information in integral three-dimensional (3D) television. For compressing an integral 3D image, we propose a compression method that converts elemental images to multiview images and applies multiview video coding to a subset of the multiview images and their depth maps. The relationship between the size of this subset and the image-quality degradation of the reconstructed 3D image was studied through a subjective evaluation experiment, and we confirmed the amount of information required to display an acceptable reconstructed 3D image. As a result, reconstructed 3D images with acceptable image quality were obtained with about 2/9 of the amount of information needed to code all the multiview images converted from the elemental images.
Title: Integral 3D image coding by using multiview video compression technologies
Pub Date: 2018-11-01 | DOI: 10.23919/APSIPA.2018.8659682
Renoh Johnson Chalakkal, W. Abdulla
Vessel segmentation from fundus retinal images is highly significant in diagnosing many pathologies related to the eye and other systemic diseases. Although many methods in the literature address this task, most do not focus on segmenting the small peripheral vessels. In this paper, we propose a new approach based on the curvelet transform and line operators that segments the small peripheral vessels with very high accuracy, resulting in higher sensitivity than other state-of-the-art methods. In the proposed approach, the contrast between the retinal vessels and the background pixels is enhanced by applying a series of image processing steps involving color space transformation, adaptive histogram equalization, and anisotropic diffusion filtering. The retinal vessel edge contrast is then further enhanced using modified curvelet transform coefficients. Finally, the vessels are segmented by applying the line operator response, followed by suitable thresholding. Post-processing removes scattered unwanted background pixels. The performance of the method is compared against other state-of-the-art methods using DRIVE as the test database. Average sensitivity, specificity, accuracy, and positive predictive value of 0.7653, 0.9735, 0.9542, and 0.7438, respectively, are achieved.
Title: Improved Vessel Segmentation Using Curvelet Transform and Line Operators
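The line-operator stage can be sketched in isolation. This is a generic numpy implementation of the basic line-detector idea (maximum oriented-line mean minus local window mean); the 7-pixel length, 12 orientations, and bright-vessel assumption are illustrative choices, not the paper's exact parameters.

```python
import numpy as np

def line_operator_response(img, length=7):
    """For each pixel: max over orientations of the mean intensity along an
    oriented line of `length` pixels, minus the mean of the surrounding
    length x length window. Vessels are assumed bright (invert the green
    channel of a fundus image first)."""
    h, w = img.shape
    half = length // 2
    pad = np.pad(img, half, mode="reflect")

    # Mean over the square window centred at each pixel.
    win_mean = np.zeros((h, w))
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            win_mean += pad[half + dy : half + dy + h, half + dx : half + dx + w]
    win_mean /= length * length

    resp = np.full((h, w), -np.inf)
    for th in np.deg2rad(np.arange(0, 180, 15)):     # 12 orientations
        line_mean = np.zeros((h, w))
        for k in range(-half, half + 1):
            dy = int(round(k * np.sin(th)))
            dx = int(round(k * np.cos(th)))
            line_mean += pad[half + dy : half + dy + h, half + dx : half + dx + w]
        line_mean /= length
        resp = np.maximum(resp, line_mean - win_mean)
    return resp

# Toy check: a bright horizontal line responds much more strongly than background.
img = np.zeros((21, 21))
img[10, :] = 1.0
r = line_operator_response(img)
```

Thresholding `r` then yields the binary vessel map that post-processing cleans up.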
Pub Date: 2018-11-01 | DOI: 10.23919/APSIPA.2018.8659625
Rui Cheng, C. Bao, Yang Xiang
Speech enhancement is an important issue in the field of speech signal processing. With the development of deep learning, neural-network-based speech enhancement has provided more diverse solutions in this field. In this paper, we present a new approach to enhance noisy speech recorded by a single channel. We propose a phase correction method based on the joint optimization of clean speech and noise by a deep neural network (DNN). In this method, the ideal ratio mask (IRM) is employed to estimate the clean speech and noise, and phase correction is then applied to obtain the final enhanced speech. Experiments are conducted using the TIMIT corpus combined with four types of noise at three different signal-to-noise ratio (SNR) levels. The results show that the proposed method yields a significant improvement over the reference DNN-based enhancement method in both objective and subjective evaluation criteria.
Title: Speech Enhancement with Phase Correction based on Modified DNN Architecture
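The IRM itself has a simple closed form. A minimal sketch, assuming spectral magnitudes for speech and noise are available (in the paper they are estimated by the DNN) and the commonly used exponent beta = 0.5:

```python
import numpy as np

def ideal_ratio_mask(speech_mag, noise_mag, beta=0.5):
    """IRM per time-frequency bin: (S^2 / (S^2 + N^2))^beta."""
    return (speech_mag**2 / (speech_mag**2 + noise_mag**2 + 1e-12)) ** beta

# Two toy T-F bins: speech dominates the first, noise dominates the second.
s = np.array([1.0, 0.1])
n = np.array([0.1, 1.0])
mask = ideal_ratio_mask(s, n)
mixture = np.sqrt(s**2 + n**2)   # idealized noisy-mixture magnitude
enhanced = mask * mixture        # masking keeps the speech-dominated bin
```

In the full system the masked magnitude is then combined with a corrected phase before inverting the STFT.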
Pub Date: 2018-11-01 | DOI: 10.23919/APSIPA.2018.8659548
Byeongyong Ahn, Yoonsik Kim, G. Park, N. Cho
There are two main streams in current image denoising algorithms: non-local self-similarity (NSS) prior based methods and convolutional neural network (CNN) based methods. The NSS based methods are favorable on images with regular and repetitive patterns, while the CNN based methods perform better on irregular structures. In this paper, we propose a block-matching convolutional neural network (BMCNN) method that combines the NSS prior and a CNN. Initially, similar local patches in the input image are integrated into a 3D block. To prevent noise from corrupting the block matching, we first apply an existing denoising algorithm to the noisy image; the denoised image is employed as a pilot signal for the block matching, and the denoising function for the block is then learned by a CNN. Experimental results show that the proposed BMCNN algorithm achieves state-of-the-art performance. In particular, BMCNN can restore both repetitive and irregular structures.
Title: Block-Matching Convolutional Neural Network (BMCNN): Improving CNN-Based Denoising by Block-Matched Inputs
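The block-matching step can be sketched independently of the network. A minimal numpy version, assuming grayscale images and L2 patch distance; the patch size, search radius, and k are illustrative, and the pre-denoised pilot is faked here with the clean image:

```python
import numpy as np

def block_match(pilot, noisy, ref_yx, patch=8, search=20, k=8):
    """Collect the k patches from `noisy` most similar to the reference patch,
    measuring similarity on the denoised `pilot` image, and stack them into a
    (k, patch, patch) 3D block for the CNN."""
    y0, x0 = ref_yx
    ref = pilot[y0 : y0 + patch, x0 : x0 + patch]
    h, w = pilot.shape
    cands = []
    for y in range(max(0, y0 - search), min(h - patch, y0 + search) + 1):
        for x in range(max(0, x0 - search), min(w - patch, x0 + search) + 1):
            d = np.sum((pilot[y : y + patch, x : x + patch] - ref) ** 2)
            cands.append((d, y, x))
    cands.sort(key=lambda t: t[0])
    return np.stack([noisy[y : y + patch, x : x + patch] for _, y, x in cands[:k]])

rng = np.random.default_rng(0)
clean = np.tile(np.sin(np.linspace(0, 4 * np.pi, 64)), (64, 1))   # repetitive texture
noisy = clean + 0.1 * rng.standard_normal((64, 64))
pilot = clean                     # stands in for the output of the pre-denoiser
block = block_match(pilot, noisy, (20, 20))
```

The stacked block is what the CNN sees in place of a single noisy patch, which is how the NSS prior enters the network's input.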
Pub Date: 2018-11-01 | DOI: 10.23919/APSIPA.2018.8659676
Yuki Kojoma, Y. Washizawa
Electroencephalography (EEG) has been used widely in biomedical research and consumer products because of its reasonable size and cost. To reduce the electrical impedance between the electrodes and the scalp, conductive gel is used; however, this makes EEG setup time-consuming. Dry electrodes solve this problem because they do not require conductive gel, but their signal quality is lower than that of wet electrodes. In this research, we propose a method to improve the quality of dry EEG signals. To design a restoration filter, we record wet and dry EEG signals simultaneously; the filter is then trained on both signals to restore the wet EEG signal from dry EEG input. We use a fully connected deep neural network (DNN) and a convolutional neural network (CNN). We conducted an experiment using the oddball paradigm to demonstrate the proposed method and compare it with the classical Wiener filter.
Title: Restoration of dry electrode EEG using deep convolutional neural network
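As a linear stand-in for the learned restoration mapping (the paper uses DNN/CNN models; the FIR filter and the synthetic dry/wet pair below are assumptions for illustration), one can fit a least-squares filter that maps the dry signal back toward the simultaneously recorded wet signal:

```python
import numpy as np

def fit_fir_restoration(dry, wet, order=16):
    """Fit a length-`order` causal FIR filter mapping dry -> wet by least squares."""
    n = len(dry)
    # Design matrix of delayed copies of the dry signal (Toeplitz-like).
    X = np.zeros((n, order))
    for k in range(order):
        X[k:, k] = dry[: n - k]
    h, *_ = np.linalg.lstsq(X, wet, rcond=None)
    return h

rng = np.random.default_rng(1)
wet = rng.standard_normal(2000)                  # stand-in for the wet recording
true_h = np.array([0.5, 0.3, 0.2])               # assumed dry-electrode distortion
dry = np.convolve(wet, true_h)[:2000] + 0.05 * rng.standard_normal(2000)

h = fit_fir_restoration(dry, wet)
restored = np.convolve(dry, h)[:2000]
err_before = np.mean((dry - wet) ** 2)
err_after = np.mean((restored - wet) ** 2)
```

The DNN/CNN in the paper plays the same role as `h` here, but can capture nonlinear and nonstationary distortion that a fixed FIR (or Wiener) filter cannot.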
Pub Date: 2018-11-01 | DOI: 10.23919/APSIPA.2018.8659510
Gajan Suthokumar, Kaavya Sriskandaraja, V. Sethu, C. Wijenayake, E. Ambikairajah, Haizhou Li
Replay attacks are the simplest form of spoofing attack on automatic speaker verification (ASV) systems, and consequently the detection of these attacks is a critical research problem. Currently, most research on replay detection focuses on developing a stand-alone countermeasure that runs independently of a speaker verification system by training a single common spoofed model as well as a single common genuine model. This paper investigates the potential advantages of sharing speaker data between the speaker verification system and the replay detection system. Specifically, it explores the benefits of using the claimed speaker's model in place of the common genuine model. The proposed approach is validated on a modified evaluation set of the ASVspoof 2017 version 2.0 corpus, and the results show that the use of adapted speaker models is far superior to the use of a single common genuine model.
Title: Use of Claimed Speaker Models for Replay Detection
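The intuition (a model of the claimed speaker separates that speaker's genuine trials from replays better than one population-wide genuine model does) can be illustrated with toy Gaussian mixtures. Everything below is hypothetical: the 2-D features, the distribution offsets, and the mixture sizes are invented for the sketch and are unrelated to the ASVspoof data or the paper's actual models.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(4)
# Invented 2-D "feature" distributions:
population = rng.standard_normal((2000, 2))                 # pooled genuine speech
speaker = 0.5 * rng.standard_normal((300, 2)) + [2.0, 2.0]  # claimed speaker's enrolment
genuine = 0.5 * rng.standard_normal((100, 2)) + [2.0, 2.0]  # genuine trials of that speaker
replay = 0.7 * rng.standard_normal((100, 2)) + [1.0, 1.0]   # channel-shifted re-recordings

common_model = GaussianMixture(4, random_state=0).fit(population)
claimed_model = GaussianMixture(4, random_state=0).fit(speaker)

def separation(model):
    """Mean log-likelihood gap between genuine and replayed trials under a model."""
    return model.score_samples(genuine).mean() - model.score_samples(replay).mean()

sep_claimed = separation(claimed_model)
sep_common = separation(common_model)
```

A larger gap means the model gives the countermeasure more room to set a genuine/replay decision threshold.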
Pub Date: 2018-11-01 | DOI: 10.23919/APSIPA.2018.8659486
Shahab Pasha, C. Ritz, D. Stirling, P. Zulli, D. Pinson, S. Chew
This paper proposes the use of deep learning classification for acoustic monitoring of an industrial process. Specifically, the application is to process sound recordings to detect when additional air leaks through gaps between the grate bars lining the bottom of the sinter strand pallets, caused by thermal cycling, aging, and deterioration. Detecting holes visually is not possible, as a hole is usually small and covered with a granular bed of sinter/blend material. Acoustic signals from normal operation and from periods of air leakage are fed into basic supervised classification methods (SVM and J48) and deep learning networks to learn and distinguish the differences. Results suggest that the applied deep learning approach can effectively detect time segments containing acoustic emissions from holes with at least 79% accuracy.
Title: A Deep Learning Approach to the Acoustic Condition Monitoring of a Sintering Plant
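The supervised classification setup can be sketched end to end with synthetic audio. The stand-in signals and the two crude frame features (log energy and zero-crossing rate) are invented for this sketch; the paper's actual recordings and features are not reproduced, the J48 tree is omitted, and scikit-learn's SVC plays the role of the SVM.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(2)

def frame_features(signal, frame=256):
    """Two crude per-frame features: log energy and zero-crossing rate."""
    frames = signal[: len(signal) // frame * frame].reshape(-1, frame)
    log_energy = np.log(np.sum(frames**2, axis=1) + 1e-12)
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    return np.column_stack([log_energy, zcr])

# Invented stand-ins: "normal" operation as broadband noise, "leak" periods
# with an added strong high-frequency tone (the hiss of escaping air).
normal = rng.standard_normal(256 * 200)
leak = rng.standard_normal(256 * 200) + 1.5 * np.sin(2 * np.pi * 0.45 * np.arange(256 * 200))

X = np.vstack([frame_features(normal), frame_features(leak)])
y = np.array([0] * 200 + [1] * 200)
Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.25, random_state=0, stratify=y)
clf = SVC(kernel="rbf").fit(Xtr, ytr)
acc = clf.score(Xte, yte)
```

The deep learning networks in the paper replace both the hand-crafted features and the SVM with a single learned model over the raw acoustic representation.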