Pub Date : 2022-07-11 DOI: 10.1109/SPCOM55316.2022.9840854
Vidya Bhasker Shukla, R. Mitra, V. Bhatia
Millimeter-wave multiple-input multiple-output (mmWave MIMO) has emerged as a viable technique for 5G and beyond-5G (B5G) wireless networks, promising higher spectral efficiency and increased data rates. However, achieving these gains requires precise channel estimation, which is difficult for mmWave MIMO because of scattering and blockages. The same scattering and blockages make mmWave MIMO channels intrinsically sparse, which calls for sparsity-aware channel estimation algorithms. This work therefore proposes a variable step-size zero-attracting least mean squares (VSSZALMS) channel estimator. In VSSZALMS the step size increases (or decreases) as the mean-square error (MSE) increases (or decreases); as a result, the adaptive estimator achieves better tracking and a faster convergence rate. The convergence and steady-state behavior of the estimator are analyzed. Simulations for typical mmWave MIMO channels demonstrate the benefits of the proposed sparse channel-estimation approach and its convergence.
{"title":"Millimeter Wave Hybrid MIMO System Channel Estimation Using Variable Step Size Zero Attracting LMS","authors":"Vidya Bhasker Shukla, R. Mitra, V. Bhatia","doi":"10.1109/SPCOM55316.2022.9840854","DOIUrl":"https://doi.org/10.1109/SPCOM55316.2022.9840854","PeriodicalId":246982,"journal":{"name":"2022 IEEE International Conference on Signal Processing and Communications (SPCOM)","volume":"69 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-11","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"132594962","RegionNum":0,"Total":0}
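The abstract above describes a zero-attracting LMS update whose step size follows the error level. The sketch below illustrates that idea for a real-valued sparse FIR channel. It is not the authors' algorithm: the step-size rule `mu = alpha * mu + gamma * e**2` (a Kwong-Johnston-style recursion), the scaling of the zero attractor by `mu`, and every parameter value are assumptions chosen for illustration, and a practical mmWave MIMO estimator would work with complex-valued signals.

```python
import random

def sign(v):
    """Return -1, 0 or +1 according to the sign of v."""
    return (v > 0) - (v < 0)

def vss_za_lms(x, d, num_taps, mu0=0.02, mu_min=1e-3, mu_max=0.05,
               alpha=0.97, gamma=0.01, lam=0.005):
    """Zero-attracting LMS with an error-driven step size (illustrative sketch)."""
    w = [0.0] * num_taps
    mu = mu0
    for n in range(num_taps - 1, len(x)):
        u = [x[n - k] for k in range(num_taps)]          # regressor, newest sample first
        e = d[n] - sum(wk * uk for wk, uk in zip(w, u))  # a-priori estimation error
        # LMS gradient step plus an l1 "zero attractor" that pulls small taps to zero
        w = [wk + mu * (e * uk - lam * sign(wk)) for wk, uk in zip(w, u)]
        # Assumed rule: the step size grows with the squared error and decays otherwise
        mu = min(mu_max, max(mu_min, alpha * mu + gamma * e * e))
    return w

# Hypothetical sparse channel: 16 taps, only two of them non-zero
random.seed(0)
h = [0.0] * 16
h[2], h[9] = 1.0, -0.5
x = [random.gauss(0.0, 1.0) for _ in range(3000)]
d = [sum(h[k] * x[n - k] for k in range(16) if n >= k) + random.gauss(0.0, 0.01)
     for n in range(3000)]
w_hat = vss_za_lms(x, d, num_taps=16)
```

In this toy setup the step size is large while the error is large (fast initial convergence) and shrinks towards `mu_min` once the error is small (low steady-state jitter), which is the trade-off the abstract refers to.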
Pub Date : 2022-07-11 DOI: 10.1109/SPCOM55316.2022.9840805
Deepika Gupta, H. S. Shekhawat
Artificial bandwidth extension is applied to speech signals to improve their quality in narrowband telephonic communication. To accomplish this, the missing high-frequency components of the speech signal are recovered by extrapolation. In this context, we propose an alternative structure that applies gain adjustment and discrete Fourier transform (DFT) addition to combine the narrowband signal with the corresponding estimated high-band signal. The high-band signal is estimated using a synthesis filter, which is obtained from $H^{\infty}$ optimization and a speech production model. The non-stationary (time-varying) characteristics of speech signals lead to a wide variety of synthesis filters, so we use a feed-forward deep neural network (DNN) to estimate the synthesis filter information and gain factor for a given narrowband feature of the signal. Objective analysis is carried out on the RSR15 and TIMIT datasets, and separately for voiced and unvoiced speech. Subjective evaluation is conducted on the RSR15 dataset.
{"title":"Artificial Bandwidth Extension Using H∞ Optimization, Deep Neural Network, and Speech Production Model","authors":"Deepika Gupta, H. S. Shekhawat","doi":"10.1109/SPCOM55316.2022.9840805","DOIUrl":"https://doi.org/10.1109/SPCOM55316.2022.9840805","PeriodicalId":246982,"journal":{"name":"2022 IEEE International Conference on Signal Processing and Communications (SPCOM)","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-11","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"124863883","RegionNum":0,"Total":0}
Pub Date : 2022-07-11 DOI: 10.1109/SPCOM55316.2022.9840515
Gauri P. Prajapati, D. Singh, H. Patil
Privacy preservation methods for voice data are evolving rapidly. A recent state-of-the-art voice privacy algorithm uses an x-vector and neural source-filter (NSF)-based anonymization approach that converts the original input voice into a pseudo-speaker’s voice. The method uses an affinity propagation clustering (APC) algorithm to choose the pseudo-speaker’s x-vector. Choosing a suitable distance measure for this clustering technique is important for optimal anonymization. To that end, this paper investigates the effect of six distance measures, namely Euclidean, cosine, probabilistic linear discriminant analysis (PLDA), correlation, Manhattan, and Mahalanobis, on voice privacy preservation using an x-vector-based anonymization system. This approach gave a 4.75% relative improvement in equal error rate (EER) for original enrollments and anonymized trials, and an 11.49% relative improvement in EER for anonymized enrollments and trials. Experimental results show that the Mahalanobis and Pearson correlation coefficient-based distances are better choices for anonymization tasks: they provide better speaker de-identification and good speech intelligibility without increasing system complexity.
{"title":"Significance of Distance Measures for Speaker Anonymization","authors":"Gauri P. Prajapati, D. Singh, H. Patil","doi":"10.1109/SPCOM55316.2022.9840515","DOIUrl":"https://doi.org/10.1109/SPCOM55316.2022.9840515","PeriodicalId":246982,"journal":{"name":"2022 IEEE International Conference on Signal Processing and Communications (SPCOM)","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-11","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"125588949","RegionNum":0,"Total":0}
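To make the comparison of distance measures concrete, here is a minimal sketch of four of the six measures named in the abstract, applied to plain Python lists standing in for x-vectors. Mahalanobis and PLDA are left out because they need a covariance matrix or a trained model. The function names are illustrative and are not taken from the paper.

```python
import math

def euclidean(a, b):
    """Straight-line (L2) distance."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """City-block (L1) distance."""
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_distance(a, b):
    """One minus the cosine of the angle between the two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def correlation_distance(a, b):
    """One minus the Pearson correlation: cosine distance after mean removal."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_distance([x - mean_a for x in a], [y - mean_b for y in b])
```

The cosine and correlation distances ignore vector length, so two embeddings that differ only by a scale factor are at distance zero under both, while the Euclidean and Manhattan distances between them are non-zero.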
Pub Date : 2022-07-11 DOI: 10.1109/SPCOM55316.2022.9840819
Nageswara Rao Dusari, Shipra, M. Rawat
This paper presents the implementation of digital beamforming (DBF) and linearization of power amplifiers (PAs) using the Xilinx RF SoC ZCU216 as a software-defined radio (SDR). The PAs in all transmit paths are linearized using digital predistortion (DPD) based on the memory polynomial (MP) model to increase PA efficiency and reduce the overall cost of the beamforming system. The paper also presents a methodology for phase calibration of each SDR transmitter path before DPD is applied, so that an accurate phase is delivered to each antenna element. A 1x4 uniform microstrip antenna array is designed to operate at 3.5 GHz to perform DBF. A 5G NR signal with 20 MHz bandwidth is used for testing. The experimental results show that the beam is formed in the desired direction according to the applied phase shift between the antenna elements. An adjacent channel power ratio (ACPR) of -48 dB is obtained after DPD in each channel.
{"title":"Digital Beamforming with Digital Predistortion using Xilinx RF SoC ZCU216","authors":"Nageswara Rao Dusari, Shipra, M. Rawat","doi":"10.1109/SPCOM55316.2022.9840819","DOIUrl":"https://doi.org/10.1109/SPCOM55316.2022.9840819","PeriodicalId":246982,"journal":{"name":"2022 IEEE International Conference on Signal Processing and Communications (SPCOM)","volume":"77 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-11","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"130296084","RegionNum":0,"Total":0}
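The memory polynomial model mentioned in the abstract has a compact standard form, y(n) = sum over k and m of a[k][m] * x(n-m) * |x(n-m)|^k, where k indexes the nonlinearity order and m the memory depth. The sketch below evaluates that expression on complex baseband samples. The coefficient layout `coeffs[k][m]` and the example values are assumptions for illustration. In a DPD system the coefficients would be fitted from measured PA input and output data, which is not shown here.

```python
def memory_polynomial(x, coeffs):
    """Evaluate a memory polynomial on complex baseband samples.

    coeffs[k][m] multiplies x[n - m] * abs(x[n - m]) ** k, so row k = 0 is the
    linear part and column m is the memory (delay) index.
    """
    y = []
    for n in range(len(x)):
        acc = 0j
        for k, row in enumerate(coeffs):
            for m, a in enumerate(row):
                if n - m >= 0:                 # samples before the start are treated as zero
                    s = x[n - m]
                    acc += a * s * abs(s) ** k
        y.append(acc)
    return y

# Hypothetical coefficients: linear gain 1 plus a mild envelope-dependent term
example_out = memory_polynomial([1 + 0j, 2j], [[1.0, 0.0], [0.1, 0.0]])
```

A predistorter built on this model applies such a polynomial ahead of the PA so that the cascade of the two is closer to a linear gain.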
Pub Date : 2022-07-11 DOI: 10.1109/SPCOM55316.2022.9840771
Anand Mehrotra, R. Singh, Suraj Srivastava, A. Jagannatham
This paper develops delay-Doppler domain channel state information (CSI) estimation techniques for cyclic prefix (CP)-aided orthogonal time frequency space (OTFS) systems that employ arbitrary pulse shapes at the transmitter and receiver. In this context, the vectorized model and an equivalent 2D circular convolution relationship between the input and output symbols are derived for a CP-aided OTFS system. Initially, a pilot impulse-based CSI estimation technique is derived that exploits only the 2D circular convolution relationship between the input and output symbols. Subsequently, an enhanced data-embedded pilot frame is proposed, in which the data symbols are placed in the pilot frame but separated from the pilot impulse by delay-Doppler domain guard intervals, which eliminates the interference between the resulting data and pilot outputs. The proposed data-embedded scheme uses the pilot impulse for channel estimation. A vectorized input-output relationship is then determined for data detection, followed by a linear MMSE (LMMSE) receiver that employs the estimated CSI. Finally, simulation results demonstrate the performance of the proposed CSI estimation techniques in various settings and benchmark them against an ideal system with perfect CSI.
{"title":"Channel Estimation Techniques for CP-Aided OTFS Systems Relying on Practical Pulse Shapes","authors":"Anand Mehrotra, R. Singh, Suraj Srivastava, A. Jagannatham","doi":"10.1109/SPCOM55316.2022.9840771","DOIUrl":"https://doi.org/10.1109/SPCOM55316.2022.9840771","PeriodicalId":246982,"journal":{"name":"2022 IEEE International Conference on Signal Processing and Communications (SPCOM)","volume":"29 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-11","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"129426802","RegionNum":0,"Total":0}
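The pilot-impulse idea rests on a simple property of 2D circular convolution: if the input grid holds a single impulse, the output grid is a circularly shifted, scaled copy of the channel. The sketch below demonstrates this in the noise-free case. The grid size, tap positions, pilot location, and function names are hypothetical, and the effective channel that arises with practical pulse shapes in the paper is not modeled.

```python
def circ_conv_2d(h, x):
    """2D circular convolution of two N-by-M grids given as lists of lists."""
    N, M = len(x), len(x[0])
    y = [[0j] * M for _ in range(N)]
    for k in range(N):
        for l in range(M):
            y[k][l] = sum(h[a][b] * x[(k - a) % N][(l - b) % M]
                          for a in range(N) for b in range(M))
    return y

def estimate_from_pilot(y, kp, lp, amplitude):
    """Read the channel back from the response to an impulse placed at (kp, lp)."""
    N, M = len(y), len(y[0])
    return [[y[(a + kp) % N][(b + lp) % M] / amplitude for b in range(M)]
            for a in range(N)]

N, M = 4, 6
h = [[0j] * M for _ in range(N)]
h[0][0], h[1][2] = 1.0 + 0j, 0.3 - 0.4j   # two hypothetical channel taps
x = [[0j] * M for _ in range(N)]
x[2][3] = 2.0 + 0j                        # single pilot impulse, all other cells empty
y = circ_conv_2d(h, x)
h_hat = estimate_from_pilot(y, kp=2, lp=3, amplitude=2.0)
```

When data symbols share the frame with the pilot, their responses overlap the pilot response unless enough empty cells surround the impulse, which is the role of the guard intervals described in the abstract.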
Pub Date : 2022-07-11 DOI: 10.1109/SPCOM55316.2022.9840827
G. Thiagarajan, Shilpa Hosur, Sanjeev Gurugopinath
We propose a novel multi-stage constant false alarm rate (CFAR) detector for millimeter wave radars, in particular frequency modulated continuous wave (FMCW) radars. First, we employ an order statistics-based detector (OSD) along the range and Doppler dimensions to obtain potential target locations as a coarse detection step. Next, we propose a weighted centroid detector (WCD) for fine detection on the range-Doppler matrix obtained from the OSD; the WCD does not require knowledge of the noise variance. We obtain analytical expressions for the probability of false alarm and the detection threshold for both OSD and WCD, which are validated using Monte Carlo simulations. Using synthetic and real-world experimental data, we highlight the efficacy of the proposed detectors in terms of receiver operating characteristics and detection probability.
{"title":"A Multi-Stage Constant False-Alarm Rate Detector for Millimeter Wave Radars","authors":"G. Thiagarajan, Shilpa Hosur, Sanjeev Gurugopinath","doi":"10.1109/SPCOM55316.2022.9840827","DOIUrl":"https://doi.org/10.1109/SPCOM55316.2022.9840827","PeriodicalId":246982,"journal":{"name":"2022 IEEE International Conference on Signal Processing and Communications (SPCOM)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-11","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"127566141","RegionNum":0,"Total":0}
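As background for the coarse stage, the sketch below shows a textbook one-dimensional order-statistic CFAR test: each cell is compared with a scaled order statistic of its neighbouring reference cells. It is a generic illustration, not the authors' multi-stage detector, and the window sizes, rank, and scale are arbitrary. The false-alarm expression in `os_cfar_pfa` is the classical result for independent exponentially distributed (square-law detected) noise samples, and applying it here is an assumption about the noise model.

```python
def os_cfar(power, num_ref=8, num_guard=2, rank=12, scale=8.0):
    """Return indices whose power exceeds scale times the rank-th smallest reference cell.

    num_ref reference cells are taken on each side of the cell under test,
    skipping num_guard guard cells; cells without a full window are not tested.
    """
    detections = []
    for i in range(len(power)):
        left = power[max(0, i - num_guard - num_ref): max(0, i - num_guard)]
        right = power[i + num_guard + 1: i + num_guard + 1 + num_ref]
        ref = sorted(left + right)
        if len(ref) < 2 * num_ref:
            continue
        if power[i] > scale * ref[rank - 1]:
            detections.append(i)
    return detections

def os_cfar_pfa(total_ref, rank, scale):
    """False-alarm probability of OS-CFAR for i.i.d. exponential noise power."""
    p = 1.0
    for i in range(rank):
        p *= (total_ref - i) / (total_ref - i + scale)
    return p

profile = [1.0] * 40
profile[20] = 50.0                      # one strong hypothetical target
hits = os_cfar(profile)
```

Because the threshold uses a rank rather than the mean of the window, a second strong target inside the reference cells raises the threshold far less than it would in a cell-averaging CFAR.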
Pub Date : 2022-07-11 DOI: 10.1109/SPCOM55316.2022.9840790
Srikanth Kamparaju, Shaik Mastan, Shashank Vatedka
We investigate the problem of variable-length compression with random access for stationary and ergodic sources, wherein short substrings of the raw file can be extracted from the compressed file without decompressing the entire file. It is possible to design compressors for sequences of length $n$ that achieve compression rates close to the entropy rate of the source and still allow individual source symbols to be extracted in time $\Theta(1)$ under the word-RAM model. In this article, we analyze a simple, well-known approach to compression with random access. We show theoretically that it is suboptimal, and design two simple compressors that simultaneously achieve the entropy rate and constant-time random access. We then propose dictionary compression as a means to further improve performance, and validate this experimentally on various datasets.
{"title":"Low-Complexity Compression with Random Access","authors":"Srikanth Kamparaju, Shaik Mastan, Shashank Vatedka","doi":"10.1109/SPCOM55316.2022.9840790","DOIUrl":"https://doi.org/10.1109/SPCOM55316.2022.9840790","PeriodicalId":246982,"journal":{"name":"2022 IEEE International Conference on Signal Processing and Communications (SPCOM)","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-11","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"124538655","RegionNum":0,"Total":0}
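One common way to get random access into compressed data is to split the file into fixed-size blocks, compress each block independently, and keep a table of block offsets. The sketch below shows that baseline with `zlib`. Whether this is the "simple well-known approach" the authors analyze is an assumption, and the sketch makes no claim to reach the entropy rate or strict constant-time access. The block size and function names are illustrative.

```python
import zlib

def compress_blocks(data, block_size=64):
    """Compress each fixed-size block separately and record where each one starts."""
    blobs = [zlib.compress(data[i:i + block_size])
             for i in range(0, len(data), block_size)]
    offsets = [0]
    for blob in blobs:
        offsets.append(offsets[-1] + len(blob))
    return b"".join(blobs), offsets

def read_symbol(payload, offsets, block_size, index):
    """Recover one source byte by decompressing only the block that contains it."""
    j = index // block_size
    block = zlib.decompress(payload[offsets[j]:offsets[j + 1]])
    return block[index % block_size]

data = bytes(range(256)) * 4
payload, offsets = compress_blocks(data, block_size=64)
```

The block size sets the trade-off: small blocks make each lookup cheap but give the compressor little context and add offset-table overhead, while large blocks compress better but cost more per access.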
Pub Date : 2022-07-11 DOI: 10.1109/SPCOM55316.2022.9840780
K. G. Gopan, Pavan Sudeesh Peruru, N. Sinha
Computed tomography (CT)-based analysis can assist doctors in the prompt diagnosis of Covid-19 infection. Automated segmentation of lesions in chest CT scans helps in determining the severity of the infection. This work addresses the task of automated segmentation of Covid-19 lesions. We propose a U-Net framework that incorporates spatial-channel attention modules (to capture contextual relationships), an Atrous Spatial Pyramid Pooling module (for a wider receptive field), and deep supervision (for lesion focus and less error propagation). Focal Tversky loss is used to evaluate the outputs at coarser scales, while Tversky loss evaluates the final segmentation output. This combination of losses is used to enhance segmentation of small lesions. The framework is trained on CT scans of 20 subjects from the COVID19 CT Lung and Infection Segmentation Dataset and tested on 50 subjects from the Mosmed dataset in whom the infection has affected less than 25% of the lung parenchyma. The experimental results show that the proposed method is effective in segmenting the hard regions of interest (ROIs) in the Mosmed data, resulting in a mean Dice score of 0.57 (9% more than the state of the art).
{"title":"Modified U-Net Based Covid-19 Lesion Segmentation Using CT Scans","authors":"K. G. Gopan, Pavan Sudeesh Peruru, N. Sinha","doi":"10.1109/SPCOM55316.2022.9840780","DOIUrl":"https://doi.org/10.1109/SPCOM55316.2022.9840780","PeriodicalId":246982,"journal":{"name":"2022 IEEE International Conference on Signal Processing and Communications (SPCOM)","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-11","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"130904858","RegionNum":0,"Total":0}
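The two losses named in the abstract can be written in a few lines. The Tversky index generalizes the Dice score by weighting false negatives and false positives separately, and the focal variant raises one minus the index to a power. The sketch below works on flattened lists of probabilities and binary labels. The weights, the exponent, and the smoothing term are assumed values, and published formulations differ on how the exponent is parameterized, so this should not be read as the authors' exact setting.

```python
def tversky_index(pred, target, w_fn=0.7, w_fp=0.3, smooth=1e-6):
    """Tversky index between predicted probabilities and binary labels (flattened)."""
    tp = sum(p * t for p, t in zip(pred, target))
    fn = sum((1 - p) * t for p, t in zip(pred, target))
    fp = sum(p * (1 - t) for p, t in zip(pred, target))
    return (tp + smooth) / (tp + w_fn * fn + w_fp * fp + smooth)

def tversky_loss(pred, target, **kwargs):
    """One minus the Tversky index."""
    return 1.0 - tversky_index(pred, target, **kwargs)

def focal_tversky_loss(pred, target, exponent=0.75, **kwargs):
    """Tversky loss raised to a power; the exponent convention here is an assumption."""
    return tversky_loss(pred, target, **kwargs) ** exponent
```

Setting `w_fn = w_fp = 0.5` recovers the Dice score, while weighting false negatives more heavily penalizes missed lesion pixels, which matters when lesions occupy only a small part of the image.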
Pub Date : 2022-07-11 DOI: 10.1109/SPCOM55316.2022.9840764
Himadri Nirjhar Mandal, Soumya Sidhishwari
An apodized fiber Bragg grating (FBG) is designed for quasi-distributed sensing of temperature and strain, owing to its various advantages, particularly in hazardous environments. The main purpose of the apodization is to attain maximum reflectivity, narrow bandwidth, and low side-lobe levels, which are crucial for quasi-distributed sensing applications. The relationship between the FBG properties and the grating length is explored to enhance and optimize the FBG. The K-nearest neighbors (KNN) algorithm is used for predictive analysis of the FBG properties with different values of K, to assess the reliability of the apodized FBG for sensing applications. The optimal value of K is identified using error metrics such as mean squared error and mean absolute error. Strong linearity is obtained for both the temperature and strain sensitivity of the designed apodized FBG. The optimized apodized FBG is used in a wavelength division multiplexing (WDM)-based quasi-distributed sensing system of four FBGs, which shows high reliability. High temperature and strain sensitivity ranges are achieved in quasi-distributed sensing. The obtained ranges can be applied in FBG-based sensing applications for monitoring civil structures in hazardous environments.
{"title":"Predictive Analysis on Apodized FBG for Quasi-Distributed Temperature-Strain Sensing","authors":"Himadri Nirjhar Mandal, Soumya Sidhishwari","doi":"10.1109/SPCOM55316.2022.9840764","DOIUrl":"https://doi.org/10.1109/SPCOM55316.2022.9840764","PeriodicalId":246982,"journal":{"name":"2022 IEEE International Conference on Signal Processing and Communications (SPCOM)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-11","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"128373001","RegionNum":0,"Total":0}
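Selecting K by comparing mean squared error and mean absolute error can be sketched with a small KNN regressor and leave-one-out evaluation. The data below are synthetic: reflectivity is generated from the uniform-grating expression tanh^2(kappa * L) with a made-up coupling value, purely to have a smooth curve to predict. None of the numbers come from the paper, and the authors' features, data, and validation scheme may differ.

```python
import math

def knn_predict(train_x, train_y, query, k):
    """Average the targets of the k training points closest to the query."""
    nearest = sorted(range(len(train_x)), key=lambda i: abs(train_x[i] - query))[:k]
    return sum(train_y[i] for i in nearest) / k

def loo_errors(xs, ys, k):
    """Leave-one-out mean squared error and mean absolute error for a given k."""
    sq = ab = 0.0
    for i in range(len(xs)):
        err = knn_predict(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:], xs[i], k) - ys[i]
        sq += err * err
        ab += abs(err)
    return sq / len(xs), ab / len(xs)

# Synthetic stand-in data: grating length in mm against peak reflectivity
lengths_mm = [float(v) for v in range(1, 21)]
reflectivity = [math.tanh(0.2 * L) ** 2 for L in lengths_mm]
scores = {k: loo_errors(lengths_mm, reflectivity, k) for k in (1, 2, 3, 5, 7)}
best_k = min(scores, key=lambda k: scores[k][0])   # K with the lowest MSE
```

Small K follows the data closely but is sensitive to noise, while large K smooths more and can blur a curved response, so the error metrics are compared across several candidate values rather than fixing K in advance.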
Pub Date : 2022-07-11 DOI: 10.1109/SPCOM55316.2022.9840792
Ayush Agarwal, Amitabh Swain, S. Prasanna
With the widespread use of speech technologies, protecting speaker identity (the voiceprint) has become very important. Many methods proposed in the literature protect the speaker’s identity either by modifying the voice or by replacing it with another speaker’s identity. With those approaches, neither authentication systems nor humans can recognize the speaker’s identity. Changing the speaker identity of the original speech is therefore unsuitable for applications in which we want to conceal the speaker’s identity from machine authentication while keeping the speaker’s voice as it is. Noise addition methods have been proposed in the literature to address this issue. However, adding noise to the signal makes the speech more irritating to listen to. This paper proposes a sinusoidal model-based approach that solves this issue. The proposed method does not interfere with the originality of the speech but still protects the speaker’s identity from an automatic speaker verification (ASV) system by degrading its performance. The anonymized speech produced by the proposed approach is tested on the ASV system for the TIMIT and IITG-MV datasets, and the equal error rate (EER) is reported. Intelligibility tests such as short-time objective intelligibility (STOI) and mean opinion score (MOS) are also carried out. Considering the EER and intelligibility tests together, it is shown that the proposed approach can solve the issue discussed.
{"title":"Speaker Anonymization for Machines using Sinusoidal Model","authors":"Ayush Agarwal, Amitabh Swain, S. Prasanna","doi":"10.1109/SPCOM55316.2022.9840792","DOIUrl":"https://doi.org/10.1109/SPCOM55316.2022.9840792","PeriodicalId":246982,"journal":{"name":"2022 IEEE International Conference on Signal Processing and Communications (SPCOM)","volume":"194 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-07-11","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"131167180","RegionNum":0,"Total":0}
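The equal error rate used to judge anonymization is the operating point at which the false acceptance rate equals the false rejection rate of the verification system. A minimal sketch of how it can be read off two lists of verification scores follows. The threshold sweep over observed scores and the averaging of the two rates at the closest point are simplifications, and the example scores are invented for illustration.

```python
def equal_error_rate(genuine, impostor):
    """Approximate EER: sweep thresholds and take the point where FAR and FRR are closest."""
    best_gap, best_eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        far = sum(s >= t for s in impostor) / len(impostor)   # impostors accepted
        frr = sum(s < t for s in genuine) / len(genuine)      # genuine trials rejected
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer

# Invented scores: higher means "more likely the same speaker"
eer = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.1, 0.2, 0.3, 0.6])
```

For an anonymization method evaluated against an ASV system, a higher EER means the system confuses speakers more often, so the speaker's identity is better concealed.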