Pub Date : 2022-07-11 · DOI: 10.1109/spcom55316.2022.9840800
Title: SPCOM 2022 Cover Page
2022 IEEE International Conference on Signal Processing and Communications (SPCOM)
Pub Date : 2022-07-11 · DOI: 10.1109/SPCOM55316.2022.9840811
Title: Low-level Bias Discovery and Mitigation for Image Classification
Authors: Vartika Sengar, S. VivekB., Gaurab Bhattacharya, J. Gubbi, Arpan Pal, P. Balamuralidhar
Identification of bias in a classifier, and its mitigation, is a fundamental sanity check for trustworthy AI systems. Many methods in the literature mitigate bias by using it as a priori information. In this work, we propose a system that can detect low-level bias (e.g., color, texture) and mitigate it. A novel auto-encoder architecture that explains the predictions of a deep neural network is built to help identify the bias. The auto-encoder is trained to produce a generalized representation of the input image by decomposing it into a set of latent embeddings. These embeddings are learned by specializing groups of higher-dimensional feature maps to capture disentangled color and shape concepts: the shape embeddings are trained to reconstruct the discrete wavelet transform components of the image, while the color embeddings are trained to capture the color information. Feature specialization is achieved by reconstructing the RGB image from the shape embeddings modulated by the color embeddings. We show that these representations can be used to detect low-level bias in a classification task. After detecting bias, we also propose a method to de-bias the classifier by training it with counterfactual images generated by manipulating the representations learned by the auto-encoder. Our proposed method of bias discovery and mitigation achieves state-of-the-art results on ColorMNIST and the newly proposed BiasedShape dataset.
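The paper generates counterfactuals by manipulating learned auto-encoder embeddings. As a much simpler hand-written illustration of the same idea (not the authors' method), one can break a color-label shortcut on a ColorMNIST-style dataset by keeping the shape (the grayscale mask) and resampling the color independently of the label; the function name and color range here are assumptions:

```python
import numpy as np

def make_counterfactuals(images, seed=None):
    """Recolor grayscale digit images with random RGB colors.

    A crude stand-in for learned color/shape disentanglement: shape
    (the grayscale image in [0, 1]) is kept, while the color is drawn
    independently of the label, so a classifier trained on the result
    cannot exploit a color-label correlation.
    images: array of shape (n, H, W); returns (n, H, W, 3).
    """
    rng = np.random.default_rng(seed)
    n = images.shape[0]
    colors = rng.uniform(0.2, 1.0, size=(n, 3))  # one RGB color per image
    # broadcast (n, H, W, 1) * (n, 1, 1, 3) -> (n, H, W, 3)
    return images[..., None] * colors[:, None, None, :]

batch = np.random.default_rng(0).random((4, 28, 28))  # toy grayscale batch
cf = make_counterfactuals(batch, seed=1)
```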
Pub Date : 2022-07-11 · DOI: 10.1109/SPCOM55316.2022.9840779
Title: Temporal Surgical Gesture Segmentation and Classification in Multi-gesture Robotic Surgery using Fine-tuned Features and Calibrated MS-TCN
Authors: Snigdha Agarwal, Chakka Sai Pradeep, N. Sinha
Temporal gesture segmentation is an active research problem with applications such as surgical skill assessment, surgery training, and robotic training. In this paper, we propose a novel two-step method for gesture segmentation on untrimmed surgical videos of the challenging JIGSAWS dataset. We train and evaluate our method on 39 videos of the Suturing task, which comprises 10 gestures; gesture lengths range from 1 to 75 seconds, and full videos from 1 to 5 minutes. In step one, we extract encoded frame-wise spatio-temporal features at the full temporal resolution of the untrimmed videos. In step two, we use these features to identify gesture segments for temporal segmentation and classification. To extract high-quality features from the surgical videos, we also pre-train gesture classification models on the JIGSAWS dataset via transfer learning, using two state-of-the-art pretrained backbone architectures. For segmentation, we propose an improved calibrated MS-TCN (CMS-TCN) that introduces a smoothed focal loss, which regularizes the TCN and discourages over-confident decisions. We achieve a frame-wise accuracy of 89.8% and an Edit Distance score of 91.5%, an improvement of 2.2% over previous work. We also propose a novel evaluation metric that normalizes, in a single score, the effect of correctly classifying the frames of larger segments versus smaller segments.
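The abstract does not give the exact form of the smoothed focal loss; one plausible formulation (an assumption, not the paper's definition) combines label smoothing of the targets with the focal down-weighting of easy frames:

```python
import numpy as np

def smoothed_focal_loss(logits, labels, gamma=2.0, eps=0.1):
    """Focal loss on label-smoothed targets (one plausible formulation).

    Smoothed targets (1-eps)*onehot + eps/K penalize over-confident
    predictions, while the (1-p)^gamma factor down-weights frames the
    model already classifies easily.
    logits: (T, K) per-frame scores; labels: length-T int array.
    """
    z = logits - logits.max(axis=1, keepdims=True)        # stable softmax
    p = np.exp(z) / np.exp(z).sum(axis=1, keepdims=True)
    k = logits.shape[1]
    target = (1.0 - eps) * np.eye(k)[labels] + eps / k    # smoothed targets
    per_frame = -(target * (1.0 - p) ** gamma * np.log(p + 1e-12)).sum(axis=1)
    return float(per_frame.mean())
```

With this formulation, a confidently correct prediction still incurs a small penalty from the smoothed off-target mass, which is the calibration effect the paper attributes to the loss.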
Pub Date : 2022-07-11 · DOI: 10.1109/SPCOM55316.2022.9840794
Title: A Hierarchical Approach for Decoding Human Reach-and-Grasp Activities based on EEG Signals
Authors: Bhagyasree Kanuparthi, A. Turlapaty
Physically disabled patients, such as the paralyzed, amputees, and stroke patients, find it difficult to perform daily activities on their own. A brain-computer interface (BCI) using electroencephalography (EEG) signals is one option for the rehabilitation of these patients. BCI function can be enhanced by decoding limb movements for intuitive control of a prosthetic arm; however, decoding them with traditional classifiers is a challenging task. In this paper, a two-stage hierarchical framework is proposed for decoding reach-and-grasp actions. In stage 1, action signals are separated from rest segments using power spectral density features and a fine k-nearest neighbor (FKNN) classifier. In stage 2, the signals identified as actions are further classified into palmar and lateral reach-and-grasp types using mean absolute value features with the FKNN classifier. Compared with existing classifiers, the proposed method achieves a superior test accuracy of 85.38%.
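The two-stage pipeline can be sketched as feature extraction plus a cascaded decision. This is a minimal illustration, assuming single-channel epochs and stand-in classifier callables in place of the paper's fine-KNN models:

```python
import numpy as np

def psd_features(x, n_bands=8):
    """Stage-1 features: band-averaged power spectral density of an epoch."""
    p = np.abs(np.fft.rfft(x)) ** 2 / len(x)              # periodogram
    return np.array([b.mean() for b in np.array_split(p, n_bands)])

def mav_features(x, n_windows=4):
    """Stage-2 features: mean absolute value over sub-windows."""
    return np.array([np.abs(w).mean() for w in np.array_split(x, n_windows)])

def decode(epoch, rest_clf, grasp_clf):
    """Hierarchical decision: rest vs. action, then palmar vs. lateral.

    rest_clf and grasp_clf are placeholders for trained FKNN models;
    each maps a feature vector to a label string.
    """
    if rest_clf(psd_features(epoch)) == "rest":
        return "rest"
    return grasp_clf(mav_features(epoch))
```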
Pub Date : 2022-07-11 · DOI: 10.1109/SPCOM55316.2022.9840830
Title: AbS for ASR: A New Computational Perspective
Authors: V. R. Lakkavalli
In this paper, the classical analysis-by-synthesis (AbS) paradigm for automatic speech recognition (ASR) is revisited to enhance ASR performance. Although the AbS paradigm holds promise for explaining the process of perception, as proposed in Motor Theory, many challenges remain before a practical AbS-based ASR system can be realized. In this paper, i) a general architecture for ASR using AbS is presented, and ii) a new AbS-trellis is proposed, which realizes the AbS loop by combining a transition (coarticulation) cost with a classification cost in the search for the best decoding path. Initial results on the TIMIT database show that substitution errors may be reduced by employing AbS. This shows promise for using AbS in ASR, and the results further highlight the need for an invariant phonetic representation space, a better distance metric (or coarticulation modelling), and a better synthesizer.
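A best-path search over a trellis with per-frame classification costs and pairwise transition costs is a standard dynamic program; the sketch below shows that search (Viterbi-style, in min-cost form), not the paper's full AbS loop, whose synthesis step is omitted here:

```python
import numpy as np

def trellis_decode(class_cost, trans_cost):
    """Minimum-cost path through a trellis of unit hypotheses.

    class_cost: (T, K) per-frame classification cost for K units.
    trans_cost: (K, K) coarticulation/transition cost between units.
    Returns the length-T unit sequence minimizing the total cost.
    """
    T, K = class_cost.shape
    acc = class_cost[0].copy()                 # best cost ending in each unit
    back = np.zeros((T, K), dtype=int)         # backpointers
    for t in range(1, T):
        total = acc[:, None] + trans_cost + class_cost[t][None, :]
        back[t] = total.argmin(axis=0)
        acc = total.min(axis=0)
    path = [int(acc.argmin())]
    for t in range(T - 1, 0, -1):              # trace back
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```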
Pub Date : 2022-07-11 · DOI: 10.1109/SPCOM55316.2022.9840840
Title: Morse Wavelet Features for Pop Noise Detection
Authors: Priyanka Gupta, Piyushkumar K. Chodingala, H. Patil
The Spoofed Speech Detection (SSD) problem is an important one, especially for Automatic Speaker Verification (ASV) systems. However, the techniques used to design countermeasure systems for the SSD task are attack-specific, so the solutions are far from a generalized SSD system that can detect any type of spoofed speech. Voice Liveness Detection (VLD) systems, on the other hand, rely on a characteristic of live speech (i.e., pop noise) to detect whether an utterance is live. Given that the attacker has the freedom to mount any type of attack, VLD systems play a crucial role in defending against spoofing attacks, irrespective of the type of spoof used. To that end, we propose Generalized Morse Wavelet (GMW)-based features for VLD, with a Convolutional Neural Network (CNN) as the back-end classifier. In this context, we use pop noise as a discriminative acoustic cue to detect live speech. Pop noise is present in live speech signals at low frequencies (typically ≤ 40 Hz) and is caused by human breath reaching the closely placed microphone. We show that for γ = 3, the Morse wavelet has the highest concentration of information, indicated by the smallest area of its Heisenberg box; hence, we take γ = 3 in our Morse wavelet experiments. We compare the performance of our system with the Short-Time Fourier Transform (STFT)-Support Vector Machine (SVM) original baseline and other existing systems, such as Constant Q-Transform (CQT)-SVM, STFT-CNN, and bump wavelet-CNN. With an overall accuracy of 86.90% on the evaluation set, our proposed system significantly outperforms the STFT-SVM original baseline, CQT-SVM, STFT-CNN, and bump wavelet-CNN by absolute margins of 18.97%, 8.02%, 15.09%, and 12.21%, respectively. Finally, we also analyze the effect of various phoneme types on VLD system performance.
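The cue itself is easy to demonstrate: pop noise concentrates energy below roughly 40 Hz. The sketch below is a crude spectral proxy for that cue (a simple FFT energy ratio, an assumption for illustration; the paper uses a generalized Morse wavelet time-frequency representation fed to a CNN):

```python
import numpy as np

def low_band_energy_ratio(x, fs, f_cut=40.0):
    """Fraction of signal energy at or below f_cut Hz.

    Live speech recorded close to the microphone carries breath
    (pop-noise) energy below ~40 Hz that replayed or synthesized
    audio typically lacks, so a high ratio suggests live speech.
    """
    spec = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    return float(spec[freqs <= f_cut].sum() / spec.sum())
```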
Pub Date : 2022-07-11 · DOI: 10.1109/SPCOM55316.2022.9840761
Title: Integrated Hierarchical and Flat Classifiers for Food Image Classification using Epistemic Uncertainty
Authors: Vishwesh Pillai, Pranav Mehar, M. Das, Deep Gupta, P. Radeva
Food image recognition is an essential problem in today's context because health conditions such as diabetes, obesity, and heart disease require constant monitoring of a person's diet. To automate this process, several models are available to recognize food images. Due to the considerable number of unique food dishes and varied cuisines, a traditional flat classifier ceases to perform well. To address this issue, prediction schemes combining flat and hierarchical classifiers use an analysis of epistemic uncertainty to switch between the two. However, the accuracy of predictions made using epistemic uncertainty remains considerably low. This paper therefore presents a prediction scheme with three different threshold criteria that increase the accuracy of epistemic-uncertainty-based predictions. The performance of the proposed method is demonstrated in several experiments on the MAFood-121 dataset. The experimental results validate the proposed scheme and show that the threshold criteria increase the overall prediction accuracy by correctly classifying the uncertainty distribution of the samples.
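The switching mechanism can be illustrated with a single entropy threshold (the paper evaluates three threshold criteria; this sketch uses one, with predictive entropy over stochastic forward passes as an assumed uncertainty measure):

```python
import numpy as np

def predictive_entropy(prob_samples):
    """Entropy of the mean predictive distribution over MC samples.

    prob_samples: (S, K) class probabilities from S stochastic
    forward passes (e.g., MC dropout); higher entropy indicates
    higher epistemic uncertainty.
    """
    p = prob_samples.mean(axis=0)
    return float(-(p * np.log(p + 1e-12)).sum())

def combined_predict(flat_prob_samples, hier_predict, threshold):
    """Trust the flat classifier when it is certain; otherwise fall
    back to the hierarchical classifier (a placeholder callable)."""
    if predictive_entropy(flat_prob_samples) <= threshold:
        return int(flat_prob_samples.mean(axis=0).argmax())
    return hier_predict()
```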
Pub Date : 2022-07-11 · DOI: 10.1109/SPCOM55316.2022.9840777
Title: C-Band Iris Coupled Cavity Bandpass Filter
Authors: Shashank Soi, Sudheer Kumar Singh, Rajendra Singh, Ashok Kumar
This paper presents the design of a compact, tunable, high-rejection 6th-order C-band iris-coupled cavity bandpass filter. The design approach uses Chebyshev low-pass filter prototype elements to calculate the normalized capacitance per unit length between each resonator and ground, and between adjacent resonators. With the help of coupling and tuning screws, the bandwidth and center frequency of the filter can be tuned for the desired performance; the coaxial capacitance formula is used to compute the diameter of the screws. The CST tool is used to simulate and optimize the theoretically calculated physical dimensions, further improving the filter performance and tolerance sensitivity. The cavity design and resonator calculations have been carried out such that the same hardware can be tuned to both frequency bands, i.e., 4.4-4.6 GHz (Band I) and 4.8-5.0 GHz (Band II), to meet the desired specifications. Finally, a 6th-order prototype is fabricated and tuned to the desired performance, and experimental validation is presented.
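The coaxial capacitance formula mentioned above is C' = 2πε / ln(D/d) farads per meter for a coaxial line with outer diameter D and inner diameter d; inverting it for d gives the screw diameter. A small numeric sketch (the cavity dimensions in the test are made-up values, not the paper's design):

```python
import math

EPS0 = 8.854e-12  # vacuum permittivity, F/m

def coax_cap_per_len(D, d, eps_r=1.0):
    """Capacitance per unit length of a coaxial line: 2*pi*eps / ln(D/d)."""
    return 2 * math.pi * EPS0 * eps_r / math.log(D / d)

def screw_diameter(D, c_per_len, eps_r=1.0):
    """Invert the coaxial formula for the inner (screw) diameter d."""
    return D / math.exp(2 * math.pi * EPS0 * eps_r / c_per_len)
```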
Pub Date : 2022-07-11 · DOI: 10.1109/SPCOM55316.2022.9840791
Title: Binary Intelligent Reflecting Surfaces Assisted OFDM Systems
Authors: L. Yashvanth, C. Murthy, B. Deepak
Intelligent reflecting surfaces (IRSs) enhance the performance of wireless systems by reflecting incoming signals towards a desired user, especially in the mmWave bands. However, this requires optimizing the discrete reflection coefficients of the IRS elements, which crucially depends on the availability of accurate channel state information (CSI) for all links in the system. Further, in wideband systems employing orthogonal frequency division multiplexing (OFDM), a given IRS configuration cannot be simultaneously optimal for all subcarriers, so the phase optimization is not straightforward. In this paper, we propose a novel IRS phase configuration scheme for OFDM systems: we first leverage the sparsity of the channel in the angular domain to estimate the CSI using the simultaneous orthogonal matching pursuit (SOMP) algorithm, and then devise a novel, computationally efficient binary IRS phase configuration algorithm using majorization-minimization (MM). Simulation results illustrate the efficacy of the approach in comparison with the state of the art.
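The optimization problem — choose binary reflection coefficients that work well across all subcarriers at once — can be illustrated with a greedy coordinate-ascent baseline. This is a stand-in for the paper's MM algorithm, not a reproduction of it, and the objective (sum of per-subcarrier channel gains) is an assumed simplification:

```python
import numpy as np

def total_gain(H, theta):
    """Sum over subcarriers of |h_k^T theta|^2, the wideband channel gain.

    H[k, n] is the cascaded channel of IRS element n on subcarrier k.
    """
    return float((np.abs(H @ theta) ** 2).sum())

def binary_irs_phases(H, n_sweeps=3):
    """Greedy coordinate ascent over binary (+1/-1) IRS coefficients.

    Flips one element at a time, keeping the sign that maximizes the
    wideband gain; a simple baseline in place of the MM-based method.
    """
    _, N = H.shape
    theta = np.ones(N)
    for _ in range(n_sweeps):
        for n in range(N):
            gains = []
            for cand in (1.0, -1.0):
                theta[n] = cand
                gains.append(total_gain(H, theta))
            theta[n] = 1.0 if gains[0] >= gains[1] else -1.0
    return theta
```

Because each flip is only accepted when it does not decrease the objective, the result is never worse than the all-ones configuration it starts from.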
Pub Date : 2022-07-11 · DOI: 10.1109/SPCOM55316.2022.9840760
Title: A unified neural MRA architecture combining wavelet CNN and wavelet pooling for texture classification
Authors: K. K. Tarafdar, Q. Saifee, V. Gadre
This paper introduces a novel unified neural multi-resolution analysis (MRA) architecture that combines a Discrete Wavelet Transform (DWT)-integrated Convolutional Neural Network (CNN) with DWT pooling. Since convolution followed by pooling in a CNN is equivalent to filtering and downsampling in a DWT filter bank, the two are unified into an end-to-end deep learning wavelet CNN model. The DWT pooling mechanism further enhances the MRA capability of this wavelet CNN. Using the first two wavelets of the Daubechies family, we present a comprehensive set of improved texture classification results with several updates to the model architecture. These updates apply generally to any node in the CNN architecture associated with time-frequency analysis of the input signal.
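The filtering-plus-downsampling view of pooling is easy to make concrete with the simplest Daubechies wavelet (Haar, db1). A minimal sketch of DWT pooling on a single 2-D feature map, keeping only the approximation (low-low) band (a simplification; a full wavelet CNN also routes the detail bands):

```python
import numpy as np

def haar_dwt_pool(x):
    """One level of Haar DWT pooling: low-pass filter and decimate
    each spatial axis, keeping the low-low band -> downsample by 2.

    Unlike max pooling, this is the analysis step of an MRA filter
    bank, so the reduction is anti-aliased by construction.
    x: (H, W) feature map with even H and W.
    """
    a = (x[0::2, :] + x[1::2, :]) / np.sqrt(2)   # rows: low-pass + decimate
    ll = (a[:, 0::2] + a[:, 1::2]) / np.sqrt(2)  # cols: low-pass + decimate
    return ll
```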