Pub Date: 2021-07-27 DOI: 10.1109/NCC52529.2021.9530043
Sparsh Garg, Utkarsh Mehrotra, G. Krishna, A. Vuppala
The detection and removal of disfluencies from speech is an important task, since disfluencies can adversely affect the performance of speech-based applications such as Automatic Speech Recognition (ASR) systems and speech-to-speech translation systems. For Indian languages, there is a lack of studies on speech disfluencies, their types and their frequency of occurrence, and the resources available for such studies in an Indian context are limited. In this paper, we attempt to address this issue by introducing the IIITH-Indian English Disfluency (IIITH-IED) Dataset. The dataset consists of 10 hours of lecture-mode speech in Indian English. Five types of disfluencies (filled pause, prolongation, word repetition, part-word repetition and phrase repetition) were identified in the speech signal and annotated in the corresponding transcription to prepare the dataset. The IIITH-IED dataset was then used to develop frame-level automatic disfluency detection systems. Two sets of features were extracted from the speech signal and used to train classifiers for disfluency detection. Among all the systems evaluated, Random Forest with MFCC features gave the highest average accuracy of 89.61% and an F1-score of 0.89.
Title: Towards a Database For Detection of Multiple Speech Disfluencies in Indian English. Venue: 2021 National Conference on Communications (NCC).
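The abstract above reports frame-level accuracy and F1-score. As a rough illustration of what frame-level evaluation involves, the sketch below converts annotated disfluency intervals into per-frame labels and scores a prediction against them. The 10 ms hop, the binary (disfluent vs. fluent) labelling and all function names are assumptions made for illustration; the MFCC extraction and the Random Forest classifier themselves are not shown, and this is not the authors' evaluation code.

```python
def intervals_to_frame_labels(intervals, duration_s, hop_s=0.010):
    """Label each frame 1 (disfluent) or 0 (fluent).

    `intervals` is a list of (start_s, end_s) spans; a frame is labelled 1
    when its start time falls inside [start_s, end_s) of any span.
    The 10 ms hop is an assumed value, not taken from the paper.
    """
    num_frames = int(round(duration_s / hop_s))
    labels = [0] * num_frames
    for start_s, end_s in intervals:
        first = max(0, int(round(start_s / hop_s)))
        last = min(num_frames, int(round(end_s / hop_s)))
        for index in range(first, last):
            labels[index] = 1
    return labels


def accuracy_and_f1(reference, predicted):
    """Frame-level accuracy and F1-score for the positive (disfluent) class."""
    pairs = list(zip(reference, predicted))
    true_pos = sum(1 for r, p in pairs if r == 1 and p == 1)
    false_pos = sum(1 for r, p in pairs if r == 0 and p == 1)
    false_neg = sum(1 for r, p in pairs if r == 1 and p == 0)
    accuracy = sum(1 for r, p in pairs if r == p) / len(pairs)
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    if precision + recall == 0.0:
        return accuracy, 0.0
    return accuracy, 2.0 * precision * recall / (precision + recall)


# Hypothetical one-second utterance: annotated span vs. a shifted prediction.
reference_labels = intervals_to_frame_labels([(0.2, 0.5)], duration_s=1.0)
predicted_labels = intervals_to_frame_labels([(0.3, 0.6)], duration_s=1.0)
accuracy, f1 = accuracy_and_f1(reference_labels, predicted_labels)
```

Scoring per frame rather than per word means that a prediction which overlaps the annotated span only partially is penalised in proportion to the mismatch.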
Pub Date: 2021-07-27 DOI: 10.1109/NCC52529.2021.9530157
Aritra Paul, K. P. Kumar
The paper describes optical frequency comb generation using cascaded stimulated Brillouin scattering (SBS) in optical fibers. The cascaded SBS process is induced by the SBS-pump recycling technique in a single mode fiber. The single mode fiber is placed inside a recirculating cavity, with a loop mirror at the terminal end of the fiber. The pumps are obtained from a four-wave mixing process in a semiconductor optical amplifier. We have achieved a total of 8 comb lines, 5 of which lie within a 6 dB power variation. The comb lines are separated by approximately 11 GHz (~0.085 nm).
Title: Generation of optical frequency comb using cascaded Brillouin scattering at low power utilizing pump recycling technique in a single mode fiber. Venue: 2021 National Conference on Communications (NCC).
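The abstract above quotes the comb spacing both in frequency (about 11 GHz) and in wavelength (about 0.085 nm). The two are related by Δλ ≈ λ²Δν/c. The short sketch below evaluates that relation; the 1550 nm centre wavelength is an assumption (the abstract does not state the operating wavelength), and the function name is illustrative.

```python
SPEED_OF_LIGHT_M_PER_S = 299_792_458.0


def frequency_spacing_to_wavelength_nm(delta_f_hz, centre_wavelength_nm):
    """Convert a small frequency spacing to a wavelength spacing.

    Uses delta_lambda ~= lambda^2 * delta_f / c, which holds when the spacing
    is much smaller than the optical carrier frequency.
    """
    wavelength_m = centre_wavelength_nm * 1e-9
    delta_lambda_m = wavelength_m ** 2 * delta_f_hz / SPEED_OF_LIGHT_M_PER_S
    return delta_lambda_m * 1e9


# Assumed centre wavelength of 1550 nm (not given in the abstract).
spacing_nm = frequency_spacing_to_wavelength_nm(11e9, 1550.0)
```

Under that assumption the result is a little under 0.09 nm, the same order as the quoted figure; the exact value depends on the actual pump wavelength.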
Pub Date: 2021-07-27 DOI: 10.1109/NCC52529.2021.9530051
Haseen Rahman
Maximizing the data throughput of point-to-point transmitting nodes that harvest exogenous energy is a widely studied problem in the literature. In this work, we consider an additive white Gaussian noise channel in the presence of a jamming adversary. The legitimate transmitter is an energy harvesting (EH) node that attempts to maximize the amount of data conveyed before a specified deadline. The jamming node, on the other hand, tries to minimize the transmitter's data throughput by introducing targeted noise. We assume that the jammer has a fixed amount of energy for interfering. When both nodes know the EH process in advance, known as the offline setting, we compute the actions of each node at the minmax equilibrium. In the online setting, where the energy arrivals are known causally, we first consider the case without jamming and show that a simple conservative algorithm achieves at least a quarter of the optimal offline throughput. We then show that the algorithm has the same competitiveness in the presence of an offline jammer.
Title: Maximizing the Throughput of an Energy Harvesting Transmitter in the Presence of a Jammer with Fixed Energy. Venue: 2021 National Conference on Communications (NCC).
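To make the objective in the abstract above concrete, the sketch below evaluates the throughput of a slotted AWGN link when jamming is treated as additional Gaussian noise. Unit-length slots, unit noise power, the rate expression per slot and the function name are all assumptions for illustration; the sketch does not reproduce the paper's equilibrium or its online algorithm.

```python
import math


def throughput_bits(tx_powers, jam_powers, noise_power=1.0):
    """Sum of per-slot AWGN rates, 0.5 * log2(1 + P / (N + J)).

    Each slot has unit length; jamming power J is modelled as extra noise.
    """
    if len(tx_powers) != len(jam_powers):
        raise ValueError("one jamming power is needed per transmit slot")
    return sum(
        0.5 * math.log2(1.0 + p / (noise_power + j))
        for p, j in zip(tx_powers, jam_powers)
    )


# Two slots, transmitter spends one unit of energy per slot.
# The jammer has two units of energy in total and spends them two ways.
spread_jamming = throughput_bits([1.0, 1.0], [1.0, 1.0])
concentrated_jamming = throughput_bits([1.0, 1.0], [2.0, 0.0])
```

With equal transmit power in every slot, spreading the jamming energy evenly reduces throughput more than concentrating it, because the per-slot rate is convex in the jamming power. This is why the jammer's allocation has to be considered jointly with the transmitter's.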
Pub Date: 2021-07-27 DOI: 10.1109/NCC52529.2021.9530126
Hareesh Devarakonda, Snehasis Mukherjee
Early action prediction in video is a challenging task in which the action of a human performer must be predicted from only the first few frames. We propose a novel technique for action prediction based on deep reinforcement learning, employing a Deep Q-Network (DQN) with ResNeXt as the base CNN architecture. The proposed DQN predicts the actions in videos from features extracted from the first few frames, and the base CNN model is adjusted by tuning its hyperparameters. The ResNeXt model is adjusted based on the reward provided by the DQN, and the hyperparameters are updated to predict actions. The agent's stopping criterion is a value higher than or equal to the validation accuracy. The DQN is rewarded based on the sequential input frames and the transition of action states (i.e., prediction of the action class for each incremental 10 percent of the video). For each action state, the visual features extracted from the first 10 percent of the video are forwarded to the next 10 percent. The proposed method is tested on the UCF101 dataset and outperforms the state of the art in action prediction.
Title: Early Prediction of Human Action by Deep Reinforcement Learning. Venue: 2021 National Conference on Communications (NCC).
Pub Date: 2021-07-27 DOI: 10.1109/NCC52529.2021.9530170
Nageswara Rao Dusari, M. Rawat
Beamforming is a key technique used in 5G communication systems for transmitting and receiving signals in a particular direction only. An accurate phase must be applied to each element of the beamforming antenna array to steer the beam in the desired direction. Generally, multiple software-defined radios (SDRs) are used for flexible beamforming. However, these SDRs exhibit phase differences between their transmit paths due to nonlinearities in their components and the use of individual clocks and local oscillators (LOs). Therefore, this paper presents a methodology to calibrate the phase differences between the transmit paths of the SDRs before signals are applied to the antenna elements for beamforming. The phase offset is estimated using the cross-covariance method, and a method is presented to synchronize multiple SDRs accurately. As a proof of concept, the SDR setup is built with the AD9371 analog transceiver from Analog Devices and the ZC706 FPGA board from Xilinx. With phase compensation after synchronization, the measurements achieve an NMSE of around −35 dB between the signals of different transmitter paths. A 1×4 antenna array operating at 2.4 GHz has been designed in simulation, and the main beam is obtained in the desired direction after phase compensation.
Title: Phase Calibration of Multiple Software Defined Radio Transmitters for Beamforming in 5G Communication. Venue: 2021 National Conference on Communications (NCC).
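The abstract above estimates the phase offset between transmit paths with a cross-covariance method and reports the residual as NMSE. One common way to do this, shown here as a minimal sketch, is to take the angle of the zero-lag complex cross-covariance of the two captured baseband signals. The synthetic tones, the 0.7 rad offset and the function names are assumptions for illustration; the paper's exact estimator and measurement chain may differ.

```python
import cmath
import math


def estimate_phase_offset(reference, measured):
    """Phase (radians) of `measured` relative to `reference`,
    taken from their zero-lag complex cross-covariance."""
    count = len(reference)
    mean_ref = sum(reference) / count
    mean_meas = sum(measured) / count
    cross = sum(
        (m - mean_meas) * (r - mean_ref).conjugate()
        for r, m in zip(reference, measured)
    )
    return cmath.phase(cross)


def nmse_db(reference, measured):
    """Normalised mean squared error between two signals, in dB."""
    error = sum(abs(m - r) ** 2 for r, m in zip(reference, measured))
    power = sum(abs(r) ** 2 for r in reference)
    if error == 0.0:
        return float("-inf")
    return 10.0 * math.log10(error / power)


# Synthetic data: path 1 carries a complex tone; path 2 carries the same tone
# rotated by 0.7 rad plus a weak unrelated tone standing in for distortion.
path1 = [cmath.exp(2j * math.pi * k / 16) for k in range(64)]
path2 = [
    cmath.exp(0.7j) * s + 0.01 * cmath.exp(2j * math.pi * 3 * k / 16)
    for k, s in enumerate(path1)
]

offset = estimate_phase_offset(path1, path2)
compensated = [s * cmath.exp(-1j * offset) for s in path2]
nmse_before = nmse_db(path1, path2)
nmse_after = nmse_db(path1, compensated)
```

Multiplying the second path by the conjugate of the estimated rotation aligns it with the first, so the NMSE after compensation reflects only the residual distortion rather than the phase offset.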
Pub Date: 2021-07-27 DOI: 10.1109/NCC52529.2021.9530144
Narendra Vishwakarma, S. R.
Satellite communication (SATCOM) systems are generally used for broadcasting, disaster recovery, and navigation applications because of their large coverage area. Deploying more SATCOM systems requires a high data rate and a large communication capacity. Free space optics (FSO) technology can meet the need for gigabit capacity. Nevertheless, the FSO link is vulnerable to atmospheric turbulence, pointing errors, and weather conditions such as fog and snow. Consequently, the more reliable radio frequency (RF) link can be used in combination with the FSO link to counteract these limitations, which makes a hybrid FSO/RF system a promising solution for next-generation SATCOM systems. In this context, we investigate an adaptive-combining-based switching scheme for a hybrid FSO/RF system, considering both uplink and downlink SATCOM scenarios. Adaptive combining switches from the FSO link alone to maximal ratio combining (MRC) of the FSO and RF links when the quality of the operating FSO link becomes unacceptable for transmission. The performance of the adaptive-combining-based hybrid FSO/RF system is examined through exact and asymptotic ergodic capacity analyses.
Title: Capacity Analysis of Adaptive Combining for Hybrid FSO/RF Satellite Communication System. Venue: 2021 National Conference on Communications (NCC).
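The switching rule described in the abstract above can be sketched in a few lines. The code below uses the FSO branch alone while its SNR is at or above a threshold and otherwise combines both branches with MRC, whose output SNR is taken as the sum of the branch SNRs. The threshold, the use of log2(1 + SNR) as the capacity expression and the function names are simplifying assumptions; the paper's channel models and closed-form capacity expressions are not reproduced here.

```python
import math


def adaptive_combining_snr(snr_fso, snr_rf, fso_threshold):
    """Output SNR (linear scale) of the adaptive-combining receiver.

    FSO alone is used while its SNR is acceptable; otherwise the FSO and RF
    branches are combined with MRC, modelled as the sum of branch SNRs.
    """
    if snr_fso >= fso_threshold:
        return snr_fso
    return snr_fso + snr_rf


def capacity_bits_per_hz(snr):
    """Shannon capacity per unit bandwidth for a given linear SNR."""
    return math.log2(1.0 + snr)


def average_capacity(snr_samples, fso_threshold):
    """Empirical mean capacity over (snr_fso, snr_rf) sample pairs."""
    capacities = [
        capacity_bits_per_hz(adaptive_combining_snr(fso, rf, fso_threshold))
        for fso, rf in snr_samples
    ]
    return sum(capacities) / len(capacities)


# Hypothetical channel samples: one clear-sky slot and one turbulent slot.
samples = [(15.0, 5.0), (2.0, 5.0)]
mean_capacity = average_capacity(samples, fso_threshold=4.0)
```

Averaging the capacity over many channel realisations is the empirical counterpart of the ergodic capacity that the abstract analyses in closed form.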
Pub Date: 2021-07-27 DOI: 10.1109/NCC52529.2021.9530084
Madhu Oruganti, T. Meenpal, Saikat Majumder
The main aim of kinship verification is to estimate kinship from the facial appearance in two images. Age-progression-based kinship verification is one of the less explored parts of this research area. Facial features of parents and their children share many similarities during childhood. As age progresses, the child's facial features change and diverge from the parent's, which makes estimating kinship a challenging task. Therefore, a new database containing images of parents in their childhood and images of their children is collected. This paper proposes and trains a metric so that the model can predict whether a given pair of images is kin or non-kin. In the training module, differences of Histogram of Gradient (HoG) features are computed for all combinations of pairs, and the absolute differences of each pair are calculated. Selective minimum variances are then used to assess the kin similarity features, and a global threshold is computed to classify kin and non-kin pairs. Testing is done in a similar way: the global threshold computed in the training module is used for kinship verification in the testing module. Experimental results are presented, and the method achieves an accuracy of 82%.
Title: Selective variance based kinship verification in parent's childhood and their children. Venue: 2021 National Conference on Communications (NCC).
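The training steps named in the abstract above (absolute feature differences, selection by minimum variance, a global threshold) can be read in more than one way. The sketch below shows one possible reading on toy feature vectors: keep the feature dimensions whose absolute difference varies least across kin pairs, score a pair by its mean difference on those dimensions, and place the threshold midway between the mean kin and non-kin scores. HoG extraction is not shown, and the selection rule, the midpoint threshold, the toy numbers and all names are assumptions rather than the authors' procedure.

```python
from statistics import mean, pvariance


def abs_difference(features_a, features_b):
    """Element-wise absolute difference of two feature vectors."""
    return [abs(a - b) for a, b in zip(features_a, features_b)]


def select_low_variance_dims(kin_differences, keep):
    """Indices of the `keep` dimensions with the smallest variance
    of absolute difference across the kin training pairs."""
    num_dims = len(kin_differences[0])
    variances = [
        pvariance([difference[i] for difference in kin_differences])
        for i in range(num_dims)
    ]
    return sorted(range(num_dims), key=lambda i: variances[i])[:keep]


def pair_score(features_a, features_b, dims):
    """Mean absolute difference over the selected dimensions."""
    difference = abs_difference(features_a, features_b)
    return mean(difference[i] for i in dims)


# Toy stand-ins for HoG feature vectors (hypothetical values).
kin_pairs = [([0.1, 0.5, 0.9], [0.2, 0.1, 0.8]), ([0.4, 0.2, 0.3], [0.5, 0.9, 0.4])]
non_kin_pairs = [([0.1, 0.5, 0.9], [0.9, 0.4, 0.1]), ([0.4, 0.2, 0.3], [0.9, 0.3, 0.9])]

dims = select_low_variance_dims([abs_difference(a, b) for a, b in kin_pairs], keep=2)
kin_scores = [pair_score(a, b, dims) for a, b in kin_pairs]
non_kin_scores = [pair_score(a, b, dims) for a, b in non_kin_pairs]
# Assumed rule: threshold halfway between the two class means.
threshold = (mean(kin_scores) + mean(non_kin_scores)) / 2.0


def is_kin(features_a, features_b):
    """Classify a pair as kin when its score does not exceed the threshold."""
    return pair_score(features_a, features_b, dims) <= threshold
```

At test time the same selected dimensions and the same threshold are reused, which matches the abstract's statement that the threshold computed in training is applied in the testing module.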
Pub Date: 2021-07-27 DOI: 10.1109/NCC52529.2021.9530124
O. Pandey, Naga Srinivasarao Chilamkurthy, R. Hegde
In recent years, small world characteristics (SWC) have received considerable attention because of their advantages in social, electrical, computer, and wireless networks. A wireless sensor network (WSN) exhibiting SWC is known as a small world WSN (SW-WSN); an SW-WSN therefore has a small average path length and a large average clustering coefficient. In this paper, a novel optimal link scheduling method is proposed to develop an SW-WSN. The proposed method determines the optimal number of new links that need to be created in the network and also finds the optimal node pairs between which to create them. The developed algorithm uses the node betweenness centrality measure to introduce SWC. Developing an SW-WSN with the proposed method has reduced time complexity and yields optimal SWC compared with other existing methods. A reduced data transmission delay is also observed over the resulting SW-WSN. Random, near-optimal, and sub-optimal methods of introducing SWC and their time complexities are investigated and compared with the proposed method. The results are computed over simulated and real WSN testbeds. They demonstrate the significance of the proposed method and its usefulness for large-scale network applications.
Title: Optimal Link Scheduling for Low Latency Data Transfer over Small World WSNs. Venue: 2021 National Conference on Communications (NCC).
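The two quantities that define small world characteristics in the abstract above, average path length and average clustering coefficient, can be computed directly on a small graph. The sketch below builds a ring lattice as a stand-in for a regular sensor network, then adds a single long-range link to show its effect on path length. The ring topology, the choice of shortcut and the function names are assumptions for illustration; the paper selects links using node betweenness centrality, which is not implemented here.

```python
from collections import deque


def ring_lattice(num_nodes, neighbours_each_side):
    """Each node is linked to its nearest neighbours on both sides of a ring."""
    adjacency = {node: set() for node in range(num_nodes)}
    for node in range(num_nodes):
        for offset in range(1, neighbours_each_side + 1):
            other = (node + offset) % num_nodes
            adjacency[node].add(other)
            adjacency[other].add(node)
    return adjacency


def average_path_length(adjacency):
    """Mean hop count over all reachable ordered node pairs (BFS per node)."""
    total_hops, pair_count = 0, 0
    for source in adjacency:
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        total_hops += sum(distance.values())
        pair_count += len(distance) - 1
    return total_hops / pair_count


def average_clustering(adjacency):
    """Mean fraction of a node's neighbour pairs that are themselves linked."""
    coefficients = []
    for neighbours in adjacency.values():
        degree = len(neighbours)
        if degree < 2:
            coefficients.append(0.0)
            continue
        links = sum(
            1 for a in neighbours for b in neighbours
            if a < b and b in adjacency[a]
        )
        coefficients.append(2.0 * links / (degree * (degree - 1)))
    return sum(coefficients) / len(coefficients)


network = ring_lattice(20, 2)
clustering_before = average_clustering(network)
path_length_before = average_path_length(network)

# Add one long-range link between opposite sides of the ring (arbitrary choice).
network[0].add(10)
network[10].add(0)
path_length_after = average_path_length(network)
```

A single shortcut shortens the average path while leaving most local neighbourhoods, and hence most of the clustering, intact. Choosing how many such links to add and where is the scheduling problem the abstract addresses.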
Pub Date: 2021-07-27 DOI: 10.1109/NCC52529.2021.9530081
S. Cecilia, S. Murugan
Underwater images are of degraded quality due to scattering and absorption. The color cast and turbidity that hinder the visibility of such images are caused by sediments, which vary across environments. Shallow water images are very turbid. The images also suffer from the negative effects of artificial illumination during capture. Here, a two-step approach is formulated to restore and enhance underwater images from different locations. The images are then blended using wavelet fusion that considers the mean of the images. The output images show reduced haze, improved contrast and enhanced sharpness, with adequate removal of the color cast. The results show better visibility on both subjective and objective measures compared with recent restoration and enhancement methods.
Title: Visibility Restoration of Diverse Turbid Underwater Images- Two Step Approach. Venue: 2021 National Conference on Communications (NCC).
Pub Date: 2021-07-27 DOI: 10.1109/NCC52529.2021.9530161
D. Mahanta, D. Hazarika, V. K. Nath
A biomedical image retrieval technique using a novel multi-scale pattern-based feature is proposed. At each scale, the technique employs arbitrary-shaped sampling structures in addition to a classical circular sampling structure in local bit-planes for effective texture description; it is named the multi-scale local bit-plane arbitrary-shaped pattern (MS-LBASP). The proposed feature descriptor first downsamples the input image into three different scales. The bit-planes of each downsampled image are then extracted and locally encoded, characterizing the local spatial arbitrary-shaped and circular structures of texture. Quantization and mean-based fusion are used to reduce the features. Finally, the relationship between the center pixel and the fused local bit-plane transformed values is encoded using both sign and magnitude information for better feature description. Experiments were conducted to test the performance of MS-LBASP on two benchmark computed tomography (CT) image datasets and one magnetic resonance imaging (MRI) image dataset. Results demonstrate that MS-LBASP outperforms the existing relevant state-of-the-art image descriptors.
Title: Biomedical Image Retrieval using Muti-Scale Local Bit-plane Arbitrary Shaped Patterns. Venue: 2021 National Conference on Communications (NCC).
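Two building blocks mentioned in the abstract above, bit-plane extraction and local encoding of a neighbourhood, are easy to show on a tiny image. The sketch below downsamples by plain decimation, extracts one bit-plane of an 8-bit image and forms an 8-neighbour code around a pixel of that plane. It is not the MS-LBASP descriptor: the arbitrary-shaped sampling structures, the fusion step and the sign/magnitude encoding are omitted, and the neighbour ordering, the decimation and the function names are assumptions.

```python
def downsample(image, factor):
    """Keep every `factor`-th pixel in both directions (no smoothing)."""
    return [row[::factor] for row in image[::factor]]


def bit_plane(image, bit):
    """Binary image holding bit `bit` (0 = least significant) of each pixel."""
    return [[(pixel >> bit) & 1 for pixel in row] for row in image]


def circular_pattern_code(plane, row, col):
    """Pack the 8 neighbours of (row, col) in a bit-plane into one byte.

    Neighbours are read clockwise starting from the top-left corner;
    the first neighbour becomes the least significant bit.
    """
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for position, (d_row, d_col) in enumerate(offsets):
        code |= plane[row + d_row][col + d_col] << position
    return code


# Tiny 8-bit test image with a bright cross on a dark background.
image = [
    [0, 255, 0],
    [255, 128, 255],
    [0, 255, 0],
]
top_plane = bit_plane(image, 7)
centre_code = circular_pattern_code(top_plane, 1, 1)
```

Repeating the encoding for every bit-plane and every scale gives a set of local codes whose histograms could serve as texture features for retrieval.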