Attending to Transforms: A Survey on Transformer-based Image Captioning
Pub Date: 2023-04-05 DOI: 10.1109/PCEMS58491.2023.10136098
Published in: 2023 2nd International Conference on Paradigm Shifts in Communications Embedded Systems, Machine Learning and Signal Processing (PCEMS)
Kshitij Ambilduke, Thanmay Jayakumar, Luqman Farooqui, Himanshu Padole, Anamika Singh
Image captioning is a challenging task that lies at the intersection of Computer Vision and Natural Language Processing. A large body of work generates meaningful and realistic descriptions of images. Recently, with the advent of attention mechanisms and transformers, there has been a drastic shift in how both language and vision tasks are modelled. However, very few extensive studies review these approaches in terms of their progression, advantages and disadvantages. This paper presents a detailed summary of transformer-based models employed for image captioning. In addition, we provide an overview of the pre-training tasks, datasets and metrics used for image captioning. Finally, the performance of all the reviewed approaches is compared on the COCO Captions dataset.
Ensemble Based ERDNet model for Leaf Disease Detection in Rice and Maize Crops
Pub Date: 2023-04-05 DOI: 10.1109/PCEMS58491.2023.10136119
Manish Kumar, Vipin Kumar, Aditya Kumar
According to Statista, total global rice consumption was approximately 497 million metric tons in 2019-2020, of which 149 million metric tons were consumed by China alone. Maize is cultivated in over 170 countries worldwide, with production of nearly 1147.7 metric tons as per the FAOSTAT 2020 report. Crop leaf disease detection is a critical issue faced by farmers. This research addresses multiclass leaf disease classification for rice and maize, using rice and maize leaf disease images taken from the PlantVillage dataset. Rice and maize are the most popular crops in the subcontinent; they are produced in bulk and suffer from several diseases caused by both natural and chemical factors. To handle this problem, we propose an ensemble-based framework comprising DenseNet121 and ResNet50, called EDRNet. The proposed method achieves an accuracy of 96.7% on rice and 90.9% on maize. A comparative analysis shows that EDRNet outperforms other methods on rice and maize disease classification.
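The abstract above does not say how the outputs of DenseNet121 and ResNet50 are combined. One common option is soft voting, where the two networks' class probabilities are averaged before taking the arg-max. The sketch below illustrates that idea only; the logits, the equal weighting, and the function names are hypothetical and are not taken from the paper.

```python
import math

def softmax(logits):
    """Convert raw scores to probabilities (numerically stable form)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def soft_vote(prob_a, prob_b, weight_a=0.5):
    """Weighted average of two models' class-probability vectors."""
    if len(prob_a) != len(prob_b):
        raise ValueError("probability vectors must have the same length")
    return [weight_a * pa + (1.0 - weight_a) * pb
            for pa, pb in zip(prob_a, prob_b)]

def predict(prob_a, prob_b, weight_a=0.5):
    """Index of the class with the highest fused probability."""
    fused = soft_vote(prob_a, prob_b, weight_a)
    return max(range(len(fused)), key=fused.__getitem__)

# Hypothetical logits for one leaf image over three disease classes.
densenet_probs = softmax([2.0, 1.0, 0.1])  # stand-in for a DenseNet121 output
resnet_probs = softmax([0.5, 2.5, 0.2])    # stand-in for a ResNet50 output
print(predict(densenet_probs, resnet_probs))
```

Averaging probabilities lets a confident model outvote an uncertain one, which is one reason ensembles of this kind are often preferred to majority voting when only two models are available.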
Optimizing MRC Tasks: Understanding and Resolving Ambiguities
Pub Date: 2023-04-05 DOI: 10.1109/PCEMS58491.2023.10136031
Flewin Dsouza, Aditi Bodade, Hrugved Kolhe, Paresh Chaudhari, M. Madankar
The attention model allows a system to pay flexible attention to only those components of the input that contribute to the effective execution of the task at hand. Machine Reading Comprehension (MRC) is an artificial intelligence challenge that asks machines to answer questions based on passages they have been provided with. The primary purpose of this research is to answer questions taken from the Stanford Question Answering Dataset (SQuAD), which includes paragraphs along with questions and their corresponding answers. This study focuses on the implementation of various approaches that take advantage of the attention mechanism. It also provides a thorough examination of emerging methods for producing word embeddings, feature extraction, attention mechanisms, and answer selection. The flaws of the model and concerns about its fairness and trustworthiness are also noted.
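To make the idea of "paying attention only to the relevant components of the input" concrete, here is a minimal sketch of scaled dot-product attention for a single query. The abstract does not state which attention variant the study uses, so this particular formulation, the toy vectors, and the function names are assumptions chosen for illustration.

```python
import math

def softmax(scores):
    """Normalize scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Returns the attention weights over the keys and the
    weighted sum of the value vectors (the context vector).
    """
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    context = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(dim)]
    return weights, context

# Toy example: a question vector attending over three passage-token vectors.
question = [1.0, 0.0]
passage_keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
passage_values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, context = attend(question, passage_keys, passage_values)
print(weights, context)
```

Tokens whose keys align with the question receive larger weights, so the context vector is dominated by the parts of the passage most relevant to answering it.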
Hyperplastic and Adenoma polyp classification using Deep networks
Pub Date: 2023-04-05 DOI: 10.1109/PCEMS58491.2023.10136056
Aditi Jain, S. Sinha, S. Mazumdar
Colorectal cancer (CRC) is the world's third most common cancer. Polyps, which are growths that emerge as lumps on the colon lining, are often benign, but some may develop into malignant tumours over time; it is therefore advisable to have them removed to reduce the risk of colorectal cancer. Early identification and characterization of the type of polyp are crucial for cancer prevention and treatment. Deep convolutional neural networks (DCNNs) have proved to be extremely effective at object categorization over a wide range of object categories. In this study, we experimentally evaluated and compared the effectiveness of the ResNet50 and EfficientNetB0 models in distinguishing hyperplastic from adenoma polyps and diagnosing them. Our findings show that cutting-edge DCNN models may characterize polyps with accuracy equivalent to or greater than that of doctors. As a result, our findings might be valuable for future polyp categorization studies.
Silent Speech Recognition using EEG Signals
Pub Date: 2023-04-05 DOI: 10.1109/PCEMS58491.2023.10136068
J. Rahate, Sai Naga Venkata Ramana Tadepalli, Udit Saroj, Ashwin Kamble, P. Ghare
Patients suffering from paralysis and neuromuscular diseases are unable to communicate, so an alternative means of communication is needed. This work addresses the issue using electroencephalography (EEG) signals. EEG is the recording of electrical activity produced by the firing of neurons within the brain. However, EEG recordings are always contaminated with artifacts, which hinder the decoding process, so identifying and removing artifacts is an important step. For this work, a new EEG dataset covering six words was collected from 10 subjects. The artifacts that degrade the quality of the EEG data are removed, and empirical mode decomposition is used to decompose the EEG signals into intrinsic mode functions. Linear and nonlinear time-domain features are extracted from these modes. A feature set is obtained by selecting highly discriminant features using the analysis of variance (ANOVA) test. Classification is performed using seven recent machine learning algorithms.
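The ANOVA-based selection step mentioned above can be sketched as follows: for each feature, compute the one-way ANOVA F-statistic across the word classes and keep the features with the largest values. This is a minimal illustration under stated assumptions; the feature names and numbers are hypothetical, only two classes are shown for brevity, and the paper's actual selection threshold is not given in the abstract.

```python
def anova_f(groups):
    """One-way ANOVA F-statistic for a single feature.

    `groups` is a list of lists: the feature's values for each class.
    """
    k = len(groups)
    if k < 2:
        raise ValueError("need at least two classes")
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Variation of class means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of samples around their own class mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

def rank_features(features):
    """Feature names sorted from most to least discriminant."""
    return sorted(features, key=lambda name: anova_f(features[name]),
                  reverse=True)

# Hypothetical per-class values of two features computed from the modes.
features = {
    "energy": [[1.0, 1.2, 0.9], [3.0, 3.1, 2.8]],
    "zero_crossings": [[5.0, 7.0, 6.0], [6.0, 5.0, 7.0]],
}
print(rank_features(features))
```

A large F-statistic means the class means differ much more than the values vary within each class, which is what makes a feature useful for separating the words.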
Multifactor Authentication System
Pub Date: 2023-04-05 DOI: 10.1109/PCEMS58491.2023.10136041
A. Muthu Kumar, D. Jhariya
Authentication plays a crucial role in day-to-day life, whether in offices, colleges, universities, hospitals, banks, or social media accounts. At present, most authentication systems use a single factor (i.e., a password) to determine authenticity. Some systems use multi-factor authentication, which includes OTPs and unique codes such as time-based codes, but they require the user to carry an additional device. In this project, we develop a multi-modal authentication system that identifies a person by unique characteristics such as the face and fingerprint. The system is built using Raspberry Pi, OpenCV, and AWS services.
Design and Performance Analysis of a Broadband Rectangular Microstrip Patch Antenna for 5G Application
Pub Date: 2023-04-05 DOI: 10.1109/PCEMS58491.2023.10136053
Md. Jasim Uddin Qureshi, Md. Abu Sayeed, M. Hossain, Thohidul Islam
A microstrip line-fed broadband rectangular microstrip patch antenna operating at 28 GHz is proposed for 5G communication systems. An initial rectangular microstrip patch antenna was designed and its performance analysed. The intended operating frequency range (28.00 GHz to 29.96 GHz) was then tuned by cutting a slot and the edges and by using a partial ground plane. The inset-fed microstrip patch antenna presented in this paper has a defected ground structure (DGS) and is constructed on a Teflon substrate with a dielectric constant of 2.1. The defected ground slot increases the antenna's bandwidth and efficiency compared with a full ground plane. With a wide -10 dB bandwidth of 1.96 GHz, the developed antenna is designed to operate at 28.90 GHz for 5G applications. Its return loss is -39.42 dB and its VSWR value is 1. The antenna has an overall size of 35 x 39 x 1.57 mm³ and strong gain and directivity across the whole operating band. The proposed antenna produces better results than some of the existing antennas described in a recent scholarly study. The antenna was designed, simulated and analysed using Computer Simulation Technology (CST) software. The antenna's parameters and operational bandwidth for 5G applications improved significantly during analysis. It follows that this antenna will probably be adequate for 5G wireless communication systems.
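The figures quoted in the abstract above can be cross-checked with two standard relations: the reflection-coefficient magnitude follows from the return loss as |Γ| = 10^(S11/20), and VSWR = (1 + |Γ|) / (1 - |Γ|). The sketch below applies them to the reported values; the function names are illustrative, and treating the quoted -39.42 dB as S11 in dB is an assumption about the authors' sign convention.

```python
def reflection_magnitude(s11_db):
    """Reflection-coefficient magnitude from S11 in dB (negative when matched)."""
    return 10 ** (s11_db / 20.0)

def vswr(s11_db):
    """Voltage standing wave ratio corresponding to a given S11 in dB."""
    gamma = reflection_magnitude(s11_db)
    return (1.0 + gamma) / (1.0 - gamma)

def fractional_bandwidth(f_low_ghz, f_high_ghz):
    """Bandwidth as a percentage of the band's centre frequency."""
    centre = (f_low_ghz + f_high_ghz) / 2.0
    return 100.0 * (f_high_ghz - f_low_ghz) / centre

# Values reported in the abstract.
print(round(vswr(-39.42), 3))
print(round(fractional_bandwidth(28.00, 29.96), 2))
```

With the reported numbers this gives a VSWR of about 1.02, consistent with the stated value of 1, and a fractional bandwidth of about 6.76% for the 1.96 GHz band.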
Low-Noise Amplifier Design: An Open-Source Perspective
Pub Date: 2023-04-05 DOI: 10.1109/PCEMS58491.2023.10136122
Amit Kumar, D. Jhariya
A low-noise amplifier (LNA) is an integral part of the receiver chain in any RF system. The LNA is the first in a long chain of devices in the receiver, and it has the greatest impact on the sensitivity and on the overall performance of the subsequent receiver stages. LNA design for RFICs involves many trade-offs that must be made to achieve good performance in terms of both noise figure and gain. The cascode topology is the main focus of this brief. It is essentially a two-stage amplifier built around common-gate and common-source stages. The cascode LNA topology with inductive degeneration is best able to overcome the hurdles posed by 5G and other wide-band RF technologies. It balances good gain flatness with a respectable noise figure over a wide frequency range while providing good impedance matching and reverse isolation. A three-stage cascode LNA based on the Advanced SPICE Model for High Electron Mobility Transistors (ASM-HEMT) is first implemented in Cadence Virtuoso, and its performance is analysed using simulation results. This work also discusses the design of a cascode LNA using completely open-source tools, namely Xschem and Magic, among others. The design environment setup is also discussed in detail. The technology used to realize the circuitry is Google Skywater130.
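The claim that the first stage dominates receiver sensitivity follows from the Friis formula for cascaded noise: F_total = F1 + (F2 - 1)/G1 + (F3 - 1)/(G1·G2) + ..., with all quantities in linear units. The sketch below evaluates it for a two-stage chain; the stage values are hypothetical and are not taken from the paper.

```python
import math

def db_to_linear(db):
    """Convert a power ratio from dB to linear units."""
    return 10 ** (db / 10.0)

def cascade_noise_figure_db(stages):
    """Total noise figure (dB) of a chain, using the Friis formula.

    `stages` is a list of (noise_figure_db, gain_db) tuples in signal order.
    """
    total_factor = 0.0
    gain_so_far = 1.0  # product of the linear gains of all earlier stages
    for index, (nf_db, gain_db) in enumerate(stages):
        factor = db_to_linear(nf_db)
        if index == 0:
            total_factor = factor
        else:
            # Later stages contribute noise divided by the preceding gain.
            total_factor += (factor - 1.0) / gain_so_far
        gain_so_far *= db_to_linear(gain_db)
    return 10.0 * math.log10(total_factor)

# Hypothetical stages: (noise figure in dB, gain in dB).
lna = (1.0, 20.0)
mixer = (10.0, 10.0)
lna_first = cascade_noise_figure_db([lna, mixer])
mixer_first = cascade_noise_figure_db([mixer, lna])
print(round(lna_first, 2), round(mixer_first, 2))
```

With these illustrative numbers, placing the LNA first yields roughly 1.3 dB overall, while swapping the order yields roughly 10 dB, which is why a low-noise, high-gain first stage matters so much.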
Vision based 3D mapping-From Traditional to NeRF based approaches
Pub Date: 2023-04-05 DOI: 10.1109/PCEMS58491.2023.10136080
Bipasha Parui, Yagnesh Devada, K. Surender
3D reconstruction, or 3D mapping, of an environment is one of the most crucial stages of Simultaneous Localisation and Mapping (SLAM). Much work has been done over the years to optimize the tracking and mapping processes of SLAM systems in both classical computer vision and deep learning. Although many surveys extensively study SLAM-based work, most of them do not discuss 3D mapping and its developments in much detail. In this paper, we discuss the history of SLAM from a general perspective and focus on 3D reconstruction/mapping. To our knowledge, our paper is the first dedicated to exploring Neural Radiance Field (NeRF) research used for SLAM, pose estimation and 3D reconstruction. We track the history of mapping techniques in classical feature-based, direct, deep learning-based and, most importantly, NeRF-based literature. Finally, we present a comparative study of the existing methods and conclude the survey by discussing the challenges they face.
Automated Detection of Hardware Trojans using Power Side-Channel Analysis and VGG-Net
Pub Date: 2023-04-05 DOI: 10.1109/PCEMS58491.2023.10136083
Bharti Dakhale, K. Vipinkumar, Kalla Narotham, Shantanu Kadam, Ankit A. Bhurane, Ashwin Kothari
With the expanding use of electronic devices such as smartphones and smartwatches in daily life, the need for advanced Integrated Circuits (ICs) is also increasing. Corporations are compelled to outsource IC design and production to several third-party vendors to keep up with demand. This has allowed adversaries to make unauthorized modifications to the circuits. As a result, malicious adversaries have been able to deploy Hardware Trojans (HTs), which, like software viruses, may cause data leakage and circuit disruption. Currently known HT detection methods rely either on expensive and often impractical destructive techniques such as reverse engineering, or on non-destructive techniques such as comparison with a golden chip. In this paper, we propose a method for detecting HTs based on the VGG-Net architecture. The model achieves accuracies of 93%, 87%, 100%, 100%, and 76% on the Advanced Encryption Standard (AES) benchmarks T500, T600, T700, T800, and T1600, respectively, for an average accuracy of 91.2%. It surpasses existing state-of-the-art models on the AES-T600, AES-T700, AES-T800, and AES-T1600 benchmarks.
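The reported average can be reproduced directly from the per-benchmark accuracies quoted above. The short check below assumes the average is an unweighted mean over the five benchmarks, which the abstract implies but does not state explicitly.

```python
# Per-benchmark accuracies (percent) as quoted in the abstract.
accuracies = {
    "AES-T500": 93,
    "AES-T600": 87,
    "AES-T700": 100,
    "AES-T800": 100,
    "AES-T1600": 76,
}

# Unweighted mean across the five benchmarks.
average = sum(accuracies.values()) / len(accuracies)

# Benchmark on which the model performs worst.
weakest = min(accuracies, key=accuracies.get)

print(average, weakest)
```

The mean comes to 91.2%, matching the stated figure, and the breakdown shows that AES-T1600 is the benchmark pulling the average down.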