Pub Date: 2022-12-02
DOI: 10.1109/UPCON56432.2022.9986408
Yash Daga, S. Meena
Human activity recognition is a field in which comparatively little work has been done so far, yet its applications can be helpful in numerous domains. In this paper, we discuss different methods by which human activity recognition can be implemented and compare them in terms of advantages, performance, accuracy, techniques, datasets, and limitations. We also discuss domains where it can be helpful, such as healthcare, security, and augmented reality, along with the challenges involved and the type of method used.
Title: Applications of Human Activity Recognition in Different Fields: A Review
Venue: 2022 IEEE 9th Uttar Pradesh Section International Conference on Electrical, Electronics and Computer Engineering (UPCON)
Pub Date: 2022-12-02
DOI: 10.1109/UPCON56432.2022.9986404
Brij Kumar Bharti, A. Yadav
This paper presents a 3-dB quadrature branch-line coupler (BLC) using substrate integrated waveguide (SIW) technology. The proposed design combines equal-width SIW lines; because the line width is fixed, the structure can be incorporated into SIW-based microwave circuits. The coupler is designed using derived design equations and simulated at 15 GHz. Simulation results show that the reflection coefficients of all ports are better than 28 dB around the operating frequency, and that good isolation, better than 25 dB, is achieved between the two input ports. In the operating band, the two output signals have a phase difference of $90^{\circ} \pm 1.5^{\circ}$. The insertion loss at the centre frequency is 0.84 dB. The proposed structure can be used in Butler matrices, monopulse comparator circuits, and antenna feeding networks.
Title: Branch Line Coupler Based on Substrate Integrated Waveguide Technology
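For orientation, the behaviour that a 3-dB quadrature coupler approximates can be checked numerically. The sketch below builds the textbook scattering matrix of an ideal (lossless, perfectly matched) quadrature hybrid; it is not the SIW design from the abstract, and the port numbering (1 = input, 2 = through, 3 = coupled, 4 = isolated) is an assumption made here for illustration.

```python
import cmath
import math

# Ideal 3-dB quadrature hybrid (textbook model, not the paper's SIW layout).
# Assumed port order: 1 = input, 2 = through, 3 = coupled, 4 = isolated.
K = -1 / math.sqrt(2)
S = [
    [0,      K * 1j, K,      0],
    [K * 1j, 0,      0,      K],
    [K,      0,      0,      K * 1j],
    [0,      K,      K * 1j, 0],
]


def db(x):
    """Magnitude of a scattering parameter in decibels."""
    return 20 * math.log10(abs(x))


through_db = db(S[1][0])   # power delivered to the through port
coupled_db = db(S[2][0])   # power delivered to the coupled port
# Phase of the through output relative to the coupled output, in degrees.
phase_diff_deg = math.degrees(cmath.phase(S[1][0] / S[2][0]))
# A lossless device delivers all input power to the two outputs.
power_sum = abs(S[1][0]) ** 2 + abs(S[2][0]) ** 2

print(round(through_db, 2), round(coupled_db, 2), round(phase_diff_deg, 1))
```

In this ideal model each output sits about 3.01 dB below the input and the outputs are in quadrature; the simulated figures quoted in the abstract (a phase error within 1.5 degrees, finite return loss and isolation) describe how closely the proposed structure approaches it.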
Pub Date: 2022-12-02
DOI: 10.1109/UPCON56432.2022.9986483
S. K. Sharma, Anu Gupta, K. Raju
Over the last few years, deep neural networks (DNNs) have been widely adopted for computer vision applications because of their high classification accuracy and versatility. The convolutional neural network (CNN) is one of the most popular DNN architectures and is widely used for image, speech, and video recognition. The extensive computation and large memory requirements of CNNs are a bottleneck for their application. Field-programmable gate arrays (FPGAs) are considered suitable hardware platforms for deploying CNNs with low power requirements. This paper focuses on the design and implementation of a hardware accelerator that performs the convolution product (matrix-matrix multiplication). We use two optimization techniques to achieve energy efficiency. First, the dataflow of the convolution phase is rescheduled to reduce undesired on-chip memory accesses. Second, efficiency is enhanced by reducing the internal parallelism of the structure as much as possible. Our architecture is implemented on the Xilinx ZCU104 evaluation board. The implemented design attains 98.1 GOPS/Joule and 32.77 GOPS/Joule for 8-bit and 16-bit data widths, respectively.
Title: Energy Efficient Hardware Implementation of 2-D Convolution for Convolutional Neural Network
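The abstract treats the convolution product as a matrix-matrix multiplication. One common way to make that mapping concrete is the im2col transformation, sketched below in plain Python. This is a software illustration only: it is an assumption that the accelerator uses a comparable unrolling, and the function names are hypothetical. As in most CNN frameworks, "convolution" here means cross-correlation (the kernel is not flipped), with no padding and a stride of 1.

```python
def im2col(image, k):
    """Unroll every k x k window of a 2-D list into one row of a matrix."""
    height, width = len(image), len(image[0])
    rows = []
    for i in range(height - k + 1):
        for j in range(width - k + 1):
            rows.append([image[i + di][j + dj]
                         for di in range(k) for dj in range(k)])
    return rows


def conv2d_as_matmul(image, kernel):
    """2-D convolution (CNN style) computed as patches x flattened kernel."""
    k = len(kernel)
    flat_kernel = [v for row in kernel for v in row]
    patches = im2col(image, k)
    out_width = len(image[0]) - k + 1
    # Each output pixel is the dot product of one patch row with the kernel.
    flat_out = [sum(p * q for p, q in zip(patch, flat_kernel))
                for patch in patches]
    # Fold the flat result back into a 2-D feature map.
    return [flat_out[r:r + out_width]
            for r in range(0, len(flat_out), out_width)]


feature_map = conv2d_as_matmul(
    [[1, 2, 3], [4, 5, 6], [7, 8, 9]],
    [[1, 0], [0, -1]],
)
print(feature_map)
```

Overlapping windows share most of their pixels, so each input value is read several times; that repeated access is the kind of memory traffic a rescheduled dataflow aims to reduce.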
Pub Date: 2022-12-02
DOI: 10.1109/UPCON56432.2022.9986445
Ananya Singh, Swati Jain
In the field of computation, predicting the stock market has always been a tough nut for researchers to crack, because stock prices are highly sensitive to outside influences. Prices depend on many factors, physical and physiological, rational and irrational, ranging from geopolitical stability to the sentiments of investors, and all of them play a crucial role. Investors try to anticipate future market conditions in order to invest successfully. Treating past stock prices as an embodiment of these factors, we propose a stacked long short-term memory (LSTM) model to predict the closing index of stock prices during the highly uncertain pandemic period, using root mean square error (RMSE) as the performance indicator. The model is optimized to improve prediction accuracy and achieve high-performance stock forecasting. The dataset is drawn from the NIFTY 50 and spans four sectors (auto, bank, healthcare, and metal) for the period from 30 January 2020 to 31 March 2022. This paper aims to use historical data to analyze future patterns and insights.
Title: Stock Market Prediction During COVID Using Stacked LSTM
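Two steps named in the abstract can be shown without any deep-learning library: turning a closing-price series into supervised (history, next value) pairs, and scoring predictions with RMSE. The sketch below assumes a fixed look-back window, which the abstract does not specify; the function names and the toy numbers are hypothetical, and the stacked LSTM itself is not reproduced.

```python
import math


def make_windows(series, lookback):
    """Split a series into (previous `lookback` values, next value) pairs."""
    inputs, targets = [], []
    for i in range(len(series) - lookback):
        inputs.append(series[i:i + lookback])
        targets.append(series[i + lookback])
    return inputs, targets


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Toy numbers for illustration only; these are not NIFTY 50 data.
X, y = make_windows([1, 2, 3, 4, 5], lookback=3)
print(X, y)
print(rmse([0, 0], [3, 4]))
```

Because RMSE is expressed in the same units as the index, errors on sectors trading at very different price levels are only comparable after the series have been scaled.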
Pub Date: 2022-12-02
DOI: 10.1109/UPCON56432.2022.9986426
Anindita Sarkar, S. R. Chatterjee, M. Chakraborty
Reversible logic gates are widely adopted to replace classical gates in various circuits in order to reduce power and maximize speed. Their applications span fields such as quantum computing, nanotechnology, low-power CMOS, and optical computing, and they are the key components of quantum circuits. Here, reversible gates are used to design a dynamic lightweight S-box, the key circuit component of a cryptographic algorithm. A combination of Controlled-NOT (CNOT), Peres, and Selim Al Mamun (SAM) gates is used to achieve the dynamic nature of the S-box. Simulation analysis evaluates cryptographic properties such as the Avalanche Criterion (AC), Strict Avalanche Criterion (SAC), Bit Independence Criterion (BIC), and nonlinearity to measure the strength and reliability of the proposed S-box. Its performance is compared with the Rijndael S-box used in the popular AES algorithm, which was widely accepted by NISTIR, and the comparison shows that the proposed S-box outperforms the existing one. The use of an input-dependent combination of reversible logic gates enhances the performance of the S-box on both classical and quantum computers.
Title: Dynamic S-box with Reversible Gates for both Classical and Quantum Computer
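What makes a gate reversible is that it maps distinct inputs to distinct outputs. The sketch below uses the standard definitions of the CNOT and Peres gates and checks that property by exhaustive enumeration. The SAM gate is left out because the abstract does not give its truth table, and nothing here reproduces the proposed S-box; `is_reversible` is a hypothetical helper name.

```python
from itertools import product


def cnot(a, b):
    """Controlled-NOT: the target bit b is flipped when the control a is 1."""
    return a, a ^ b


def peres(a, b, c):
    """Peres gate: (A, B, C) -> (A, A xor B, (A and B) xor C)."""
    return a, a ^ b, (a & b) ^ c


def is_reversible(gate, n_bits):
    """True when the gate maps the 2**n_bits inputs to distinct outputs."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=n_bits)}
    return len(outputs) == 2 ** n_bits


def classical_and(a, b):
    """Ordinary AND gate, included as a non-reversible counterexample."""
    return (a & b,)


print(is_reversible(cnot, 2), is_reversible(peres, 3),
      is_reversible(classical_and, 2))
```

Any cascade of such gates is itself a bijection on its input bits, which is why an S-box assembled from them is invertible by construction.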
Pub Date: 2022-12-02
DOI: 10.1109/UPCON56432.2022.9986477
Kumud Tripathi, Jatin Kumar
The objective of Multiple Speech Mode Transformation (MSMT) is to transform speech from one mode to another on the basis of mode characteristics. In this work, we explore inter-conversion among three modes of speech (conversation, extempore, and read) while preserving speaker identity and linguistic content. To accomplish this, we use StarGAN-VC, a variant of the Star Generative Adversarial Network (StarGAN). Training does not require parallel utterances of the same sentences, and with a relatively small number of training examples we were able to generate high-quality transformed outputs. Objective and subjective evaluations indicate that the transformed outputs are highly comparable to the target speech mode.
Title: Multiple Speech Mode Transformation using Adversarial Network
Pub Date: 2022-12-02
DOI: 10.1109/UPCON56432.2022.9986463
Shekhar Gehlaut, Mirnal Singh Rawat, D. Kumar
This paper proposes a novel enhanced pole clustering (EPC) method for simplifying complex linear time-invariant (LTI) systems. In the presented technique, the grey wolf optimizer (GWO) is used to derive the numerator polynomial of the reduced-order model (ROM) by minimizing the integral squared error (ISE). The proposed technique guarantees the stability of the resulting ROM. Several performance indices are compared to establish the effectiveness of the proposed model order reduction (MOR) method.
Title: Order Simplification of LTI Systems using Enhanced Pole Clustering Technique
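As background for the two ingredients named in the abstract, the sketch below shows the classical inverse-distance-measure cluster centre for a group of real, stable poles, and a trapezoidal estimate of the ISE between two sampled step responses. It is an assumption that the enhanced method builds on this classical rule; the enhancement itself and the GWO search are not reproduced, and both function names are hypothetical.

```python
def cluster_center(poles):
    """Classical inverse-distance-measure centre of real, negative poles."""
    if not poles or any(p >= 0 for p in poles):
        raise ValueError("expects a non-empty list of real, negative poles")
    mean_inverse = sum(1 / abs(p) for p in poles) / len(poles)
    # The centre is negative, so stable clusters give a stable reduced pole.
    return -1 / mean_inverse


def integral_squared_error(y_full, y_reduced, dt):
    """Trapezoidal estimate of the integral of (y_full - y_reduced)**2."""
    err2 = [(a - b) ** 2 for a, b in zip(y_full, y_reduced)]
    return dt * (sum(err2) - 0.5 * (err2[0] + err2[-1]))


print(cluster_center([-1.0, -2.0]))
print(integral_squared_error([0, 1, 2], [0, 0, 0], dt=1.0))
```

With the denominator fixed by the cluster centres, an optimizer only has to search the numerator coefficients, scoring each candidate by its ISE against the full-order response.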
Pub Date: 2022-12-02
DOI: 10.1109/UPCON56432.2022.9986436
R. S, S. S, Rakshitha R, B. Poornima
Region of Interest (ROI) extraction is the task of finding the areas of an image on which subsequent feature processing concentrates. Using ROIs speeds up processing by excluding irrelevant image regions. ROI extraction in biomedical landmark annotation problems is challenging because radiograph images have varying contrast and intensity levels. Cephalometric landmark annotation is a domain where ROI extraction plays a vital role in both traditional machine learning and deep learning solutions. This work proposes a simple and feasible extension to the template matching method to extract ROIs from cephalometric images. The exact ROI patch is located using a combined metric calculated from the normalized correlation coefficient measure and a distance measure. The algorithm is tested on a publicly available cephalometric landmark annotation dataset. The experimental results show that the ROIs are extracted with an accuracy of 99.69%. Additionally, the average distance between the ROI patch center and the ground-truth landmark is 3.96 mm. This demonstrates that the method can be used in practice as an initial estimator, significantly improving the accuracy of landmark localization.
Title: Extended Template Matching method for Region of Interest Extraction in Cephalometric Landmarks Annotation
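The idea of scoring candidate patches with both a correlation term and a distance term can be sketched as follows. The abstract does not say how the two measures are combined, so the form used here (correlation minus a weighted Euclidean distance from an expected position) and the weight `alpha` are assumptions for illustration, as are the function names and the toy image.

```python
import math


def ncc(patch, template):
    """Zero-mean normalized correlation coefficient of two equal-size grids."""
    p = [v for row in patch for v in row]
    t = [v for row in template for v in row]
    mean_p, mean_t = sum(p) / len(p), sum(t) / len(t)
    num = sum((a - mean_p) * (b - mean_t) for a, b in zip(p, t))
    den = math.sqrt(sum((a - mean_p) ** 2 for a in p)
                    * sum((b - mean_t) ** 2 for b in t))
    return num / den if den else 0.0  # flat patches carry no correlation


def best_roi(image, template, expected_rc, alpha=0.1):
    """Top-left corner maximizing NCC minus alpha * distance to expected_rc."""
    th, tw = len(template), len(template[0])
    best_score, best_pos = -math.inf, None
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            dist = math.hypot(r - expected_rc[0], c - expected_rc[1])
            score = ncc(patch, template) - alpha * dist
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score


# Toy 4 x 4 image with the template embedded at row 1, column 2.
image = [[0, 0, 0, 0], [0, 0, 1, 2], [0, 0, 3, 4], [0, 0, 0, 0]]
template = [[1, 2], [3, 4]]
position, score = best_roi(image, template, expected_rc=(1, 2))
print(position, round(score, 3))
```

The distance term penalizes patches that correlate well but lie far from where the landmark is anatomically expected, which is useful when several regions of a radiograph look alike.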
Pub Date: 2022-12-02
DOI: 10.1109/UPCON56432.2022.9986405
S. D, P. K, Sivamani D, N. A, N. S, R. R
Multilevel inverters are now widely used across sectors and by researchers for high-power, medium-voltage applications, and hybrid topologies with fewer components have consequently become increasingly popular. In this paper, the authors present a Hybrid Packed U-Cell (H-PUC) architecture that uses fewer switches, with or without voltage balancing. The proposed H-PUC needs eight switches to produce 7-level and 15-level output voltages. It uses a combination of high voltage low frequency (HVLF) and low voltage high frequency (LVHF), which reduces power loss and improves efficiency. The 7-level asymmetric topology is simulated and validated on a C2000 microcontroller. The functionality of the proposed inverter is evaluated by simulation in MATLAB. The presented H-PUC inverter topology is suitable for applications involving the integration of renewable energy.
Title: PUC Optimal Switching Strategies for Renewable Applications in Single Phase Inverter
Pub Date: 2022-12-02
DOI: 10.1109/UPCON56432.2022.9986453
Memes are a way of communicating concepts across social media. While most memes are intended to be funny, some become offensive when text and images are combined. Many successful studies on sentiment analysis of images and of text have been performed recently. Such technology, once developed, can be useful for effective human-robot interaction, especially with humanoid and collaborative robots. In this research, we first develop such technology on available datasets using their given classes only, since labelled data in the robotics domain, especially for robot grasping, is difficult to obtain. In subsequent research, we may extend the same technology to intelligent robot grasping. Most existing research uses either text or images for sentiment analysis. Because the text and image in a meme are sometimes unrelated, detecting hateful memes is a more challenging problem, so the present work considers both as features and uses a multimodal approach to sentiment analysis, which could also be useful for human-robot interaction. Constrained by the available datasets, the present investigation focuses on developing multimodal and sequential approaches for classifying memes into two classes: offensive and non-offensive. In the fusion approach, features of both image and text are extracted through different models and then combined for classification. In the sequential approach, an image captioning model trained on the MS COCO dataset is used together with Optical Character Recognition (OCR), and the result is classified with the FastText classifier. Both approaches are applied to two datasets, the MultiOFF dataset and the Facebook Hateful Meme dataset. Results on both datasets are promising for both approaches.
Title: Meme Detection For Sentiment Analysis and Human Robot Interactions Using Multiple Modes
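In the sequential approach, the image is reduced to text before classification. The sketch below shows one plausible way to merge a generated caption with OCR output into the `__label__` line format that fastText's supervised mode reads by default. The captioning and OCR models are not reproduced: the two input strings and the label are hypothetical placeholders, and the cleaning rules are an assumption rather than the authors' preprocessing.

```python
import re


def to_fasttext_line(caption, ocr_text, label):
    """Merge caption and OCR text into one fastText supervised training line."""
    text = f"{caption} {ocr_text}".lower()
    text = re.sub(r"[^a-z0-9\s]", " ", text)   # drop punctuation and symbols
    text = re.sub(r"\s+", " ", text).strip()   # collapse repeated whitespace
    return f"__label__{label} {text}"


# Hypothetical caption and OCR strings, for illustration only.
line = to_fasttext_line("A dog on a couch.", "WHEN it's MONDAY", "non_offensive")
print(line)
```

Writing one such line per meme produces a training file, so the sequential pipeline needs only a text classifier at the final stage, whereas the fusion approach keeps separate image and text feature extractors.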