It is now common to hear a song, want to identify it, and then keep listening to it in sync even after the original sound source is gone. This is difficult to achieve in noisy environments or with radio transmissions, where effects such as time-stretching alter the audio. Although query-by-example apps such as Shazam are popular, there are no applications that combine audio recognition with real-time synchronization of the song. This paper presents a novel method for real-time audio synchronization that builds on established audio fingerprinting techniques, and proposes a scalable distributed mechanism for handling larger databases.
{"title":"Real Time Audio Synchronization Using Audio Fingerprinting Techniques","authors":"Tarun Kumar Yadav, Gautam Sanjeev Bidari, Adwait Abhay Pande, K. Surender","doi":"10.1109/PCEMS55161.2022.9808050","DOIUrl":"https://doi.org/10.1109/PCEMS55161.2022.9808050","journal":{"name":"2022 1st International Conference on the Paradigm Shifts in Communication, Embedded Systems, Machine Learning and Signal Processing (PCEMS)"},"publicationDate":"2022-05-06"}
Pub Date: 2022-05-06 DOI: 10.1109/PCEMS55161.2022.9807953
Neeraj Chidella, N. K. Reddy, Nicole Reddy, Maddi Mohan, Joydeep Sengupta
With rapid advances in machine learning, deep learning and artificial intelligence, improving the billing system is an effective way to save time. Although barcode scanners are now very fast, fruits and vegetables must still be entered into the computer manually, which is a time-consuming and tedious process. Fruit and vegetable markets are an integral part of daily life, so billing in such places should be hassle-free, efficient and less laborious. To overcome the problems associated with barcodes and RFID tags, we propose an automatic billing system that detects fruits and vegetables and then displays the final bill. The main objective of this project is to detect the items, display the detected items and then bill them. To achieve this, we use two algorithms: 1) a fine-tuned convolutional neural network built from a base model; and 2) to increase accuracy for real-time object detection and to display bounding boxes, the state-of-the-art YOLO detector implemented in PyTorch, because YOLO predicts bounding boxes and detects objects faster than other detection algorithms and is more reliable.
{"title":"Intelligent Billing system using Object Detection","authors":"Neeraj Chidella, N. K. Reddy, Nicole Reddy, Maddi Mohan, Joydeep Sengupta","doi":"10.1109/PCEMS55161.2022.9807953","DOIUrl":"https://doi.org/10.1109/PCEMS55161.2022.9807953","publicationDate":"2022-05-06"}
Pub Date: 2022-05-06 DOI: 10.1109/PCEMS55161.2022.9807886
Ridita Garg, Isha Bhatt, K. Eashwer, S. Jindal
Elderly people depend on their medicines to stay stable, yet complex prescription schedules can cause confusion, such as missed doses, the wrong medication, or medication taken at the wrong time. These slip-ups can lead to unnecessary doctor or hospital visits, illness, and even death. Consequently, there is a need to design a medication dispenser that helps elderly people take their medicines on time. This would help avoid unplanned hospital or doctor visits caused by incorrect medicine use. The objective is to develop an intelligent device that dispenses medicines on the prescribed schedule. The work involves interfacing an LCD, a motor, and an RFID reader with an 8051 microcontroller.
{"title":"Experimental Design and Implementation of RFID based Clinical Medicine Dispenser","authors":"Ridita Garg, Isha Bhatt, K. Eashwer, S. Jindal","doi":"10.1109/PCEMS55161.2022.9807886","DOIUrl":"https://doi.org/10.1109/PCEMS55161.2022.9807886","publicationDate":"2022-05-06"}
With the onset of the global COVID-19 pandemic, chest X-ray imaging has gained prominence. It is recognised as one of the principal methods for detecting the presence of infection and its effect on internal organs such as the lungs. Chest radiographs show COVID-19 abnormalities that resemble those caused by other viruses and bacteria, which makes them challenging for technicians to detect. A computer vision model that identifies and localizes COVID-19 abnormalities is therefore needed to help doctors provide an immediate and confident diagnosis. Deep learning has brought considerable advances to computer vision models, so the proposed model integrates several of them to classify and localise COVID-19 findings on chest X-rays. This paper ensembles several state-of-the-art classification and object detection models to build a model for chest radiograph diagnosis. The proposed ensemble achieves a mean Average Precision of 0.627 on the SIIM-FISABIO-RSNA COVID-19 dataset.
{"title":"Identification and Localization of COVID-19 Abnormalities on Chest Radiographs using Ensembled Deep Neural Networks","authors":"Manikiran Kommidi, Anudeep Chinta, Tarun Kumar Dachepally, Srilatha Chebrolu","doi":"10.1109/PCEMS55161.2022.9807972","DOIUrl":"https://doi.org/10.1109/PCEMS55161.2022.9807972","publicationDate":"2022-05-06"}
Pub Date: 2022-05-06 DOI: 10.1109/PCEMS55161.2022.9807874
Sankalp Naik, Osama Khan, Ashay Katre, A. Keskar
With the growth of the internet, online streaming techniques such as adaptive bitrate streaming have gained prominence. The Adaptive Bitrate (ABR) scheme uses Model Predictive Control (MPC) to determine the best possible bitrate for the given network conditions. Although this method works well, its major disadvantage is its sensitivity to throughput prediction error, which makes it difficult to perform well in congested network conditions. This paper also explores other methods such as DeepMPC, which use deep learning algorithms to predict the bandwidth. These work better than the trivial harmonic predictor but demand high computational power. This paper proposes ARMPC, which uses the Auto-Regressive Integrated Moving Average (ARIMA) technique to predict the future bandwidth. Using trace-driven experiments, we show both mathematically and practically that ARMPC improves on both prediction accuracy and computational cost.
{"title":"ARMPC - ARIMA based prediction model for Adaptive Bitrate Scheme in Streaming","authors":"Sankalp Naik, Osama Khan, Ashay Katre, A. Keskar","doi":"10.1109/PCEMS55161.2022.9807874","DOIUrl":"https://doi.org/10.1109/PCEMS55161.2022.9807874","publicationDate":"2022-05-06"}
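The "trivial harmonic predictor" that the ARMPC abstract uses as a baseline can be sketched as follows. This is a minimal illustration, not the authors' ARMPC or their MPC controller: the five-sample window, the function names `harmonic_mean_prediction` and `pick_bitrate`, and the greedy bitrate rule are all assumptions made for the example.

```python
from typing import Sequence


def harmonic_mean_prediction(throughputs_mbps: Sequence[float], window: int = 5) -> float:
    """Predict the next throughput as the harmonic mean of the last `window` samples.

    The window length is a hypothetical choice, not a value from the paper.
    """
    recent = list(throughputs_mbps)[-window:]
    if not recent:
        raise ValueError("need at least one throughput sample")
    if any(sample <= 0 for sample in recent):
        raise ValueError("throughput samples must be positive")
    return len(recent) / sum(1.0 / sample for sample in recent)


def pick_bitrate(predicted_mbps: float, ladder_mbps: Sequence[float]) -> float:
    """Greedy stand-in for the controller: highest rung not above the prediction.

    A real MPC controller would optimise over a look-ahead horizon instead.
    """
    feasible = [rate for rate in ladder_mbps if rate <= predicted_mbps]
    return max(feasible) if feasible else min(ladder_mbps)


if __name__ == "__main__":
    history = [2.0, 4.0, 4.0]  # measured throughputs in Mbit/s (made-up values)
    estimate = harmonic_mean_prediction(history)
    print(estimate, pick_bitrate(estimate, [1.0, 2.5, 5.0]))
```

The harmonic mean is dominated by the slowest recent samples, so a single throughput dip pulls the estimate down sharply. In this sketch, an ARIMA forecast would slot in where `harmonic_mean_prediction` is called, leaving the bitrate decision unchanged.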
Pub Date: 2022-05-06 DOI: 10.1109/PCEMS55161.2022.9807899
Rohan Paul, Arpan Karar, Abhirup Datta, S. Jindal
A model of a smart helmet has been developed to assist workers in the mining industry. Risky incidents are common in the mining sector, and many of them result in life-threatening injuries or death. The helmet is one of the most regularly used pieces of safety equipment for mine workers, so it should be equipped with more advanced features. Using different sensors, the smart helmet can identify hazardous situations such as the presence of harmful gases like carbon monoxide, CH4, LPG, and natural gas. An infrared sensor detects whether the miner is wearing the helmet. Each sensor has a critical value that, if exceeded, causes the buzzer to activate and the LEDs to illuminate, signaling the miners and supervisors. The GPS module fitted in the miners' helmets allows mining officials to readily track their locations. Furthermore, a panic button has been implemented which, when pressed, sends an emergency signal via mail to higher authorities outside the mines. A mobile application has also been created to display all of the data sent wirelessly from the sensors. As a result, the proposed smart helmet helps protect miners from impending accidents.
{"title":"Smart Helmet For Coal Miners","authors":"Rohan Paul, Arpan Karar, Abhirup Datta, S. Jindal","doi":"10.1109/PCEMS55161.2022.9807899","DOIUrl":"https://doi.org/10.1109/PCEMS55161.2022.9807899","publicationDate":"2022-05-06"}
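The alarm behaviour described in the smart helmet abstract (each sensor has a critical value, and exceeding it triggers the buzzer and LEDs) can be sketched as a simple threshold check. Everything specific here is hypothetical: the sensor keys, the critical values, the `HelmetStatus` type, and the choice to sound the buzzer when the helmet is not worn are assumptions for illustration, not details from the paper.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical critical values in ppm; real limits depend on the sensors and
# on the applicable mine-safety regulations.
DEFAULT_CRITICAL_VALUES: Dict[str, float] = {
    "co_ppm": 50.0,
    "ch4_ppm": 1000.0,
    "lpg_ppm": 1000.0,
}


@dataclass
class HelmetStatus:
    exceeded: List[str]        # sensors whose reading is above its critical value
    buzzer_on: bool            # buzzer and LEDs are driven together in this sketch
    send_emergency_mail: bool  # set when the panic button is pressed


def evaluate_helmet(
    readings: Dict[str, float],
    helmet_worn: bool,
    panic_pressed: bool,
    critical_values: Optional[Dict[str, float]] = None,
) -> HelmetStatus:
    """Compare each reading with its critical value and derive the alarm outputs."""
    limits = DEFAULT_CRITICAL_VALUES if critical_values is None else critical_values
    exceeded = sorted(
        name for name, value in readings.items()
        if name in limits and value > limits[name]
    )
    # Assumption: a helmet that is not being worn also raises the alarm.
    buzzer_on = bool(exceeded) or not helmet_worn
    return HelmetStatus(exceeded, buzzer_on, panic_pressed)


if __name__ == "__main__":
    print(evaluate_helmet({"co_ppm": 80.0, "ch4_ppm": 10.0}, True, False))
```

Keeping the limits in a dictionary means a new gas sensor only needs a new entry rather than new control flow.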
Pub Date: 2022-05-06 DOI: 10.1109/PCEMS55161.2022.9807973
Vaijayanti Panse, T. Jain, A. Kothari
Radio frequency energy harvesting (RF-EH) offers a potential way to power battery-constrained wireless devices in future-generation wireless networks. In this paper, we investigate a dual-hop decode-and-forward (DF) cooperative network with RF-EH using a non-linear hybrid power-time-splitting (PTS) model. In the proposed system, the best relay is chosen by considering the instantaneous signal-to-noise ratios (SNRs) of the source (S) to relay (R) links using three selection schemes: absolute SNR-based selection, normalized SNR-based selection, and random selection. Considering the DF protocol at R, we evaluate the outage and throughput performance of the system over independent and identically distributed Rayleigh fading channels. The derived results are validated through Monte Carlo simulations.
{"title":"Relay Selection in SWIPT-enabled Cooperative Networks","authors":"Vaijayanti Panse, T. Jain, A. Kothari","doi":"10.1109/PCEMS55161.2022.9807973","DOIUrl":"https://doi.org/10.1109/PCEMS55161.2022.9807973","publicationDate":"2022-05-06"}
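The three relay selection schemes named in the abstract can be illustrated with a small Monte Carlo experiment on the source-to-relay hop. This sketch makes several simplifying assumptions that are not taken from the paper: it ignores energy harvesting, the PTS model and the relay-to-destination hop; it interprets "normalized" as the instantaneous SNR divided by that link's average SNR; and all function names, the threshold and the average SNR values are hypothetical. Under Rayleigh fading the instantaneous SNR is exponentially distributed, which is what `draw_snrs` samples.

```python
import math
import random
from typing import Callable, List, Sequence

Selector = Callable[[Sequence[float], Sequence[float], random.Random], int]


def draw_snrs(avg_snrs: Sequence[float], rng: random.Random) -> List[float]:
    """Sample instantaneous S-to-R SNRs: average SNR times a unit-mean exponential."""
    return [avg * rng.expovariate(1.0) for avg in avg_snrs]


def select_absolute(snrs: Sequence[float], avg_snrs: Sequence[float], rng: random.Random) -> int:
    """Pick the relay with the largest instantaneous SNR."""
    return max(range(len(snrs)), key=lambda i: snrs[i])


def select_normalized(snrs: Sequence[float], avg_snrs: Sequence[float], rng: random.Random) -> int:
    """Pick the relay whose SNR is largest relative to its own average (assumed definition)."""
    return max(range(len(snrs)), key=lambda i: snrs[i] / avg_snrs[i])


def select_random(snrs: Sequence[float], avg_snrs: Sequence[float], rng: random.Random) -> int:
    """Pick a relay uniformly at random, ignoring channel state."""
    return rng.randrange(len(snrs))


def first_hop_outage(select: Selector, avg_snrs: Sequence[float], threshold: float,
                     trials: int = 50_000, seed: int = 0) -> float:
    """Estimate the probability that the selected S-to-R link falls below the threshold."""
    rng = random.Random(seed)
    outages = 0
    for _ in range(trials):
        snrs = draw_snrs(avg_snrs, rng)
        if snrs[select(snrs, avg_snrs, rng)] < threshold:
            outages += 1
    return outages / trials


def analytic_best_of_n(avg_snr: float, threshold: float, n: int) -> float:
    """Closed-form first-hop outage when the best of n i.i.d. Rayleigh links is chosen."""
    return (1.0 - math.exp(-threshold / avg_snr)) ** n


if __name__ == "__main__":
    averages = [10.0, 10.0, 10.0]  # linear-scale average SNRs (made-up values)
    for name, rule in [("absolute", select_absolute),
                       ("normalized", select_normalized),
                       ("random", select_random)]:
        print(name, first_hop_outage(rule, averages, threshold=5.0))
    print("analytic best-of-3:", analytic_best_of_n(10.0, 5.0, 3))
```

With n i.i.d. links, `analytic_best_of_n` gives the first-hop outage of the best-SNR rule, and setting `n=1` gives the value for random selection; the simulated estimates should approach these as the number of trials grows. When all links share the same average SNR, the absolute and normalized rules rank the relays identically, so they differ only when the averages are unequal.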
Pub Date: 2022-05-06 DOI: 10.1109/PCEMS55161.2022.9807891
Pallavi S. Kadam, P. Manikanta, N. Rao
This paper presents a metamaterial-based microstrip patch antenna with fractal geometry and a defected ground that gives improved return loss, directivity, gain and bandwidth. Two rectangular sections are cut from two edges of the ground. RT 5880LZ dielectric is used as the substrate, along with a metasurface (MS) layer separated by an air gap. The MS unit cell consists of two L-shaped patches at two corners and a C-shaped patch at the centre. A fractal patch with rectangular notches on three sides is fed by a microstrip feedline. The antenna gives good impedance matching in the frequency range of 10.6 GHz to 11.3 GHz. A maximum return loss of 21.31 dB and a gain of 6 dBi are obtained at the operating frequency of 10.95 GHz. This antenna model can be used for satellite communication in the X-band.
{"title":"Metasurface based microstrip patch antenna at 11GHz frequency for enhanced gain and directivity","authors":"Pallavi S. Kadam, P. Manikanta, N. Rao","doi":"10.1109/PCEMS55161.2022.9807891","DOIUrl":"https://doi.org/10.1109/PCEMS55161.2022.9807891","publicationDate":"2022-05-06"}
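The figures quoted in the antenna abstract can be related to other common matching metrics with standard transmission-line formulas. The sketch below is not from the paper: it assumes return loss is quoted as a positive dB value, takes the band centre as the arithmetic mean of the band edges, and uses hypothetical function names.

```python
def reflection_magnitude(return_loss_db: float) -> float:
    """Magnitude of the reflection coefficient for a positive return loss in dB."""
    return 10.0 ** (-return_loss_db / 20.0)


def vswr(return_loss_db: float) -> float:
    """Voltage standing wave ratio corresponding to a given return loss."""
    if return_loss_db <= 0:
        raise ValueError("return loss must be positive for a finite VSWR")
    gamma = reflection_magnitude(return_loss_db)
    return (1.0 + gamma) / (1.0 - gamma)


def fractional_bandwidth(f_low_ghz: float, f_high_ghz: float) -> float:
    """Bandwidth divided by the arithmetic-mean centre frequency."""
    centre = (f_low_ghz + f_high_ghz) / 2.0
    return (f_high_ghz - f_low_ghz) / centre


if __name__ == "__main__":
    # Values quoted in the abstract: 21.31 dB return loss, 10.6 to 11.3 GHz band.
    print(f"|Gamma| = {reflection_magnitude(21.31):.4f}")
    print(f"VSWR = {vswr(21.31):.3f}")
    print(f"Fractional bandwidth = {fractional_bandwidth(10.6, 11.3):.2%}")
```

For the quoted values this works out to a reflection magnitude of roughly 0.086, a VSWR of about 1.19, and a fractional bandwidth of about 6.4%. The mean of the two band edges is 10.95 GHz, which matches the stated operating frequency.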
Pub Date: 2022-05-06 DOI: 10.1109/PCEMS55161.2022.9807846
Sachin N. Kapgate, Pravin Sahu, M. Das, Deep Gupta
This paper presents an embedded robotic system that uses Kinect sensor technology to detect an individual target and track its movement in the surroundings. The system integrates computer vision and embedded robotics. A skeleton-based tracking algorithm is used because tracking plays an important role in localization and mapping. Instead of a standard touch-based control system, the system uses gesture-based recognition that relies on the Microsoft Kinect for Xbox 360, and it can be used as a stand-alone system or integrated as a subsystem into a larger system. The Kinect sensor captures 3-dimensional information about the surroundings and recognizes the human body from depth information, so the person does not need to wear any intrusive sensors. The robot follows an individual by detecting the torso point, which is used for steering and for maintaining a fixed safe distance for localization and mapping, providing a robust and reliable system. The proposed system can be used in a wide variety of applications such as work assistants and luggage-carrying carts.
{"title":"Human Following Robot using Kinect in Embedded Platform","authors":"Sachin N. Kapgate, Pravin Sahu, M. Das, Deep Gupta","doi":"10.1109/PCEMS55161.2022.9807846","DOIUrl":"https://doi.org/10.1109/PCEMS55161.2022.9807846","publicationDate":"2022-05-06"}
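The idea of steering toward the torso point while holding a fixed distance, as described in the human-following abstract, can be sketched as a proportional controller. This is only an illustration under stated assumptions, not the authors' control law: the gains, the 1.5 m target distance, the speed limits, the function name `follow_command`, and the sign conventions (positive x to the robot's right, positive angular velocity counterclockwise) are all hypothetical.

```python
import math
from typing import Tuple


def _clamp(value: float, limit: float) -> float:
    """Restrict a command to the symmetric range [-limit, limit]."""
    return max(-limit, min(limit, value))


def follow_command(
    torso_x_m: float,
    torso_z_m: float,
    target_distance_m: float = 1.5,  # hypothetical safe following distance
    k_linear: float = 0.8,           # hypothetical proportional gains
    k_angular: float = 1.5,
    max_linear: float = 0.6,         # hypothetical speed limits (m/s, rad/s)
    max_angular: float = 1.0,
) -> Tuple[float, float]:
    """Return (linear, angular) velocity from the torso position in the sensor frame.

    torso_x_m is the lateral offset and torso_z_m the depth reported for the torso joint.
    """
    distance_error = torso_z_m - target_distance_m
    bearing = math.atan2(torso_x_m, torso_z_m)  # angle of the person off the optical axis
    linear = _clamp(k_linear * distance_error, max_linear)
    angular = _clamp(-k_angular * bearing, max_angular)  # turn right for a target on the right
    return linear, angular


if __name__ == "__main__":
    print(follow_command(0.0, 3.0))  # person straight ahead and too far: drive forward
    print(follow_command(1.0, 1.5))  # person to the right: turn clockwise
```

Clamping both outputs keeps the commands bounded when the person is far away or tracking briefly jumps, which matters on a small embedded platform.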
Pub Date: 2022-05-06 DOI: 10.1109/PCEMS55161.2022.9808012
Aman Verma, Raghav Agrawal, Priyanka Singh, N. Ansari
Speech emotion recognition has advanced considerably thanks to progress in deep learning algorithms. These algorithms can extract features from the data and learn to recognize patterns in them. Although they can successfully recognize emotions, their efficiency is often debated. The main objective of this paper is to efficiently classify a person's emotional state from speech signals using traditional machine learning and deep learning techniques, and to present a comparative analysis. We consider eight types of emotion and analyze them in two ways: first, combining male and female emotions (gender-neutral), giving eight classes; and second, treating male and female emotions separately (gender-based), giving 16 classes. We test several architectures, including K-Nearest Neighbor (KNN), Multilayer Perceptron (MLP), and a One-Dimensional Convolutional Neural Network with Long Short-Term Memory (1D CNN+LSTM), tuning the hyperparameters to classify the emotional states. The best results are obtained with the 1D CNN+LSTM model, with an accuracy of 87.4% for the gender-neutral case and 82.78% for the gender-based case. This model outperforms existing techniques.
{"title":"An Acoustic Analysis of Speech for Emotion Recognition using Deep Learning","authors":"Aman Verma, Raghav Agrawal, Priyanka Singh, N. Ansari","doi":"10.1109/PCEMS55161.2022.9808012","DOIUrl":"https://doi.org/10.1109/PCEMS55161.2022.9808012","publicationDate":"2022-05-06"}