Pub Date : 2023-03-31 DOI: 10.1109/CSCITA55725.2023.10104750
Kunal Bhujbal, Dr. Mahendra Pawar
Self-driving cars have become a trending subject as the underlying technologies have improved significantly over the last decade. The purpose of this project is to train a convolutional neural network (CNN) to drive an autonomous car agent on the tracks of Udacity’s Car Simulator environment. Udacity has released the simulator as open-source software. Driving a car autonomously requires learning to control the steering angle, throttle, and brakes. The behavioral cloning technique is used to mimic human driving behavior recorded on the track in training mode. That is, a dataset is generated in the simulator by a user driving the car in training mode, and NVIDIA’s convolutional neural network model, trained on that dataset, then drives the car in autonomous mode. Augmentation and image pre-processing are used to increase the accuracy of the CNN model.
{"title":"Deep Learning Model for Simulating Self Driving Car","authors":"Kunal Bhujbal, Dr. Mahendra Pawar","doi":"10.1109/CSCITA55725.2023.10104750","DOIUrl":"https://doi.org/10.1109/CSCITA55725.2023.10104750","url":null,"abstract":"Self-driving cars have become a trending subject with a significant improvement in the technologies in the last decade. The project purpose is to train a convolutional neural network to drive an autonomous car agent on the tracks of Udacity’s Car Simulator environment. Udacity has released the simulator as an open source software. Driving a car in an autonomous manner requires learning to control steering angle, throttle and brakes. Behavioral cloning technique is used to mimic human driving behavior in the training mode on the track. That means a dataset is generated in the simulator by a user driven car in training mode, and the NVIDIA’s convolutional neural network model then drives the car in autonomous mode. Augmentation and image pre-processing are used to increase the accuracy of CNN model.","PeriodicalId":224479,"journal":{"name":"2023 International Conference on Communication System, Computing and IT Applications (CSCITA)","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127517946","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
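The augmentation step mentioned in the abstract above can be illustrated with a horizontal flip, a common choice in behavioral-cloning pipelines. This is only a minimal sketch: the abstract does not say which augmentations the authors used, the function name `flip_sample` is hypothetical, and it assumes steering angles are symmetric about zero so that mirroring the frame negates the angle.

```python
def flip_sample(image, steering):
    """Mirror a camera frame left-to-right and negate its steering label.

    `image` is a list of pixel rows (any pixel type); `steering` is a float.
    Assumption (not stated in the abstract): a steering angle of 0.0 means
    "straight ahead" and the sign encodes left versus right.
    """
    flipped = [row[::-1] for row in image]  # reverse every row of pixels
    return flipped, -steering


# A tiny 2x3 "image" recorded while steering slightly to one side.
frame = [[1, 2, 3],
         [4, 5, 6]]
mirrored_frame, mirrored_angle = flip_sample(frame, 0.25)
```

Adding the mirrored copy of each recorded frame doubles the dataset and offsets the left-turn or right-turn bias of a track that is driven in one direction.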
Pub Date : 2023-03-31 DOI: 10.1109/CSCITA55725.2023.10105056
Jay Sanghvi, Jay Rathod, Sakshi Nemade, Hasti Panchal, A. Pavate
As more and more logos are produced, logo detection has steadily grown in popularity as a research topic across numerous tasks and sectors. Deep learning-based solutions, which make use of numerous data sets, learning techniques, network designs, etc., have dominated recent advancements in this field. This research examines the progress made in the field of logo detection using deep learning approaches. To evaluate the efficacy of logo detection algorithms, which tend to be more diverse, more difficult, and more reflective of real-life conditions, we first present a thorough background of the topic. The advantages and disadvantages of each learning approach are then analysed in detail, along with the current logo detection strategies. To wrap up this study, we examine probable obstacles and outline future directions for the development of logo detection.
{"title":"Logo Detection Using Machine Learning Algorithm : A Survey","authors":"Jay Sanghvi, Jay Rathod, Sakshi Nemade, Hasti Panchal, A. Pavate","doi":"10.1109/CSCITA55725.2023.10105056","DOIUrl":"https://doi.org/10.1109/CSCITA55725.2023.10105056","url":null,"abstract":"As more and more logos are produced, logo detection has gradually grown in popularity as study across numerous jobs and sectors. Deep learning-based solutions, which make use of numerous data sets,learning techniques, network designs, etc., have dominated recent advancements in this field. This research examines the progress made in the field of logo detection using deep learning approaches. In order to evaluate the efficacy of logo detection algorithms, which tend to be more diversified, difficult, and realistically reflective of real life, we first discuss a thorough background of the topic. The pros and disadvantages of each learning approach are then thoroughly analysed, along with the current logo detection strategies.To wrap up this study, we examine probable obstacles and provide the future directions for logo detecting development.","PeriodicalId":224479,"journal":{"name":"2023 International Conference on Communication System, Computing and IT Applications (CSCITA)","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124730729","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
For many years, stock market portfolio management has attracted the interest of academics from computer science, finance, and mathematics worldwide. The main focus of investors and fund managers in the financial markets is to monitor and manage investment portfolios successfully. This paper describes the development of a web application that assists small equity investors in checking and monitoring the health of an individual stock as well as the overall health of the user's portfolio. With minimal knowledge of the stock market, one can build a well-customized portfolio. The application also flags risky stocks, which helps investors minimize risk. It adapts to changes made in the portfolio and provides other features that equity investors need. In this paper, an algorithm is proposed that evaluates the health of a stock based on parameters such as P/E Ratio, Dividend Yield, Debt to Equity, Industry P/E, ROE, ROCE, PEG Ratio, Profit Growth over the past 5 years, Sales Growth over the past 5 years, and Sector. The entire web application will be hosted on the AWS cloud, leveraging its services to make it more accessible and scalable.
{"title":"Stock Portfolio Health Monitoring System","authors":"Soham Shinde, Aditya Ware, Sachin Yadav, Aldrin Paul, Ramjee Yadav","doi":"10.1109/CSCITA55725.2023.10105068","DOIUrl":"https://doi.org/10.1109/CSCITA55725.2023.10105068","url":null,"abstract":"For many years, stock market portfolio management has been successful in attracting the interest of several academics from the domains of computer science, finance, and mathematics worldwide. The main focus of investors and fund managers in the financial markets is to successfully monitor as well as manage investment portfolios. This paper is based on developing a Web Application which will assist small equity investors in checking and monitoring the health of an individual stock as well as the overall health of users portfolio. With minimal knowledge of stock market, one can build a great customized portfolio. The application will also notify the risky stocks which will help the investors to minimize the risk. It will be able to adapt the changes made in the portfolio and will also have other features which are needed by equity investors. In this paper, an algorithm has been proposed which will evaluate the health of the stock depending upon the parameters such as P/E Ratio, Dividend Yield, Debt to Equity, Industry P/E, ROE, ROCE, PEG Ratio, Profit Growth of past 5 years, Sales Growth of past 5 years and Sector. 
The entire web application will be hosted on the AWS cloud by leveraging its services to make it more accessible and scalable.","PeriodicalId":224479,"journal":{"name":"2023 International Conference on Communication System, Computing and IT Applications (CSCITA)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124931836","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
As the crypto market develops, new projects emerge with blockchains and tokens aimed at achieving specific goals. Some of them aim to outperform Ethereum by providing developers with improved scalability, low to no fees, and other benefits. Others are designed to be used only in decentralized applications such as online casinos or cryptocurrency loan services. This variety of options eventually leads to the need to exchange one cryptocurrency for another, just as we would exchange dollars, euros, and yen. Numerous applications and blockchain platforms facilitate the exchange of one token for another. However, the process has several complications. On some platforms, users have to write long stretches of code in order to swap tokens; on others, the transaction fee is very high, which makes it difficult for a token owner to swap tokens and still earn a profit. To solve this problem, we have developed a web application, “dExCount”, that helps token sellers sell their tokens at a discounted price by creating a discount pool without writing any code. This blockchain platform will help cryptocurrency owners increase the brand value of their tokens by hosting them on the platform for sale at a discounted rate. The token owner can provide detailed information about the token, such as social media links, a website, and other details.
{"title":"dExCount: A decentralized cross-chain discount web app for Token Sale","authors":"Kavita Sonawane, Yogesh Singh Nayal, Durgesh Palekar, Harshit Shetty, Vivek Pinto","doi":"10.1109/CSCITA55725.2023.10104639","DOIUrl":"https://doi.org/10.1109/CSCITA55725.2023.10104639","url":null,"abstract":"As the crypto market develops, new projects emerge with blockchains and tokens aimed at achieving specific goals. Some of them aim to outperform Ethereum by providing developers with improved scalability, low to no fees, and other benefits. Others are designed to be used only in decentralized applications such as online casinos or cryptocurrency loan services. This incredible variety of options eventually leads to the need to exchange one cryptocurrency for another, just as we would exchange dollars, euros, and yen. In the market, there are numerous ways to exchange cryptocurrencies. There are numerous applications and blockchain platforms that facilitate the exchange of cryptocurrencies from one token to another. However, there are numerous complications throughout the process. In some platforms, you have to write long lines of code in order to swap tokens or the transaction fee in some platforms is very high which makes it difficult for a token owner to swap the tokens and also earn profit with it. Therefore to solve this problem, we have come up with a web application ‘‘dExCount’’ that helps the token seller to sell their tokens at a discounted price by creating a discount pool without writing any code. This system blockchain platform will help cryptocurrency owners to increase the brand value of the tokens by hosting the tokens on the platform for sale at a discounted rate. 
The token owner can give detailed information about the token like the social media links, website, and various other information about the tokens.","PeriodicalId":224479,"journal":{"name":"2023 International Conference on Communication System, Computing and IT Applications (CSCITA)","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123551037","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-03-31 DOI: 10.1109/CSCITA55725.2023.10105116
Eleesa Anil, Sherine Sebastian, Janice Johnson, Janhavi S Rane, K. Karunakaran
With the ever-growing availability of high-speed internet, videos have become a common medium for information on the web. When we want to learn about anything from educational topics to entertainment, we prefer watching videos to reading long paragraphs. Given the vast diversity of videos available on the internet today on every possible topic, it is hard to find the right content for our needs. People end up spending time searching for a good video instead of on the actual work the video is needed for. Because video content is such a large part of our information sources today, it is necessary to have a system that lets users grasp the gist of a video instead of sitting through hours of content only to find nothing useful. The primary objective of this paper is to propose a method for creating a video summary that contains only the necessary and important information in a concise format, using NLP algorithms such as TextRank, LexRank, and LSA (Latent Semantic Analysis).
{"title":"Summarization of Video Clips using Subtitles","authors":"Eleesa Anil, Sherine Sebastian, Janice Johnson, Janhavi S Rane, K. Karunakaran","doi":"10.1109/CSCITA55725.2023.10105116","DOIUrl":"https://doi.org/10.1109/CSCITA55725.2023.10105116","url":null,"abstract":"Due to the ever growing world of high speed internet, videos have become a common medium for information on the web. When we want to gain information about anything from educational topics to entertainment we prefer watching videos instead of reading long paragraphs. With the vast diversity of videos available on the internet today on every single topic possible it gets confusing to find the right content for our needs. People end up wasting time on trying to find a good video instead of on the actual work the video is needed for. Video content being such a big part of our information source today it is necessary to have a system that will enable users to understand a gist of the video instead of having to sit through hours of content just to find nothing useful. The primary objective of this given paper is to propose a method to create a video summary in a way that it contains only the necessary and important information in a concise format by using various NLP algorithms such as Textrank, LexRank and LSA(Latent Semantic Analysis).","PeriodicalId":224479,"journal":{"name":"2023 International Conference on Communication System, Computing and IT Applications (CSCITA)","volume":"71 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123737170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
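To make the TextRank idea named in the abstract above concrete, the following is a minimal extractive summarizer over subtitle text. It is a sketch under stated assumptions rather than the authors' pipeline: subtitles are assumed to be already extracted as plain text, the function names are hypothetical, and the sentence similarity is a word-overlap measure in the style of TextRank computed over unique words.

```python
import math
import re


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(words_a, words_b):
    """Word overlap normalised by the log of each sentence's unique-word count."""
    set_a, set_b = set(words_a), set(words_b)
    if not set_a or not set_b:
        return 0.0
    denom = math.log(len(set_a)) + math.log(len(set_b))
    return len(set_a & set_b) / denom if denom > 0 else 0.0


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences with PageRank-style iteration on the similarity graph."""
    tokens = [re.findall(r"[a-z0-9']+", s.lower()) for s in sentences]
    n = len(sentences)
    weights = [[similarity(tokens[i], tokens[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_weight = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_weight[j] * scores[j]
                for j in range(n) if out_weight[j] > 0)
            for i in range(n)
        ]
    return scores


def summarize(text, k=2):
    """Return the k highest-scoring sentences in their original order."""
    sentences = split_sentences(text)
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

A sentence that shares no words with the rest of the subtitles receives only the baseline score of `1 - damping`, so off-topic remarks tend to drop out of the summary first.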
Pub Date : 2023-03-31 DOI: 10.1109/CSCITA55725.2023.10104784
Vikas Varshney, J. Panda, Rashmi Gupta
Because light is scattered in the atmosphere, images captured outdoors are often hazy and also suffer from noise, color distortion, block artifacts, and low intensity. This paper proposes a new approach that addresses these problems to obtain a better dehazed image. The methodology applies the Dark Channel Prior (DCP) algorithm, followed by a multi-scale switching morphological operator (MSMO) and contrast limited adaptive histogram equalization (CLAHE). Two inputs are derived by applying the MSMO and CLAHE techniques to the DCP output image, and the final dehazed image is then obtained through linear fusion of the two. Extensive experiments have been carried out on images collected from the BeDDE dataset. The results demonstrate that the proposed approach significantly improves the quality of dehazed images in terms of better color preservation and reduced noise and blocking artifacts.
{"title":"An Effective Technique for Single Image Haze Removal using MSMO","authors":"Vikas Varshney, J. Panda, Rashmi Gupta","doi":"10.1109/CSCITA55725.2023.10104784","DOIUrl":"https://doi.org/10.1109/CSCITA55725.2023.10104784","url":null,"abstract":"Due to scattering of light in an atmosphere, hazy images along with noise, color distortions, block artifacts and low intensity are obtained during the image capturing process. The paper proposes a new approach to deal with the problems as mentioned to achieve a better dehazed image. The methodology involves the Dark Channel Prior (DCP) algorithm followed by multi-scale switching morphological operator (MSMO) and contrast limited adaptive histogram equalization (CLAHE). The two inputs are derived by applying MSMO and CLAHE techniques on DCP algorithm based output image and then final dehazed image is obtained through linear fusion. Extensive experiments have been done on various images collected from BeDDE dataset. Results achieved by the proposed approach demonstrate that the quality of dehazed images have significant improvements in terms of better color preservation, reduced noise and blocking artifacts.","PeriodicalId":224479,"journal":{"name":"2023 International Conference on Communication System, Computing and IT Applications (CSCITA)","volume":"174 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122998170","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
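The first stage named in the abstract above, the Dark Channel Prior, rests on the dark channel of an image: for each pixel, the minimum intensity over the color channels and over a local patch. The sketch below computes that quantity only; it is an illustration, not the authors' implementation. The function name, the nested-list image representation, and the patch radius are assumptions, and the later MSMO, CLAHE, and fusion steps are not shown.

```python
def dark_channel(image, radius=1):
    """Compute the dark channel of an RGB image.

    `image` is a list of rows, each row a list of (r, g, b) tuples with
    values in [0, 1]. For every pixel the result holds the minimum channel
    value found in the (2 * radius + 1)-wide square patch around it,
    clipped at the image borders.
    """
    height, width = len(image), len(image[0])
    # Step 1: per-pixel minimum over the three color channels.
    min_rgb = [[min(pixel) for pixel in row] for row in image]
    # Step 2: minimum filter over the local patch.
    dark = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            dark[y][x] = min(
                min_rgb[yy][xx]
                for yy in range(max(0, y - radius), min(height, y + radius + 1))
                for xx in range(max(0, x - radius), min(width, x + radius + 1))
            )
    return dark


# One-row example: the middle pixel has a low blue value.
row_image = [[(0.9, 0.8, 0.7), (0.5, 0.6, 0.4), (0.9, 0.9, 0.9)]]
patch_dark = dark_channel(row_image, radius=1)
```

In haze-free outdoor regions the dark channel tends to be close to zero, so larger values serve as a cue for haze density when estimating transmission.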
Pub Date : 2023-03-31 DOI: 10.1109/CSCITA55725.2023.10104840
Joshua Dsouza, Selina Ger, Leni Wilson, Nikhil Lobo, Nitika Rai
This paper discusses a framework for developing a virtual tour of the campus of an institute of higher education. The aim is to create a sense of realism in the virtual tour by using virtual reality (VR) and highly textured three-dimensional (3D) modelling. The framework is developed to give prospective and current students and other stakeholders a virtual experience of the entire campus, its infrastructure, and all the facilities that it has to offer. It allows users to navigate through the campus and read brief information about the major hotspots within it. The virtual tour can be used to spread awareness and to give stakeholders a brief overview of the campus without having to visit it physically.
{"title":"A Framework for Development of a Virtual Campus Tour","authors":"Joshua Dsouza, Selina Ger, Leni Wilson, Nikhil Lobo, Nitika Rai","doi":"10.1109/CSCITA55725.2023.10104840","DOIUrl":"https://doi.org/10.1109/CSCITA55725.2023.10104840","url":null,"abstract":"This paper discusses a framework for development of a virtual tour of a campus of an institute of higher education. The aim is to implement a sense of simulating realism using virtual reality (VR) and high textured three-dimensional (3D) modelling into creating a virtual tour of a campus. This framework is developed with the aim to provide the prospective and current students and other stakeholders a virtual experience of the entire campus, its infrastructure and all the facilities that it has to offer. It allows the users to navigate through the campus and can read brief information about the major hotspots within the campus. The virtual tour can be used to spread awareness and help stakeholders to get a brief overview of the campus without having to step into campus physically.","PeriodicalId":224479,"journal":{"name":"2023 International Conference on Communication System, Computing and IT Applications (CSCITA)","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116324198","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2023-03-31 DOI: 10.1109/CSCITA55725.2023.10104672
Vaishnavi S. Narkhede, Om Surushe, S. Kulkarni, Harshad B Solanki, Tejas Ekbote, Deepali J. Joshi
In an era of rapid technological advancement, artificial intelligence (AI) and machine learning (ML) are transforming the way we work and interact with the world around us. The hiring process is a crucial aspect of any organization, as it determines the quality of the workforce and the success of the business. However, traditional hiring methods can be time-consuming and prone to bias. In this paper, we propose an approach to hiring that uses AI and ML to automate the process and improve its efficiency. Our proposed system allows managers to specify their requirements and receive a shortlist of candidates based on their skills, experience, and performance in a one-on-one interview with a photorealistic AI bot. The bot also assesses candidates’ confidence and body language and ranks them accordingly. By using AI and ML in the hiring process, we can save time and reduce bias, leading to better-quality hires and a more productive workforce.
{"title":"AVA: A Photorealistic AI Bot for Human-like Interaction and Extended Reality","authors":"Vaishnavi S. Narkhede, Om Surushe, S. Kulkarni, Harshad B Solanki, Tejas Ekbote, Deepali J. Joshi","doi":"10.1109/CSCITA55725.2023.10104672","DOIUrl":"https://doi.org/10.1109/CSCITA55725.2023.10104672","url":null,"abstract":"In the era of rapid technological advancement, artificial intelligence (AI) and machine learning (ML) are transforming the way we work and interact with the world around us. The hiring process is a crucial aspect of any organization, as it determines the quality of the workforce and the success of the business. However, traditional hiring methods can be time-consuming and prone to bias. In this paper, we propose a better approach to hiring that leverages the power of Artificial Intelligence (AI) and machine learning (ML) to automate and improve the efficiency of the process. Our proposed system allows managers to specify their requirements and receive a shortlist of candidates based on their skills, experience, and performance in a one-on-one interview with a photorealistic artificial intelligence bot. The bot also assesses candidates’ confidence and body language to rank them accordingly. 
By using AI and machine learning in the hiring process, we can save time and reduce bias, leading to better-quality hires and a more productive workforce.","PeriodicalId":224479,"journal":{"name":"2023 International Conference on Communication System, Computing and IT Applications (CSCITA)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130777639","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Existing audio source separation models usually operate on the magnitude spectrum and neglect phase information. Working directly on the waveform avoids this but is harder, because high sampling rates mean the model must capture long-range temporal correlations. Audio source separation is a long-standing problem for which only a handful of solutions have been presented. This research work presents a Wave-U-Net architecture with a Spectral Loss Function that separates input audio into multiple audio files, one per instrument sound, along with vocals. The existing Wave-U-Net architecture with a Mean Square Error (MSE) loss function provides poor-quality results owing to limited training on specific instruments and the use of MSE as an evaluation parameter. When choosing a loss function, shift invariance is an important aspect that should be taken into consideration. This research work uses the Spectral Loss Function together with the Wave-U-Net architecture, which automatically syncs the phase even if two audio sources are out of sync. The Spectral Loss Function solves the problem of shift invariance. The MUSDB18 dataset is used to train the proposed model, and the results are compared using evaluation metrics such as Signal to Distortion Ratio (SDR). After implementing the Wave-U-Net architecture with the Spectral Loss Function, it is observed that the accuracy of the system improves significantly.
{"title":"Audio Source Separation using Wave-U-Net with Spectral Loss","authors":"Varun Patkar, Tanish Parmar, Parth Narvekar, Vedant Pawar, Joanne Gomes","doi":"10.1109/CSCITA55725.2023.10104853","DOIUrl":"https://doi.org/10.1109/CSCITA55725.2023.10104853","url":null,"abstract":"Existing Audio Source Separation models usually operate using magnitude spectrum and neglect the phase information which results in long-range temporal correlations because of its high sampling rates. Audio source separation has been a problem since long and only a handful of solutions have been presented for it. This research work presents a Wave-U-Net architecture with Spectral Loss Function which separates input audio into multiple audio file of different instrument sounds along with vocals. Existing Wave-U-Net Architecture with Mean Square Error (MSE) loss function provides poor quality results due to lack of training on only specific instruments and use of MSE as an evaluation parameter. While commenting about the loss functions, shift invariance is an important aspect that should be taken into consideration. This research work makes use of Spectral Loss Function in coordination with Wave-U-Net architecture, which automatically syncs the phase even if two audio sources are asynchronised. Spectral Loss Function solves the problem of shift invariance. The MUSDB18 Dataset is used to train the proposed model and the results are compared using evaluation metrics such as Signal to Distortion Ratio (SDR). 
After successful implementation of the Wave-U-Net Architecture with Spectral Loss Function it is observed that the accuracy of the system has been improved significantly.","PeriodicalId":224479,"journal":{"name":"2023 International Conference on Communication System, Computing and IT Applications (CSCITA)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127795051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
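The shift-invariance argument in the abstract above can be illustrated by comparing a sample-wise MSE with a loss computed on Fourier magnitudes. This is a minimal sketch, not the paper's loss: the abstract does not give the exact definition, the function names are hypothetical, a naive DFT stands in for an FFT or STFT, and the shift is circular so that the magnitudes match exactly.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitude of each DFT bin, computed with a naive O(n^2) transform."""
    n = len(signal)
    return [
        abs(sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n)
    ]


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def spectral_loss(prediction, target):
    """MSE between magnitude spectra; phase is discarded before comparing."""
    return mse(dft_magnitudes(prediction), dft_magnitudes(target))


# A sine wave and a copy delayed by 4 samples (circular shift).
target = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
shifted = target[4:] + target[:4]

time_domain_error = mse(shifted, target)          # large: samples no longer line up
magnitude_error = spectral_loss(shifted, target)  # near zero: same frequency content
```

A time shift changes only the phase of each frequency bin, so the magnitude-based loss does not penalize an estimate that is correct but slightly misaligned, whereas sample-wise MSE does. For real audio frames the shift is not circular, so the invariance holds only approximately.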
Lip reading began as a way to help deaf people and has gradually evolved into a service that the digital entertainment industry has started to use. With the recent rise of AI, automated technologies have reached lip reading as well. Various algorithms have been devised using neural network methodologies. We observe that many of the algorithms reviewed explore different techniques, ranging from the detection of lip features to the text generation process itself. Given the amount of research done in the field, there is still room for better and more optimized lip detection. The study emphasizes the use of machine learning and deep learning technologies and thus provides a clear view of the bigger picture of the integration of AI into the visual lip reading domain.
{"title":"Survey on Visual Speech Recognition using Deep Learning Techniques","authors":"Ritika Chand, Pushpit Jain, Abhinav Mathur, Shiwansh Raj, Prashasti Kanikar","doi":"10.1109/CSCITA55725.2023.10104811","DOIUrl":"https://doi.org/10.1109/CSCITA55725.2023.10104811","url":null,"abstract":"Lip Reading has evolved and from where it began to help deaf people has slowly turned into a service where in the Digital Entertainment industry has started utilizing it. With the recent rise of AI, automated technologies have touched the boundaries of Lip Reading as well. Various Algorithms have been devised using Neural Network Methodologies. We observe that a lot of the algorithms reviewed, have been exploring various techniques whether it be a variation from detecting lip features to the text generation process itself.With the amount of research done in the field, one can always look out towards a better & optimized lip detection. The study emphasizes more towards looking at the utilization of the Machine Learning & Deep Learning technologies and thus provides a vivid view at the bigger picture of the interpolation of AI in the Visual based Lip Reading domain.","PeriodicalId":224479,"journal":{"name":"2023 International Conference on Communication System, Computing and IT Applications (CSCITA)","volume":"134 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133505173","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}