2023 International Conference on Communication System, Computing and IT Applications (CSCITA): Latest Papers

Deep Learning Model for Simulating Self Driving Car

2023 International Conference on Communication System, Computing and IT Applications (CSCITA)

Pub Date : 2023-03-31 DOI: 10.1109/CSCITA55725.2023.10104750

Kunal Bhujbal, Dr. Mahendra Pawar

Self-driving cars have become a trending subject, driven by significant improvements in the underlying technologies over the last decade. The purpose of this project is to train a convolutional neural network to drive an autonomous car agent on the tracks of Udacity’s Car Simulator environment. Udacity has released the simulator as open-source software. Driving a car autonomously requires learning to control the steering angle, throttle, and brakes. A behavioral cloning technique is used to mimic human driving behavior recorded in training mode on the track: a dataset is generated in the simulator by a user-driven car in training mode, and NVIDIA’s convolutional neural network model then drives the car in autonomous mode. Augmentation and image pre-processing are used to increase the accuracy of the CNN model.
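
The abstract does not say which augmentations were applied. As a minimal sketch, assuming one common choice for behavioral cloning (a horizontal flip paired with a negated steering angle), the idea could look like this in Python. The names `flip_sample` and `augment` and the `flip_probability` parameter are hypothetical and do not come from the paper.

```python
import random


def flip_sample(image, steering_angle):
    """Mirror an image left-to-right and negate the steering angle.

    `image` is a list of rows, each row a list of pixel values.
    This is an assumed augmentation, not one confirmed by the paper.
    """
    flipped = [list(reversed(row)) for row in image]
    return flipped, -steering_angle


def augment(samples, flip_probability=0.5, seed=0):
    """Randomly flip (image, steering_angle) pairs recorded in training mode."""
    rng = random.Random(seed)
    augmented = []
    for image, angle in samples:
        if rng.random() < flip_probability:
            image, angle = flip_sample(image, angle)
        augmented.append((image, angle))
    return augmented
```

Flipping is attractive in this setting because a simulator track that mostly turns one way would otherwise bias the recorded steering angles; the mirrored copy balances left and right turns without extra driving.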

Citations: 0

Logo Detection Using Machine Learning Algorithm: A Survey

2023 International Conference on Communication System, Computing and IT Applications (CSCITA)

Pub Date : 2023-03-31 DOI: 10.1109/CSCITA55725.2023.10105056

Jay Sanghvi, Jay Rathod, Sakshi Nemade, Hasti Panchal, A. Pavate

As more and more logos are produced, logo detection has gradually grown in popularity as a subject of study across numerous jobs and sectors. Deep learning-based solutions, which make use of numerous data sets, learning techniques, network designs, etc., have dominated recent advancements in this field. This research examines the progress made in the field of logo detection using deep learning approaches. In order to evaluate the efficacy of logo detection algorithms, which tend to be more diversified, difficult, and realistically reflective of real life, we first discuss a thorough background of the topic. The advantages and disadvantages of each learning approach are then thoroughly analysed, along with the current logo detection strategies. To wrap up this study, we examine probable obstacles and provide future directions for logo detection development.

Citations: 1

Stock Portfolio Health Monitoring System

2023 International Conference on Communication System, Computing and IT Applications (CSCITA)

Pub Date : 2023-03-31 DOI: 10.1109/CSCITA55725.2023.10105068

Soham Shinde, Aditya Ware, Sachin Yadav, Aldrin Paul, Ramjee Yadav

For many years, stock market portfolio management has attracted the interest of academics from the domains of computer science, finance, and mathematics worldwide. The main focus of investors and fund managers in the financial markets is to successfully monitor and manage investment portfolios. This paper is based on developing a web application that will assist small equity investors in checking and monitoring the health of an individual stock as well as the overall health of the user’s portfolio. With minimal knowledge of the stock market, one can build a well-customized portfolio. The application will also flag risky stocks, which will help investors minimize risk. It will be able to adapt to changes made in the portfolio and will also have other features needed by equity investors. In this paper, an algorithm is proposed that evaluates the health of a stock based on parameters such as P/E Ratio, Dividend Yield, Debt to Equity, Industry P/E, ROE, ROCE, PEG Ratio, Profit Growth of the past 5 years, Sales Growth of the past 5 years, and Sector. The entire web application will be hosted on the AWS cloud, leveraging its services to make it more accessible and scalable.
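
The abstract lists the input parameters but not the scoring rule, so the following Python sketch only illustrates the general shape such an evaluation might take. Every threshold, the equal weighting of checks, and the names `StockMetrics`, `health_score`, and `portfolio_health` are hypothetical assumptions, not the authors' algorithm. The Sector parameter is left out because the abstract does not say how it is used.

```python
from dataclasses import dataclass


@dataclass
class StockMetrics:
    pe_ratio: float
    industry_pe: float
    debt_to_equity: float
    roe: float               # percent
    roce: float              # percent
    peg_ratio: float
    profit_growth_5y: float  # percent
    sales_growth_5y: float   # percent
    dividend_yield: float    # percent


def health_score(m: StockMetrics) -> float:
    """Return the fraction of checks passed, between 0.0 and 1.0.

    All thresholds below are illustrative placeholders.
    """
    checks = [
        m.pe_ratio < m.industry_pe,   # cheaper than industry peers
        m.debt_to_equity < 1.0,
        m.roe > 15.0,
        m.roce > 15.0,
        0.0 < m.peg_ratio < 1.5,
        m.profit_growth_5y > 10.0,
        m.sales_growth_5y > 10.0,
        m.dividend_yield > 1.0,
    ]
    return sum(checks) / len(checks)


def portfolio_health(holdings) -> float:
    """Weighted average score over (weight, StockMetrics) pairs."""
    total_weight = sum(weight for weight, _ in holdings)
    return sum(w * health_score(m) for w, m in holdings) / total_weight
```

A stock whose score falls below some cutoff could then be reported as risky; where that cutoff sits is again a design decision the abstract leaves open.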

Citations: 0

dExCount: A decentralized cross-chain discount web app for Token Sale

2023 International Conference on Communication System, Computing and IT Applications (CSCITA)

Pub Date : 2023-03-31 DOI: 10.1109/CSCITA55725.2023.10104639

Kavita Sonawane, Yogesh Singh Nayal, Durgesh Palekar, Harshit Shetty, Vivek Pinto

As the crypto market develops, new projects emerge with blockchains and tokens aimed at achieving specific goals. Some of them aim to outperform Ethereum by providing developers with improved scalability, low to no fees, and other benefits. Others are designed to be used only in decentralized applications such as online casinos or cryptocurrency loan services. This variety of options eventually leads to the need to exchange one cryptocurrency for another, just as we would exchange dollars, euros, and yen. There are numerous applications and blockchain platforms that facilitate the exchange of one token for another. However, there are numerous complications throughout the process. On some platforms, you have to write long lines of code in order to swap tokens; on others, the transaction fee is very high, which makes it difficult for a token owner to swap tokens and still earn a profit. To solve this problem, we have come up with a web application, “dExCount”, that helps token sellers sell their tokens at a discounted price by creating a discount pool without writing any code. This blockchain platform will help cryptocurrency owners increase the brand value of their tokens by hosting the tokens on the platform for sale at a discounted rate. The token owner can provide detailed information about the token, such as social media links, a website, and various other details.

Citations: 0

Summarization of Video Clips using Subtitles

2023 International Conference on Communication System, Computing and IT Applications (CSCITA)

Pub Date : 2023-03-31 DOI: 10.1109/CSCITA55725.2023.10105116

Eleesa Anil, Sherine Sebastian, Janice Johnson, Janhavi S Rane, K. Karunakaran

With the ever-growing reach of high-speed internet, videos have become a common medium for information on the web. When we want to learn about anything from educational topics to entertainment, we prefer watching videos to reading long paragraphs. With the vast diversity of videos available on the internet today on every possible topic, it gets confusing to find the right content for our needs. People end up wasting time trying to find a good video instead of spending it on the actual work the video is needed for. Since video content is such a big part of our information sources today, it is necessary to have a system that enables users to understand the gist of a video instead of having to sit through hours of content only to find nothing useful. The primary objective of this paper is to propose a method to create a video summary that contains only the necessary and important information in a concise format by using various NLP algorithms such as TextRank, LexRank, and LSA (Latent Semantic Analysis).
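
To make the extractive step concrete, here is a minimal TextRank-style sketch in Python that ranks subtitle sentences and keeps the top few in their original order. It is a simplification under stated assumptions: sentence similarity is plain Jaccard word overlap rather than the normalized overlap of the original TextRank formulation, subtitle timing data is ignored, and the function names are hypothetical rather than taken from the paper.

```python
import re


def split_sentences(text):
    """Split subtitle text into sentences on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def similarity(a, b):
    """Jaccard overlap of the word sets of two sentences (an assumed measure)."""
    words_a = set(re.findall(r"\w+", a.lower()))
    words_b = set(re.findall(r"\w+", b.lower()))
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def textrank(sentences, damping=0.85, iterations=50):
    """Score sentences by weighted PageRank over the similarity graph."""
    n = len(sentences)
    weights = [[similarity(sentences[i], sentences[j]) if i != j else 0.0
                for j in range(n)] for i in range(n)]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            rank = sum(weights[j][i] / out_sum[j] * scores[j]
                       for j in range(n) if out_sum[j] > 0)
            updated.append((1 - damping) / n + damping * rank)
        scores = updated
    return scores


def summarize(text, k=2):
    """Return the k highest-ranked sentences, in their original order."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    scores = textrank(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Sentences that share vocabulary with many others accumulate rank, so an off-topic aside in the subtitles tends to drop out of the summary.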

Citations: 0

An Effective Technique for Single Image Haze Removal using MSMO

2023 International Conference on Communication System, Computing and IT Applications (CSCITA)

Pub Date : 2023-03-31 DOI: 10.1109/CSCITA55725.2023.10104784

Vikas Varshney, J. Panda, Rashmi Gupta

Due to the scattering of light in the atmosphere, images captured in such conditions are hazy and suffer from noise, color distortions, block artifacts, and low intensity. The paper proposes a new approach to deal with these problems and achieve a better dehazed image. The methodology involves the Dark Channel Prior (DCP) algorithm followed by a multi-scale switching morphological operator (MSMO) and contrast limited adaptive histogram equalization (CLAHE). Two inputs are derived by applying the MSMO and CLAHE techniques to the output image of the DCP algorithm, and the final dehazed image is then obtained through linear fusion. Extensive experiments have been done on various images collected from the BeDDE dataset. Results achieved by the proposed approach demonstrate that the quality of the dehazed images has improved significantly in terms of better color preservation and reduced noise and blocking artifacts.
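
Two steps of this pipeline are simple enough to sketch directly: the dark channel that DCP is built on, and the final linear fusion of two enhanced images. The Python below is an illustrative sketch only. It does not implement MSMO, CLAHE, or the transmission and atmospheric light estimation of DCP, and the patch radius and fusion weight `alpha` are assumed values that the abstract does not specify.

```python
def dark_channel(image, patch_radius=1):
    """Compute the dark channel of an RGB image.

    `image` is an H x W list of (r, g, b) tuples with values in [0, 1].
    Each output pixel is the minimum over all channels of all pixels
    in the surrounding square patch.
    """
    height, width = len(image), len(image[0])
    min_rgb = [[min(pixel) for pixel in row] for row in image]
    dark = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            rows = range(max(0, y - patch_radius), min(height, y + patch_radius + 1))
            cols = range(max(0, x - patch_radius), min(width, x + patch_radius + 1))
            dark[y][x] = min(min_rgb[j][i] for j in rows for i in cols)
    return dark


def linear_fusion(first, second, alpha=0.5):
    """Blend two single-channel images: alpha * first + (1 - alpha) * second."""
    return [[alpha * a + (1 - alpha) * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(first, second)]
```

In haze-free outdoor images the dark channel is close to zero almost everywhere, which is why a bright dark channel serves as an estimate of haze thickness.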

Citations: 0

A Framework for Development of a Virtual Campus Tour

2023 International Conference on Communication System, Computing and IT Applications (CSCITA)

Pub Date : 2023-03-31 DOI: 10.1109/CSCITA55725.2023.10104840

Joshua Dsouza, Selina Ger, Leni Wilson, Nikhil Lobo, Nitika Rai

This paper discusses a framework for the development of a virtual tour of the campus of an institute of higher education. The aim is to create a sense of realism by using virtual reality (VR) and highly textured three-dimensional (3D) modelling to build a virtual tour of a campus. The framework is developed to provide prospective and current students and other stakeholders with a virtual experience of the entire campus, its infrastructure, and all the facilities it has to offer. It allows users to navigate through the campus and read brief information about the major hotspots within it. The virtual tour can be used to spread awareness and help stakeholders get a brief overview of the campus without having to step onto it physically.

Citations: 0

AVA: A Photorealistic AI Bot for Human-like Interaction and Extended Reality

2023 International Conference on Communication System, Computing and IT Applications (CSCITA)

Pub Date : 2023-03-31 DOI: 10.1109/CSCITA55725.2023.10104672

Vaishnavi S. Narkhede, Om Surushe, S. Kulkarni, Harshad B Solanki, Tejas Ekbote, Deepali J. Joshi

In the era of rapid technological advancement, artificial intelligence (AI) and machine learning (ML) are transforming the way we work and interact with the world around us. The hiring process is a crucial aspect of any organization, as it determines the quality of the workforce and the success of the business. However, traditional hiring methods can be time-consuming and prone to bias. In this paper, we propose a better approach to hiring that leverages the power of Artificial Intelligence (AI) and machine learning (ML) to automate and improve the efficiency of the process. Our proposed system allows managers to specify their requirements and receive a shortlist of candidates based on their skills, experience, and performance in a one-on-one interview with a photorealistic artificial intelligence bot. The bot also assesses candidates’ confidence and body language to rank them accordingly. By using AI and machine learning in the hiring process, we can save time and reduce bias, leading to better-quality hires and a more productive workforce.

Citations: 0

Audio Source Separation using Wave-U-Net with Spectral Loss

2023 International Conference on Communication System, Computing and IT Applications (CSCITA)

Pub Date : 2023-03-31 DOI: 10.1109/CSCITA55725.2023.10104853

Varun Patkar, Tanish Parmar, Parth Narvekar, Vedant Pawar, Joanne Gomes

Existing audio source separation models usually operate using the magnitude spectrum and neglect the phase information which results in long-range temporal correlations because of its high sampling rates. Audio source separation is a long-standing problem, and only a handful of solutions have been presented for it. This research work presents a Wave-U-Net architecture with a Spectral Loss Function, which separates input audio into multiple audio files of different instrument sounds along with vocals. The existing Wave-U-Net architecture with a Mean Square Error (MSE) loss function provides poor quality results due to lack of training on only specific instruments and use of MSE as an evaluation parameter. When discussing loss functions, shift invariance is an important aspect that should be taken into consideration. This research work uses the Spectral Loss Function together with the Wave-U-Net architecture, which automatically syncs the phase even if two audio sources are asynchronised. The Spectral Loss Function solves the problem of shift invariance. The MUSDB18 dataset is used to train the proposed model, and the results are compared using evaluation metrics such as Signal to Distortion Ratio (SDR). After successful implementation of the Wave-U-Net architecture with the Spectral Loss Function, it is observed that the accuracy of the system has improved significantly.
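
The shift-invariance argument can be illustrated with a small experiment: a time-shifted copy of a signal has a large sample-wise MSE against the original, while a loss computed on spectral magnitudes barely changes. The Python sketch below assumes a loss over the magnitudes of a single whole-signal DFT and a circular shift. The paper's actual Spectral Loss Function is not given in the abstract and may well differ (for example, by using short-time transforms), so the function names and this formulation are hypothetical.

```python
import cmath
import math


def dft_magnitudes(signal):
    """Magnitudes of the discrete Fourier transform of a real signal."""
    n = len(signal)
    return [abs(sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n)]


def mse_loss(prediction, target):
    """Mean squared error between two waveforms, sample by sample."""
    return sum((p - t) ** 2 for p, t in zip(prediction, target)) / len(target)


def spectral_loss(prediction, target):
    """Mean squared error between DFT magnitudes (phase is discarded)."""
    mag_p = dft_magnitudes(prediction)
    mag_t = dft_magnitudes(target)
    return sum((a - b) ** 2 for a, b in zip(mag_p, mag_t)) / len(mag_t)


# A sine wave and a copy circularly shifted by four samples.
target = [math.sin(2 * math.pi * 3 * t / 32) for t in range(32)]
shifted = target[4:] + target[:4]

waveform_error = mse_loss(shifted, target)       # large: samples are misaligned
magnitude_error = spectral_loss(shifted, target)  # near zero: same spectrum
```

A circular shift only changes the phase of each DFT coefficient, not its magnitude, which is why the second loss ignores the misalignment that dominates the first.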

Citations: 0

Survey on Visual Speech Recognition using Deep Learning Techniques

2023 International Conference on Communication System, Computing and IT Applications (CSCITA)

Pub Date : 2023-03-31 DOI: 10.1109/CSCITA55725.2023.10104811

Ritika Chand, Pushpit Jain, Abhinav Mathur, Shiwansh Raj, Prashasti Kanikar

Lip reading has evolved: what began as a way to help deaf people has slowly turned into a service that the digital entertainment industry has started utilizing. With the recent rise of AI, automated technologies have reached lip reading as well. Various algorithms have been devised using neural network methodologies. We observe that many of the algorithms reviewed explore different techniques, with variations ranging from the detection of lip features to the text generation process itself. Given the amount of research done in the field, one can expect better and more optimized lip detection. The study places more emphasis on the use of machine learning and deep learning technologies and thus provides a clear view of the bigger picture of how AI is being integrated into the visual lip reading domain.

Citations: 1
