A Visual Computing Unified Application Using Deep Learning and Computer Vision Techniques

International Journal of Interactive Mobile Technologies (iJIM), Pub Date: 2024-01-12, DOI: 10.3991/ijim.v18i01.42673

S. J., Meeradevi, S. Seema, Dayananda P, S. S., S. G., S. Rohith



Abstract

Vision Studio applies a range of modern deep learning and computer vision techniques to provide a broad set of image and video processing functions. Deep learning is a class of machine learning algorithms that uses multiple layers to progressively extract higher-level features from raw input, which suits inputs represented as matrices, such as the pixels of a photo or the frames of a video. Computer vision is a field of artificial intelligence that enables computers to interpret and understand visual data. The main functions implemented are deepfake creation, digital aging (de-aging), image animation, and deepfake detection. Deepfake creation uses deep learning methods, particularly autoencoders, to overlay source images onto a target video, producing a video in which the source person appears to say and do what the target person does. Digital aging uses generative adversarial networks (GANs) to simulate an individual's aging process. Image animation uses first-order motion models to create highly realistic animations from a source image and a driving video. Deepfake detection relies on efficient convolutional neural networks (CNNs), primarily the EfficientNet family of models.
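
As a rough illustration of the "multiple layers" idea described in the abstract, and not a reproduction of Vision Studio's implementation, the sketch below treats a tiny grayscale image as a matrix and passes it through two hand-written layers. The 4x4 image, the kernels, and the helper names `conv2d_valid` and `relu` are all assumptions chosen for illustration; a real CNN such as EfficientNet learns its kernels from data.

```python
from typing import List

Matrix = List[List[float]]


def conv2d_valid(image: Matrix, kernel: Matrix) -> Matrix:
    """Slide `kernel` over `image` (stride 1, no padding) and return the response map.

    This is cross-correlation, the operation deep learning frameworks call convolution.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def relu(m: Matrix) -> Matrix:
    """Keep positive responses and zero out the rest."""
    return [[max(0.0, v) for v in row] for row in m]


# Hypothetical 4x4 grayscale image: dark left half, bright right half.
image: Matrix = [
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 1.0],
]

# Layer 1: a horizontal-gradient kernel responds where dark pixels meet bright ones.
edge_kernel: Matrix = [[-1.0, 1.0]]
layer1 = relu(conv2d_valid(image, edge_kernel))

# Layer 2: summing vertically adjacent edge responses highlights a vertical line,
# a slightly higher-level feature built from the layer-1 output.
line_kernel: Matrix = [[1.0], [1.0]]
layer2 = relu(conv2d_valid(layer1, line_kernel))

for row in layer2:
    print(row)
```

Each layer consumes the previous layer's output rather than the raw pixels, which is the sense in which deeper layers extract more advanced features. A video would be handled the same way, one frame matrix at a time.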
