ML-Driven Facial Synthesis from Spoken Words Using Conditional GANs

International Journal of Innovative Research in Engineering. Pub Date: 2024-01-25. DOI: 10.59256/ijire.20240501004

Vaishnavi Srivastava, Sakshi Srivastava, Sakshi Chauhan, Divyakshi Yadav


Citations: 0

Abstract



The human brain may be able to map a person's voice to the corresponding face, even for someone it has never seen before. Training a deep learning network to do the same would make it possible to identify people from their voice, for example to find a criminal for whom only a voice recording is available. The goal of this paper is to build a Conditional Generative Adversarial Network that generates face images from human speech, which a face recognition model can then recognize to identify the speaker. After training, the face recognition model achieved an accuracy of 80.08% on the training set and 56.2% on the test set. Compared with the basic GAN model, this model improved the results by about 30%.

Keywords: Face image synthesis, Generative adversarial network, Face recognition
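The abstract does not state the training objective. For orientation only, the standard conditional GAN objective (Mirza and Osindero, 2014) is sketched here under the assumption that the conditioning variable $c$ is an embedding of the speech clip. The symbols $x$ (a real face image), $z$ (a noise vector), $G$ (generator), and $D$ (discriminator) are notation introduced for this illustration and are not taken from the paper.

```latex
\min_G \max_D \; V(D, G) =
  \mathbb{E}_{(x, c) \sim p_{\text{data}}}\big[\log D(x \mid c)\big]
  + \mathbb{E}_{z \sim p_z,\; c \sim p_{\text{data}}}\big[\log\big(1 - D(G(z \mid c) \mid c)\big)\big]
```

In this reading, the discriminator sees a face together with the speech condition and learns to tell real pairs from generated ones, while the generator learns to produce faces that are plausible for the given speech. Whether the authors use exactly this loss or a variant is not stated in the abstract.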
