ML-Driven Facial Synthesis from Spoken Words Using Conditional GANs
Authors: Vaishnavi Srivastava, Sakshi Srivastava, Sakshi Chauhan, Divyakshi Yadav
Journal: International Journal of Innovative Research in Engineering
Volume: 78 3
Published: 2024-01-25
DOI: 10.59256/ijire.20240501004 (https://doi.org/10.59256/ijire.20240501004)
Abstract
The human brain may be able to associate a person's voice with a corresponding face image, even for a person it has never seen. Training a deep learning network to do the same would allow human faces to be identified from voice alone, which may help in finding a criminal for whom only a voice recording is available. The goal of this paper is to build a Conditional Generative Adversarial Network that produces face images from human speech; the generated images can then be passed to a face recognition model to identify the speaker. After training, the face recognition model gave an accuracy of 80.08% in training and 56.2% in testing. Compared to the basic GAN model, this model improved the results by about 30%.

Keywords: Face image synthesis, Generative adversarial network, Face recognition
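
The abstract does not describe the network architecture, so the following is only a toy sketch of the general conditional GAN idea, not the authors' model. It assumes the speech signal has already been reduced to a fixed-length embedding, and it uses tiny linear layers in pure Python so that the conditioning step and the two adversarial losses are easy to follow. All names, dimensions, and weights (`generator`, `discriminator`, `NOISE_DIM`, `SPEECH_DIM`, `IMAGE_DIM`) are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def linear(weights, inputs):
    """Multiply a weight matrix (list of rows) by an input vector."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


def generator(g_weights, noise, speech_embedding):
    """Toy generator: maps [noise ; speech embedding] to a flat 'image'."""
    # Concatenating the condition with the noise is what makes the GAN conditional.
    conditioned_input = noise + speech_embedding
    return [math.tanh(v) for v in linear(g_weights, conditioned_input)]


def discriminator(d_weights, image, speech_embedding):
    """Toy discriminator: probability that (image, speech) is a real pair."""
    conditioned_input = image + speech_embedding
    return sigmoid(sum(w * x for w, x in zip(d_weights, conditioned_input)))


def bce(prediction, target):
    """Binary cross-entropy for a single predicted probability."""
    eps = 1e-7
    p = min(max(prediction, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def cgan_losses(g_weights, d_weights, real_image, speech_embedding, noise):
    """Return (discriminator loss, generator loss) for one training example."""
    fake_image = generator(g_weights, noise, speech_embedding)
    d_real = discriminator(d_weights, real_image, speech_embedding)
    d_fake = discriminator(d_weights, fake_image, speech_embedding)
    d_loss = bce(d_real, 1.0) + bce(d_fake, 0.0)  # real pairs -> 1, fake pairs -> 0
    g_loss = bce(d_fake, 1.0)  # generator wants its fake pair scored as real
    return d_loss, g_loss


# Hypothetical sizes, far smaller than any real face image or speech embedding.
NOISE_DIM, SPEECH_DIM, IMAGE_DIM = 3, 2, 4

rng = random.Random(0)
g_weights = [
    [rng.uniform(-1, 1) for _ in range(NOISE_DIM + SPEECH_DIM)]
    for _ in range(IMAGE_DIM)
]
d_weights = [rng.uniform(-1, 1) for _ in range(IMAGE_DIM + SPEECH_DIM)]

speech_embedding = [0.5, -0.2]        # stand-in for an encoded voice clip
real_image = [0.1, -0.3, 0.8, 0.0]    # stand-in for the speaker's real face
noise = [rng.gauss(0, 1) for _ in range(NOISE_DIM)]

fake_image = generator(g_weights, noise, speech_embedding)
d_loss, g_loss = cgan_losses(g_weights, d_weights, real_image, speech_embedding, noise)
print("discriminator loss:", d_loss)
print("generator loss:", g_loss)
```

In a real system the linear maps would be replaced by trained convolutional networks and the losses would be minimized by gradient descent. The face recognition model mentioned in the abstract would then be applied to the generator's output as a separate evaluation step.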