Controllable image synthesis methods, applications and challenges: a comprehensive survey
Shanshan Huang, Qingsong Li, Jun Liao, Shu Wang, Li Liu, Lian Li
Artificial Intelligence Review, volume 57, issue 12. Published 2024-10-18.
DOI: 10.1007/s10462-024-10987-w
https://link.springer.com/article/10.1007/s10462-024-10987-w
Citation count: 0
Abstract
Controllable Image Synthesis (CIS) is a methodology that allows users to generate desired images or manipulate specific attributes of images by providing precise input conditions or modifying latent representations. In recent years, CIS has attracted considerable attention in the field of image processing, with significant advances in consistency, controllability and harmony. However, several challenges remain, particularly regarding the fine-grained controllability and interpretability of synthesized images. In this paper, we comprehensively and systematically review CIS, from problem definition, taxonomy and evaluation systems to existing challenges and future research directions. First, we define CIS and introduce several representative deep generative models in detail. Second, we divide existing CIS methods into three categories according to the control manner used and critically discuss typical work in each category. Furthermore, we introduce the public datasets and evaluation metrics commonly used in image synthesis and analyze representative CIS methods. Finally, we present several open issues and discuss future research directions for CIS.
About the journal:
Artificial Intelligence Review, a fully open access journal, publishes cutting-edge research in artificial intelligence and cognitive science. It features critical evaluations of applications, techniques, and algorithms, providing a platform for both researchers and application developers. The journal includes refereed survey and tutorial articles, along with reviews and commentary on significant developments in the field.