Multi-style cartoonization: Leveraging multiple datasets with generative adversarial networks

Jianlu Cai, Frederick W. B. Li, Fangzhe Nan, Bailin Yang

Computer Animation and Virtual Worlds, 35(3), published 2024-05-17. DOI: 10.1002/cav.2269 (https://onlinelibrary.wiley.com/doi/10.1002/cav.2269)
Citations: 0
Abstract
Scene cartoonization aims to convert photos into stylized cartoons. While generative adversarial networks (GANs) can generate high-quality images, previous methods focus on individual images or single styles, ignoring relationships between datasets. We propose a novel multi-style scene cartoonization GAN that leverages multiple cartoon datasets jointly. Our main technical contribution is a multi-branch style encoder that disentangles representations to model styles as distributions over entire datasets rather than images. Combined with a multi-task discriminator and perceptual losses optimizing across collections, our model achieves state-of-the-art diverse stylization while preserving semantics. Experiments demonstrate that by learning from inter-dataset relationships, our method translates photos into cartoon images with improved realism and abstraction fidelity compared to prior arts, without iterative re-training for new styles.
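The core idea above can be sketched in code: a shared backbone feeds one encoder branch per cartoon dataset, and each style is represented not by the code of a single image but by the distribution of codes over its entire dataset. The following is a minimal illustrative sketch, not the authors' implementation; all parameter shapes, function names, and the choice of a Gaussian (mean/covariance) summary are assumptions made for clarity.

```python
import numpy as np

rng = np.random.default_rng(0)

FEAT_DIM, STYLE_DIM, N_STYLES = 64, 8, 3

# Shared backbone weights plus one head per style branch
# (hypothetical shapes; stand-ins for learned network layers).
W_shared = rng.normal(scale=0.1, size=(FEAT_DIM, 32))
W_branch = rng.normal(scale=0.1, size=(N_STYLES, 32, STYLE_DIM))

def encode(features, branch):
    """Map image features to style codes through one branch."""
    h = np.tanh(features @ W_shared)   # shared representation
    return h @ W_branch[branch]        # branch-specific style code

def dataset_style(features, branch):
    """Model a style as a distribution over an entire dataset:
    summarize all of its style codes by mean and covariance,
    rather than keeping any single image's code."""
    codes = encode(features, branch)
    return codes.mean(axis=0), np.cov(codes, rowvar=False)

# Each cartoon dataset contributes one style distribution.
datasets = [rng.normal(size=(100, FEAT_DIM)) for _ in range(N_STYLES)]
styles = [dataset_style(d, b) for b, d in enumerate(datasets)]

for mean, cov in styles:
    assert mean.shape == (STYLE_DIM,)
    assert cov.shape == (STYLE_DIM, STYLE_DIM)
```

In this reading, the multi-task discriminator would attach one real/fake head per dataset, and the perceptual losses would be computed against these dataset-level distributions, which is what lets a new style be served without iterative re-training.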
About the journal:
With the advent of very powerful PCs and high-end graphics cards, there has been incredible development in Virtual Worlds, real-time computer animation and simulation, and games. At the same time, new and cheaper Virtual Reality devices have appeared, allowing interaction with these real-time Virtual Worlds and even with real worlds through Augmented Reality. Three-dimensional characters, especially Virtual Humans, are now of exceptional quality, which allows them to be used in the movie industry. But this is only a beginning: with the development of Artificial Intelligence and agent technology, these characters will become more and more autonomous and even intelligent. They will inhabit the Virtual Worlds in a Virtual Life together with animals and plants.