Pub Date : 2024-08-01 DOI: 10.1016/j.cag.2023.12.007
{"title":"Foreword to the special section on 3D object retrieval 2023 symposium (3DOR2023)","authors":"","doi":"10.1016/j.cag.2023.12.007","DOIUrl":"10.1016/j.cag.2023.12.007","url":null,"abstract":"","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"122 ","pages":"Article 103865"},"PeriodicalIF":2.5,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"138683131","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-08-01 DOI: 10.1016/j.cag.2023.08.031
{"title":"Foreword to the special section on SIBGRAPI 2023","authors":"","doi":"10.1016/j.cag.2023.08.031","DOIUrl":"10.1016/j.cag.2023.08.031","url":null,"abstract":"","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"122 ","pages":"Article 103810"},"PeriodicalIF":2.5,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129764432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Urban simulations for disaster prevention, urban design, and assisted navigation rely heavily on urban geometric models. While acquiring large urban areas terrestrially takes a lot of time, government organizations have already conducted massive aerial LiDAR surveys, some even at the national level. This work aims to provide a pipeline for extracting multi-scale point clouds from 2D building footprints and airborne LiDAR data, with processing that depends on whether the points represent buildings, vegetation, or ground. We denoise the roof slopes, match the vegetation, and roughly recreate the building façades, which are frequently hidden from aerial acquisition, using a parametric representation of geometric primitives. Because we annotate the new version of the original point cloud with the parametric equations representing each part, we can then sample the urban geometry at multiple scales to obtain a 3D urban representation. We mainly tested our methodology in a real-world setting, the city of Genoa, which includes historical buildings and is heavily characterized by irregular ground slopes. Moreover, we present the results of urban reconstruction on parts of two other cities: Matera, which has a complex morphology like Genoa, and Rotterdam.
{"title":"From aerial LiDAR point clouds to multiscale urban representation levels by a parametric resampling","authors":"Chiara Romanengo, Bianca Falcidieno, Silvia Biasotti","doi":"10.1016/j.cag.2024.104022","DOIUrl":"10.1016/j.cag.2024.104022","url":null,"abstract":"<div><p>Urban simulations that involve disaster prevention, urban design, and assisted navigation heavily rely on urban geometric models. While large urban areas need a lot of time to be acquired terrestrially, government organizations have already conducted massive aerial LiDAR surveys, some even at the national level. This work aims to provide a pipeline for extracting multi-scale point clouds from 2D building footprints and airborne LiDAR data, which depends on whether the points represent buildings, vegetation, or ground. We denoise the roof slopes, match the vegetation, and roughly recreate the building façades frequently hidden to aerial acquisition using a parametric representation of geometric primitives. We then carry out multiple-scale samplings of the urban geometry until a 3D urban representation can be achieved because we annotate the new version of the original point cloud with the parametric equations representing each part. We mainly tested our methodology in a real-world setting – the city of Genoa – which includes historical buildings and is heavily characterized by irregular ground slopes. 
Moreover, we present the results of urban reconstruction on part of two other cities, Matera, which has a complex morphology like Genoa, and Rotterdam.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104022"},"PeriodicalIF":2.5,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0097849324001572/pdfft?md5=a617708d0acaf24ecd321d09a5821721&pid=1-s2.0-S0097849324001572-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141942922","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Analysis of 3D textures, also known as relief patterns, is a challenging task that requires separating repetitive surface patterns from the underlying global geometry. Existing works classify entire surfaces based on one or a few patterns by extracting ad-hoc statistical properties. Unfortunately, these methods are not suitable for objects with multiple geometric textures and perform poorly on more complex shapes. In this paper, we propose a neural network for binary segmentation that infers per-point labels based on the presence of surface relief patterns. We evaluated the proposed architecture on a high-resolution point cloud dataset, surpassing the state of the art while maintaining memory and computation efficiency.
{"title":"Binary segmentation of relief patterns on point clouds","authors":"Gabriele Paolini , Claudio Tortorici , Stefano Berretti","doi":"10.1016/j.cag.2024.104020","DOIUrl":"10.1016/j.cag.2024.104020","url":null,"abstract":"<div><p>Analysis of 3D textures, also known as relief patterns is a challenging task that requires separating repetitive surface patterns from the underlying global geometry. Existing works classify entire surfaces based on one or a few patterns by extracting ad-hoc statistical properties. Unfortunately, these methods are not suitable for objects with multiple geometric textures and perform poorly on more complex shapes. In this paper, we propose a neural network for binary segmentation to infer per-point labels based on the presence of surface relief patterns. We evaluated the proposed architecture on a high resolution point cloud dataset, surpassing the state-of-the-art, while maintaining memory and computation efficiency.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104020"},"PeriodicalIF":2.5,"publicationDate":"2024-07-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0097849324001559/pdfft?md5=2a3d2170481b5dae4c7f729baa4b2914&pid=1-s2.0-S0097849324001559-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141942923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-07-29 DOI: 10.1016/j.cag.2024.104014
Cristiano N. Rodrigues , Ian M. Nunes , Matheus B. Pereira , Hugo Oliveira , Jefersson A. dos Santos
Image segmentation is one of the most classical computer vision tasks. Segmentation tasks yield a set of classes attributed to individual pixels instead of sparsely predicted images or patches, such as in classification or detection tasks. However, creating annotation sets for pixelwise tasks is very costly, often requiring hours to label single samples in images with multiple classes of objects. In this context, unsupervised learning can be leveraged to expedite the annotation procedure and/or to guide the segmentation algorithms altogether without the need for manual annotations. Classical unsupervised segmentation methods leveraged techniques from areas such as graph theory, image processing, clustering, or supervised classifiers in order to achieve “shallow” pixelwise classification. These techniques usually aim to achieve superpixel over-segmentations by grouping similar pixels that should pertain to the same object. Modern deep unsupervised approaches for image segmentation aimed to group pixels in a data-driven way by using the capabilities of deep architectures to process unstructured data such as images. Later, self-supervised learning bypassed the need for labels via pretext tasks, compelling deep architectures to learn more generic features capable of enhancing downstream tasks, including segmentation. The generalized representations produced by unsupervised models have propelled the recent progress in self-supervised, few- and zero-shot learning and even general-purpose foundational models in computer vision, yielding state-of-the-art results across diverse tasks and datasets. This paper provides an overview of unsupervised and generalizable approaches for image segmentation, introduces key concepts and terminology, and discusses the main aspects of state-of-the-art methods. Additionally, we highlight prominent applications in various domains such as remote sensing, medical imaging, and geology. Finally, we discuss trends and future directions for state-of-the-art unsupervised image segmentation.
{"title":"From superpixels to foundational models: An overview of unsupervised and generalizable image segmentation","authors":"Cristiano N. Rodrigues , Ian M. Nunes , Matheus B. Pereira , Hugo Oliveira , Jefersson A. dos Santos","doi":"10.1016/j.cag.2024.104014","DOIUrl":"10.1016/j.cag.2024.104014","url":null,"abstract":"<div><p>Image segmentation is one of the most classical computer vision tasks. Segmentation tasks yield a set of classes attributed to individual pixels instead of sparsely predicted images or patches, such as in classification or detection tasks. However, creating annotation sets for pixelwise tasks is a very costly task, often requiring hours for labeling single samples in images with multiple classes of objects. In this context, unsupervised learning can be leveraged either to expedite the annotation procedure and/or to guide the segmentation algorithms altogether without the need for manual annotations. Classical unsupervised segmentation methods leveraged techniques from areas as graph theory, image processing, clustering or supervised classifiers in order to achieve “shallow” pixelwise classification. These techniques usually aim to achieve superpixel over-segmentations by grouping similar pixels that should pertain to the same object. Modern deep unsupervised approaches for image segmentation aimed to group pixels in a data-driven way by using the capabilities of deep architectures to process unstructured data such as images. Later, self-supervised learning bypassed the need for labels via pretext tasks, compelling deep architectures to learn more generic features capable of enhancing downstream tasks, including segmentation. The generalized representations produced by unsupervised models have propelled the recent progress in self-supervised, few- and zero-shot learning and even general-purpose foundational models in computer vision, yielding state-of-the-art results across diverse tasks and datasets. 
This paper provides an overview of unsupervised and generalizable approaches for image segmentation, introduces key concepts and terminology, and discusses the main aspects of state-of-the-art methods. Additionally, we highlight prominent applications in various domains such as remote sensing, medical imaging, and geology. Finally, we discuss trends and future directions for state-of-the-art unsupervised image segmentation.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104014"},"PeriodicalIF":2.5,"publicationDate":"2024-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141961155","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-07-29 DOI: 10.1016/j.cag.2024.104021
Bianca Falcidieno, Brian Wyvill, Ergun Akleman, Jorg Peters
The Shape Modeling International awards (SMI awards) were introduced to commemorate the passing of the SMI founder, Professor Kunii. Since 2021, the SMI awards have recognized exceptional contributors to Shape Modeling. Currently, there are three awards: the Tosiyasu Kunii Distinguished Researcher, the Young Investigator, and the Alexander Pasko Service Award. The 2024 Distinguished Researcher awardees are Gershon Elber and Stefanie Hahmann. The 2024 Young Investigators are Gianmarco Cherchi and Amal Dev Parakkat. The 2024 Service Awardee is Ergun Akleman. This article provides interviews with the five SMI 2024 award winners.
{"title":"Shape Modeling International (SMI) 2024 awards interviews with SMI’2024 award winners","authors":"Bianca Falcidieno, Brian Wyvill, Ergun Akleman, Jorg Peters","doi":"10.1016/j.cag.2024.104021","DOIUrl":"10.1016/j.cag.2024.104021","url":null,"abstract":"<div><p>The Shape Modeling International awards (SMI awards) were introduced to commemorate the passing of SMI founder, Professor Kunii. Since 2021, the SMI awards recognize exceptional contributors to Shape Modeling. Currently, there are three awards: the Tosiyasu Kunii Distinguished Researcher, the Young Investigator, and the Alexander Pasko Service Award. The 2024 Distinguished Researcher awardees are Gershon Elber and Stefanie Hahmann. The 2024 Young Investigators are Gianmarco Cherchi and Amal Dev Parakkat. The 2024 Service Awardee is Ergun Akleman. This article provides interviews with the five SMI 2024 award winners.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104021"},"PeriodicalIF":2.5,"publicationDate":"2024-07-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141952720","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-07-25 DOI: 10.1016/j.cag.2024.104013
Leonardo Ferreira , Gustavo Moreira , Maryam Hosseini , Marcos Lage , Nivan Ferreira , Fabio Miranda
Over the past decade, there has been a significant increase in the development of visual analytics systems dedicated to addressing urban issues. These systems distill intricate urban analysis workflows into intuitive, interactive visual representations and interfaces, enabling users to explore, understand, and derive insights from large and complex data, including street-level imagery, street networks, and building geometries. Developing urban visual analytics systems, however, is a challenging endeavor that requires considerable programming expertise and interaction between various multidisciplinary stakeholders. This situation often leads to monolithic and isolated prototypes that are hard to reproduce, combine, or extend. Concurrently, there has been an increase in the availability of general and urban-specific toolkits, frameworks, and authoring tools that are open source and abstract away the need to implement low-level visual analytics functionalities. This paper provides a hierarchical taxonomy of urban visual analytics systems to contextualize how they are usually designed, implemented, and evaluated. We develop this taxonomy across three distinct levels (i.e., dimensions, categories, and tags), juxtaposing visualization with analytics, data, and system dimensions. We then assess the extent to which current open-source toolkits, frameworks, and authoring tools can effectively support the development of components tailored to urban visual analytics, identifying their strengths and limitations in addressing the unique challenges posed by urban data. In doing so, we offer a roadmap that can guide the effective employment of existing resources and chart a pathway for developing and refining future systems.
{"title":"Assessing the landscape of toolkits, frameworks, and authoring tools for urban visual analytics systems","authors":"Leonardo Ferreira , Gustavo Moreira , Maryam Hosseini , Marcos Lage , Nivan Ferreira , Fabio Miranda","doi":"10.1016/j.cag.2024.104013","DOIUrl":"10.1016/j.cag.2024.104013","url":null,"abstract":"<div><p>Over the past decade, there has been a significant increase in the development of visual analytics systems dedicated to addressing urban issues. These systems distill intricate urban analysis workflows into intuitive, interactive visual representations and interfaces, enabling users to explore, understand, and derive insights from large and complex data, including street-level imagery, street networks, and building geometries. Developing urban visual analytics systems, however, is a challenging endeavor that requires considerable programming expertise and interaction between various multidisciplinary stakeholders. This situation often leads to monolithic and isolated prototypes that are hard to reproduce, combine, or extend. Concurrently, there has been an increase in the availability of general and urban-specific toolkits, frameworks, and authoring tools that are open source and abstract away the need to implement low-level visual analytics functionalities. This paper provides a hierarchical taxonomy of urban visual analytics systems to contextualize how they are usually designed, implemented, and evaluated. We develop this taxonomy across three distinct levels (<em>i.e.</em>, dimensions, categories, and tags), juxtaposing visualization with analytics, data, and system dimensions. We then assess the extent to which current open-source toolkits, frameworks, and authoring tools can effectively support the development of components tailored to urban visual analytics, identifying their strengths and limitations in addressing the unique challenges posed by urban data. 
In doing so, we offer a roadmap that can guide the effective employment of existing resources and chart a pathway for developing and refining future systems.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104013"},"PeriodicalIF":2.5,"publicationDate":"2024-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S0097849324001481/pdfft?md5=5e1b2ee787bdf31bc660006341515d9a&pid=1-s2.0-S0097849324001481-main.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141841726","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-07-25 DOI: 10.1016/j.cag.2024.104018
Jin Xiang , Huihuang Zhao , Pengfei Li , Yue Deng , Weiliang Meng
Recent research in arbitrary style transfer has highlighted challenges in maintaining the balance between content structure and style patterns. Moreover, the improper application of style patterns onto the content image often results in suboptimal quality. In this paper, a novel style transfer network, called MCNet, is proposed. It is based on multi-feature correlations. To better explore the intrinsic relationship between the style image and the content image and to transfer the most suitable style onto the content image, a novel Global Style-Attentional Transfer Module, named GSATM, is introduced in this work. GSATM comprises two parts: Forward Adaptive Style Transformation (FAST) and Delayed Style Transformation (DST). The former analyzes the relationship between style and content features and fine-tunes the style features, whereas the latter transfers the content features based on the fine-tuned style features. Moreover, a new encoding and decoding structure is designed to effectively handle the output of GSATM. Extensive quantitative and qualitative experiments fully demonstrate the superiority of our algorithm. Project page: https://github.com/XiangJinCherry/MCNet.
{"title":"Arbitrary style transfer via multi-feature correlation","authors":"Jin Xiang , Huihuang Zhao , Pengfei Li , Yue Deng , Weiliang Meng","doi":"10.1016/j.cag.2024.104018","DOIUrl":"10.1016/j.cag.2024.104018","url":null,"abstract":"<div><p>Recent research in arbitrary style transfer has highlighted challenges in maintaining the balance between content structure and style patterns. Moreover, the improper application of style patterns onto the content image often results in suboptimal quality. In this paper, a novel style transfer network, called MCNet, is proposed. It is based on multi-feature correlations. To better explore the intrinsic relationship between the style image and the content image and to transfer the most suitable style onto the content image, a novel Global Style-Attentional Transfer Module, named GSATM, is introduced in this work. GSATM comprises two parts: Forward Adaptive Style Transformation (FAST) and Delayed Style Transformation (DST). The former analyzes the relationship between style and content features and fine-tunes the style features, whereas the latter transfers the content features based on the fine-tuned style features. Moreover, a new encoding and decoding structure is designed to effectively handle the output of GSATM. Extensive quantitative and qualitative experiments fully demonstrate the superiority of our algorithm. 
Project page: <span><span>https://github.com/XiangJinCherry/MCNet</span><svg><path></path></svg></span>.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104018"},"PeriodicalIF":2.5,"publicationDate":"2024-07-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141851209","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-07-22 DOI: 10.1016/j.cag.2024.104017
Heng Zhang , Yuanyuan Pu , Zhengpeng Zhao , Yupan Li , Xin Li , Rencan Nie
A good image-to-image translation framework acquires an explicit and credible mapping between the source domain and the target domains while satisfying two requirements: simplicity, and extensibility over multiple translation tasks. To this end, we design a concise but versatile generative model for image-to-image translation. Our method includes three major ingredients. First, inspired by the popular unconditional normalization layer named Spatially Adaptive Normalization (SPADE), we introduce a novel Semantics-Appearance Spatially Adaptive Normalization (SA-SPADE), which takes into account both semantic structure and style appearance. This enables semantic composition and style appearance information to be sufficiently captured and integrated by our normalization layers. Thanks to SA-SPADE, our model extends to multiple image-to-image translation tasks in an unsupervised or supervised way. Second, we carefully design two symmetrical network branches, the Semantic Branch (SB) and the Appearance Branch (AB), which provide semantic and appearance information, respectively, to our normalization layer. Third, we propose a novel Semantic-aware Contrastive Loss (SCL) and Appearance-aware Contrastive Loss (ACL) based on recent un-/self-supervised contrastive learning. That is, SCL guarantees domain-invariant representations (e.g., pose, structure) between the generated image and the input image, while ACL ensures domain-specific representations (e.g., color, texture) between the generated image and the reference image. We verify the effectiveness of our method by comparing it with various task-dependent image translation models in both qualitative and quantitative evaluations.
{"title":"Towards diverse image-to-image translation via adaptive normalization layer and contrast learning","authors":"Heng Zhang , Yuanyuan Pu , Zhengpeng Zhao , Yupan Li , Xin Li , Rencan Nie","doi":"10.1016/j.cag.2024.104017","DOIUrl":"10.1016/j.cag.2024.104017","url":null,"abstract":"<div><p>A nice image-to-image translation framework is able to acquire an explicit and credible mapping relationship between the source domain and target domains while satisfying two requirements. One is simplicity, the other is extensibility over multiple translation tasks. To this end, we design a concise but versatile generative model for image-to-image translation. Our method includes three major ingredients. First, inspired by popular unconditional normalization layers, named Spatially Adaptive Normalization(SPADE). We introduce a novel Semantics-Appearance Spatially Adaptive Normalization (SA-SPADE), taking into account both semantic structure and style appearance. This enables semantic composition and style appearance information to be sufficiently captured and integrated by our normalization layers. Thanks to SA-SPADE, our model extends to multiple image-to-image translation tasks in an unsupervised or supervised way. Second, we carefully designed two symmetrical network branches to provide semantic and appearance information for our normalization layer, namely Semantic Branch (SB) and Appearance Branch(AB) respectively. Third, we propose novel Semantic-aware Contrastive Loss (SCL) and Appearance-aware Contrastive Loss (ACL)based on newly un-/self- supervised contrastive learning. That is, SCL guarantees domain-invariant (e.g., pose, structure) representations between the generated image and the input image, while ACL ensures domain-specific representations (e.g., color, texture) between the generated image and the reference image. 
As a result, we verify the effectiveness of our method by comparing it with various task-dependent image translation models in both qualitative and quantitative evaluations.</p></div>","PeriodicalId":50628,"journal":{"name":"Computers & Graphics-Uk","volume":"123 ","pages":"Article 104017"},"PeriodicalIF":2.5,"publicationDate":"2024-07-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141852026","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}