Advancing image inpainting efficiency: An exploration of pixel and channel split operations
Youngjun Choo, Adrian Matias Chung Baek, Namhun Kim
Applied Soft Computing, published 2024-08-31
DOI: 10.1016/j.asoc.2024.112179
URL: https://www.sciencedirect.com/science/article/pii/S1568494624009530
Abstract
Deep learning-based image inpainting techniques have achieved unprecedented results using encoder–decoder structures to recover complex missing areas of an image. Recent inpainting models use additional information or networks (e.g., landmarks, edges, styles, and filters) to improve restoration performance, but at the cost of increased computational resources. To improve the trade-off between inpainting performance and the number of model parameters, researchers have investigated efficient structural approaches such as recurrent and residual connection structures. However, these methods are difficult to apply to the general encoder–decoder structure. In this study, we explored the downsampling and upsampling operations associated with an encoder–decoder structure. We propose two novel split operations: the pixel-split operation (PSO) and the channel-split operation (CSO). The proposed PSO transfers image features from high to low resolution with two dilation rate effects and a number of parameters similar to that of existing downsampling operations. Conversely, the proposed CSO increases the image resolution using only one-fourth the number of parameters of existing upsampling operations. To validate the contribution of the proposed operations to inpainting performance, the restoration performance and efficiency of the proposed model were evaluated using five metrics on public datasets, including Places2 and CelebA. We achieved state-of-the-art performance while reducing the number of parameters by 20%. An ablation study on the CelebA-HQ dataset was conducted to confirm the effect of each operation. The results indicate that these split operations improve the trade-off between inpainting performance and the number of model parameters. The corresponding code is available online (https://github.com/MrCAIcode/Split_operation_for_inpainting).
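The abstract does not say how PSO and CSO are implemented, so the following is only a loosely related illustration, not the authors' operations. It sketches the generic space-to-depth and depth-to-space rearrangements that are often used for parameter-free downsampling and upsampling, and shows one way a one-fourth weight count can arise: a convolution applied after channels have been folded into spatial positions sees a quarter as many input channels. All function names (`space_to_depth`, `depth_to_space`, `conv_weights`) and the channel sizes are hypothetical and chosen for this sketch.

```python
import numpy as np


def space_to_depth(x: np.ndarray, r: int = 2) -> np.ndarray:
    """Rearrange (C, H, W) into (C*r*r, H/r, W/r) without discarding pixels.

    Each output channel is a strided sub-grid of one input channel.
    Illustrative only; not the PSO defined in the paper.
    """
    c, h, w = x.shape
    assert h % r == 0 and w % r == 0, "H and W must be divisible by r"
    x = x.reshape(c, h // r, r, w // r, r)
    x = x.transpose(0, 2, 4, 1, 3)  # -> (C, r, r, H/r, W/r)
    return x.reshape(c * r * r, h // r, w // r)


def depth_to_space(x: np.ndarray, r: int = 2) -> np.ndarray:
    """Inverse rearrangement: (C*r*r, H, W) into (C, H*r, W*r).

    Illustrative only; not the CSO defined in the paper.
    """
    crr, h, w = x.shape
    assert crr % (r * r) == 0, "channel count must be divisible by r*r"
    c = crr // (r * r)
    x = x.reshape(c, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)  # -> (C, H, r, W, r)
    return x.reshape(c, h * r, w * r)


def conv_weights(c_in: int, c_out: int, k: int = 3) -> int:
    """Weight count of a k x k convolution, ignoring the bias term."""
    return c_in * c_out * k * k


# Round trip on a small feature map (assumed sizes).
features = np.arange(2 * 4 * 6).reshape(2, 4, 6)
low_res = space_to_depth(features)   # shape (8, 2, 3)
restored = depth_to_space(low_res)   # shape (2, 4, 6)

# Assumed channel sizes: a 3x3 convolution on 256 input channels versus the
# same convolution applied after depth_to_space has reduced them to 64.
baseline = conv_weights(256, 128)
after_rearrangement = conv_weights(256 // 4, 128)
```

In this sketch the rearrangements themselves carry no learnable parameters, so any saving comes entirely from the convolution that follows them. Whether the paper's operations achieve their reported parameter counts this way is not stated in the abstract; the linked repository is the authoritative reference.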
About the journal:
Applied Soft Computing is an international journal promoting an integrated view of soft computing to solve real-life problems. The focus is to publish the highest quality research on the application and convergence of Fuzzy Logic, Neural Networks, Evolutionary Computing, Rough Sets, and other similar techniques to address real-world complexities.
Applied Soft Computing is a rolling publication: articles are published as soon as the editor-in-chief has accepted them. Therefore, the web site will continuously be updated with new articles and the publication time will be short.