AlphaNet: An Attention Guided Deep Network for Automatic Image Matting

2020 International Conference on Omni-layer Intelligent Systems (COINS). Pub Date: 2020-03-07. DOI: 10.1109/COINS49042.2020.9191371

Rishab Sharma, Rahul Deora, Anirudha Vishvakarma


Citations: 5

Abstract

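In this paper, we propose an end-to-end solution for image matting, i.e., high-precision extraction of foreground objects from natural images. In a studio setting, image matting and background detection can be achieved easily through chroma keying when the background is pure green or pure blue. In natural scenes with complex backgrounds of uneven depth, however, image matting remains a tedious task that requires human intervention. To achieve fully automatic foreground extraction in natural scenes, we propose a method that combines semantic segmentation and deep image matting in a single network to generate detailed semantic mattes for image composition tasks. The contribution of our method is two-fold: first, it can be interpreted as a fully automated semantic image matting method; second, it can be seen as a refinement of existing semantic segmentation models. We propose a novel model architecture, a combination of segmentation and matting, that unifies the function of upsampling and downsampling operators with the notion of attention. As shown in our work, attention-guided downsampling and upsampling can extract high-quality boundary details, unlike conventional downsampling and upsampling techniques. To this end, we use an attention-guided encoder-decoder framework that learns, without supervision, to generate an attention map adaptively from the data; this map serves and directs the upsampling and downsampling operators. We also construct a fashion e-commerce focused dataset with high-quality alpha mattes to facilitate training and evaluation for image matting.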
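The abstract does not give formulas, so the following is only a minimal NumPy sketch of the two ideas it names, under stated assumptions. `composite` uses the standard matting equation I = αF + (1 − α)B, which is how an alpha matte is normally applied in image composition. The remaining functions illustrate one possible way a single attention map could drive both downsampling and upsampling; the function names, the 2×2 block size, and the per-block softmax are hypothetical choices for illustration and are not taken from the paper's architecture.

```python
import numpy as np

def composite(alpha, fg, bg):
    """Standard matting equation: I = alpha * F + (1 - alpha) * B."""
    a = alpha[..., None]  # broadcast the (H, W) matte over colour channels
    return a * fg + (1.0 - a) * bg

def _blocks(x, k=2):
    """Rearrange an (H, W) array into (H//k, W//k, k*k) non-overlapping blocks."""
    h, w = x.shape
    return (x.reshape(h // k, k, w // k, k)
             .transpose(0, 2, 1, 3)
             .reshape(h // k, w // k, k * k))

def attention_weights(scores, k=2):
    """Softmax over each k x k block of (hypothetical) attention scores."""
    b = _blocks(scores, k)
    e = np.exp(b - b.max(axis=-1, keepdims=True))  # numerically stable softmax
    return e / e.sum(axis=-1, keepdims=True)

def guided_downsample(feat, weights, k=2):
    """Each coarse value is the attention-weighted sum of its k x k block."""
    return (_blocks(feat, k) * weights).sum(axis=-1)

def guided_upsample(coarse, weights, k=2):
    """Spread each coarse value back over its block using the same weights.

    The factor k*k is an assumed normalisation so that uniform weights
    reproduce nearest-neighbour upsampling.
    """
    hb, wb, _ = weights.shape
    up = coarse[..., None] * weights * (k * k)
    return (up.reshape(hb, wb, k, k)
              .transpose(0, 2, 1, 3)
              .reshape(hb * k, wb * k))
```

With uniform scores the sketch reduces to average pooling followed by nearest-neighbour upsampling; non-uniform scores let the map emphasise particular pixels in each block, such as those on an object boundary, in both directions. In a trained network the scores would be predicted from the data rather than supplied by hand.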


