Pre-Activating Semantic Information for Image Aesthetic Assessment

IF 1.2, Zone 4 (Engineering and Technology), Q4 MATERIALS SCIENCE, TEXTILES. AATCC Journal of Research. Pub Date: 2023-02-05. DOI: 10.1177/24723444221147971

J. Song, Rong Huang, Yujia Tian, Aihua Dong


Citations: 0

Abstract

Automatic image aesthetic evaluation is an attractive and challenging visual task. Recently, methods based on convolutional neural networks have achieved remarkable performance. However, previous methods have paid too little attention to semantic information, an intuitive prerequisite for evaluating image aesthetics. How to extract semantic information efficiently and use it to assist aesthetic evaluation remains unsolved. In this article, we propose to use a self-supervised Auto-Encoder to extract semantic information within a multi-task learning framework. A fusing module is then prepended at the bottleneck layer to explicitly combine semantic information with aesthetic information in a pre-activated manner. Specifically, we implement a customized pooling operation to pool the semantic features extracted by the Auto-Encoder and apply a weak constraint between the pooled semantic features and the aesthetic information to realize the combination. The subsequent regressor then performs aesthetic evaluation on the combined semantic–aesthetic features. In addition, to let the model handle images of arbitrary aspect ratio, a second pooling strategy, spatial pyramid pooling, is adopted to obtain image features of a fixed length. Our method achieves competitive performance on the public image aesthetic evaluation benchmark. In particular, on the most commonly used metric, the Spearman rank-order correlation coefficient, the proposed model achieves the best performance among the state-of-the-art methods it is compared with. Extensive ablation studies and visualization experiments demonstrate the effectiveness of our method.
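
The abstract states that spatial pyramid pooling is what gives the model a fixed-length feature for images of any aspect ratio. The following is a minimal sketch of that general idea in plain Python, not the authors' implementation: the function name `spatial_pyramid_pool`, the pyramid levels (1, 2, 4), the use of max pooling, and the single-channel input are all assumptions made for illustration.

```python
def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2D feature map (a list of rows) into a fixed-length vector.

    For each level n, the map is split into an n x n grid and the maximum of
    each cell is kept, so the output length is sum(n * n for n in levels),
    independent of the input height and width.
    The levels and the choice of max pooling are illustrative assumptions.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Row range of grid row i: floor for the start, ceil for the end,
            # which guarantees every cell contains at least one element.
            r0 = (i * height) // n
            r1 = ((i + 1) * height + n - 1) // n
            for j in range(n):
                c0 = (j * width) // n
                c1 = ((j + 1) * width + n - 1) // n
                pooled.append(
                    max(feature_map[r][c]
                        for r in range(r0, r1)
                        for c in range(c0, c1))
                )
    return pooled


# Two single-channel maps with different aspect ratios (values are arbitrary).
wide = [[r * 10 + c for c in range(9)] for r in range(4)]  # 4 rows x 9 columns
tall = [[r * 10 + c for c in range(3)] for r in range(7)]  # 7 rows x 3 columns

wide_vec = spatial_pyramid_pool(wide)
tall_vec = spatial_pyramid_pool(tall)

# Both vectors have length 1 + 4 + 16 = 21 despite the different input shapes.
print(len(wide_vec), len(tall_vec))
```

Because the number of grid cells depends only on the chosen levels, a downstream regressor can take this vector as input without the image being cropped or resized to a fixed shape first. In a convolutional network the same pooling would be applied to each channel and the results concatenated.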


Source journal

AATCC Journal of Research (Materials Science, Textiles)

CiteScore: 1.30

Self-citation rate: 0.00%

Journal description: AATCC Journal of Research is a textile research journal with a broad scope, from advanced materials, fibers, and textile and polymer chemistry to color science, apparel design, and sustainability. It is indexed in the Science Citation Index Expanded (SCIE) and discoverable in the Clarivate Analytics Web of Science Core Collection. The journal's impact factor is available in Journal Citation Reports.