Art design integrating visual relation and affective semantics based on Convolutional Block Attention Mechanism-generative adversarial network model

IF 3.5 | Zone 4 (Computer Science) | JCR Q2, COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | PeerJ Computer Science | Pub Date: 2024-08-30 | DOI: 10.7717/peerj-cs.2274
Jiadong Shen, Jian Wang
Citations: 0

Abstract

Scene-based image semantic extraction and precise sentiment expression significantly enhance artistic design. To address the incongruity between image features and sentiment features caused by non-bilinear pooling, this study introduces a generative adversarial network (GAN) model that integrates visual relationships with sentiment semantics. A GAN-based regularizer is used during training to incorporate target information derived from context. This regularizer imposes stronger penalties on inaccurate subject-object type predictions and integrates a sentiment corpus to generate more human-like descriptive statements. In the discriminator, a capsule network reconstructs sentences and predicts probabilities. To preserve crucial focal points during feature extraction, the Convolutional Block Attention Mechanism (CBAM) is introduced. Furthermore, two bidirectional long short-term memory (BiLSTM) modules model the target and relational contexts, thereby refining target labels and inter-target relationships. Experimental results show that the model outperforms the comparison models in accuracy, BiLingual Evaluation Understudy (BLEU) score, and text preservation rate. The proposed model achieves an accuracy of 95.40% and the highest BLEU score, 16.79, effectively capturing both the label content and the emotional nuances within the image.
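The abstract reports a BLEU score without defining the metric. As a minimal illustrative sketch (not the authors' evaluation code), the following computes sentence-level BLEU from clipped n-gram precision and a brevity penalty. The function names `ngrams` and `bleu`, the single-reference setup, whitespace tokenization, the lack of smoothing, and the 0-100 scaling are all assumptions made for this example; the evaluation setup behind the reported 16.79 is not described in the abstract.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n=4):
    """Sentence-level BLEU against one reference, on a 0-100 scale.

    Hypothetical helper for illustration: uniform n-gram weights,
    clipped (modified) precision, standard brevity penalty, no smoothing.
    """
    if not candidate:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts = ngrams(candidate, n)
        ref_counts = ngrams(reference, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        # Without smoothing, a single zero precision makes the score zero.
        if overlap == 0 or total == 0:
            return 0.0
        log_precisions.append(math.log(overlap / total))
    # The brevity penalty lowers the score of candidates shorter than the reference.
    c_len, r_len = len(candidate), len(reference)
    bp = 1.0 if c_len > r_len else math.exp(1 - r_len / c_len)
    # Geometric mean of the n-gram precisions, scaled to 0-100.
    return 100.0 * bp * math.exp(sum(log_precisions) / max_n)


reference = "a happy dog plays on the beach".split()
candidate = "a happy dog runs on the beach".split()
score_bigram = bleu(candidate, reference, max_n=2)
score_4gram = bleu(candidate, reference, max_n=4)
```

In this toy pair the candidate shares no 4-gram with the reference, so the unsmoothed 4-gram score collapses to zero while the bigram score does not. Corpus-level evaluations usually pool counts over many sentences or apply smoothing to avoid this effect.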
Source journal
PeerJ Computer Science (Computer Science: General Computer Science)
CiteScore: 6.10
Self-citation rate: 5.30%
Articles published: 332
Review time: 10 weeks
Journal description: PeerJ Computer Science is the new open access journal covering all subject areas in computer science, with the backing of a prestigious advisory board and more than 300 academic editors.