Cross-modal multi-label image classification modeling and recognition based on nonlinear

Nonlinear Engineering - Modeling and Application (IF 2.4; Q2, ENGINEERING, MECHANICAL) Pub Date: 2023-01-01 DOI: 10.1515/nleng-2022-0194
Shuping Yuan, Yang Chen, Cheng Ye, Mohammed Wasim Bhatt, Mhalasakant Saradeshmukh, Md. Shamim Hossain
Citations: 1

Abstract

Predicting the labels that co-occur in an image has recently become a popular strategy in multi-label image recognition. Previous work has concentrated on capturing label correlations but has neglected to properly fuse image features with label embeddings, which substantially affects the model's convergence efficiency and limits further improvement in multi-label image recognition accuracy. To better classify labeled training samples of the corresponding categories in image classification, a cross-modal multi-label image classification modeling and recognition method based on nonlinear is proposed. Separate multi-label classification models, one visual and one textual, are built on deep convolutional neural networks. The visual classification model uses natural images and simple single-label biomedical images to achieve heterogeneous and homogeneous transfer learning, capturing the general features of the general domain and the features specific to the biomedical domain, while the text classification model uses the description text of simple biomedical images to achieve homogeneous transfer learning. The experimental results show that the multi-label classification model combining the two modalities obtains a Hamming loss close to the best performance on the evaluation task, and the macro-averaged F1 score increases from 0.20 to 0.488, which is about 52.5% higher. The cross-modal multi-label image classification algorithm better alleviates overfitting in most classes and has better cross-modal retrieval performance. In addition, the effectiveness and rationality of the two cross-modal mapping techniques are verified.
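The abstract reports results as Hamming loss and macro-averaged F1 for a model that combines a visual and a text classifier. The sketch below shows how these two metrics are commonly computed for multi-label predictions, together with one simple way to combine per-label probabilities from two modalities. It is an illustration only: the abstract does not say how the two models are combined, so the weighted averaging in `fuse_probabilities` is an assumption and not the authors' method, and all function names, the 0.5 weight, the 0.5 threshold, and the toy data are hypothetical.

```python
from typing import List, Sequence

Labels = Sequence[Sequence[int]]    # one row per image, one 0/1 entry per label
Probs = Sequence[Sequence[float]]   # one row per image, one probability per label


def hamming_loss(y_true: Labels, y_pred: Labels) -> float:
    """Fraction of (image, label) slots that are predicted incorrectly."""
    total = sum(len(row) for row in y_true)
    wrong = sum(
        t != p
        for true_row, pred_row in zip(y_true, y_pred)
        for t, p in zip(true_row, pred_row)
    )
    return wrong / total


def macro_f1(y_true: Labels, y_pred: Labels) -> float:
    """Unweighted mean of the per-label F1 scores."""
    n_labels = len(y_true[0])
    scores: List[float] = []
    for j in range(n_labels):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 1)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 0 and p[j] == 1)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t[j] == 1 and p[j] == 0)
        denom = 2 * tp + fp + fn
        # Convention used here: a label with no positives and no predictions scores 0.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / n_labels


def fuse_probabilities(
    p_visual: Probs,
    p_text: Probs,
    weight: float = 0.5,      # hypothetical weight given to the visual model
    threshold: float = 0.5,   # hypothetical decision threshold
) -> List[List[int]]:
    """Assumed late fusion: weighted average of per-label probabilities, then threshold."""
    fused: List[List[int]] = []
    for v_row, t_row in zip(p_visual, p_text):
        fused.append([
            1 if weight * v + (1.0 - weight) * t >= threshold else 0
            for v, t in zip(v_row, t_row)
        ])
    return fused


if __name__ == "__main__":
    # Toy data: 4 images, 3 labels (values invented for illustration).
    y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
    y_pred = [[1, 0, 0], [0, 1, 0], [1, 0, 0], [0, 1, 1]]
    print("Hamming loss:", hamming_loss(y_true, y_pred))
    print("Macro F1:", round(macro_f1(y_true, y_pred), 4))
    print("Fused labels:", fuse_probabilities([[0.9, 0.2]], [[0.7, 0.6]]))
```

Macro averaging gives every label the same weight regardless of how often it occurs, so rare labels influence the score as much as frequent ones. Hamming loss instead counts individual label errors across all images, which is why the two metrics can move differently on imbalanced label sets.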
Source journal
CiteScore: 6.20
Self-citation rate: 3.60%
Articles published: 49
Review time: 44 weeks
About the journal: The Journal of Nonlinear Engineering aims to be a platform for sharing original research results on theoretical, experimental, practical, and applied nonlinear phenomena in engineering. It serves as a forum for exchanging ideas and applications of nonlinear problems across engineering disciplines. Articles are considered for publication if they explore nonlinearities in engineering systems by offering realistic mathematical modeling, using nonlinearity for new designs, stabilizing systems, explaining system behavior through nonlinearity, optimizing systems based on nonlinear interactions, or developing algorithms that harness nonlinear elements.
Latest articles from the journal
Study of time-fractional delayed differential equations via new integral transform-based variation iteration technique
Convolutional neural network for UAV image processing and navigation in tree plantations based on deep learning
Nonlinear adaptive sliding mode control with application to quadcopters
Equilibrium stability of dynamic duopoly Cournot game under heterogeneous strategies, asymmetric information, and one-way R&D spillovers
A versatile dynamic noise control framework based on computer simulation and modeling