PLZero: placeholder based approach to generalized zero-shot learning for multi-label recognition in chest radiographs

Complex & Intelligent Systems · IF 5.0 · JCR Q1 (Computer Science, Artificial Intelligence) · CAS Region 2 (Computer Science) · Pub Date: 2025-01-02 · DOI: 10.1007/s40747-024-01717-4
Chengrong Yang, Qiwen Jin, Fei Du, Jing Guo, Yujue Zhou
{"title":"PLZero:基于占位符的胸片多标签识别广义零学习方法","authors":"Chengrong Yang, Qiwen Jin, Fei Du, Jing Guo, Yujue Zhou","doi":"10.1007/s40747-024-01717-4","DOIUrl":null,"url":null,"abstract":"<p>By leveraging large-scale image-text paired data for pre-training, the model can efficiently learn the alignment between images and text, significantly advancing the development of zero-shot learning (ZSL) in the field of intelligent medical image analysis. However, the heterogeneity between cross-modalities, false negatives in image-text pairs, and domain shift phenomena pose challenges, making it difficult for existing methods to effectively learn the deep semantic relationships between images and text. To address these challenges, we propose a multi-label chest X-ray recognition generalized ZSL framework based on placeholder learning, termed PLZero. Specifically, we first introduce a jointed embedding space learning module (JESL) to encourage the model to better capture the diversity among different labels. Secondly, we propose a hallucinated class generation module (HCG), which generates hallucinated classes by feature diffusion and feature fusion based on the visual and semantic features of seen classes, using these hallucinated classes as placeholders for unseen classes. Finally, we propose a hallucinated class-based prototype learning module (HCPL), which leverages contrastive learning to control the distribution of hallucinated classes around seen classes without significant deviation from the original data, encouraging high dispersion of class prototypes for seen classes to create sufficient space for inserting unseen class samples. Extensive experiments demonstrate that our method exhibits sufficient generalization and achieves the best performance across three classic and challenging chest X-ray datasets: NIH Chest X-ray 14, CheXpert, and ChestX-Det10. 
Notably, our method outperforms others even when the number of unseen classes exceeds the experimental settings of other methods. The codes are available at: https://github.com/jinqiwen/PLZero.</p>","PeriodicalId":10524,"journal":{"name":"Complex & Intelligent Systems","volume":"27 1","pages":""},"PeriodicalIF":5.0000,"publicationDate":"2025-01-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"PLZero: placeholder based approach to generalized zero-shot learning for multi-label recognition in chest radiographs\",\"authors\":\"Chengrong Yang, Qiwen Jin, Fei Du, Jing Guo, Yujue Zhou\",\"doi\":\"10.1007/s40747-024-01717-4\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p>By leveraging large-scale image-text paired data for pre-training, the model can efficiently learn the alignment between images and text, significantly advancing the development of zero-shot learning (ZSL) in the field of intelligent medical image analysis. However, the heterogeneity between cross-modalities, false negatives in image-text pairs, and domain shift phenomena pose challenges, making it difficult for existing methods to effectively learn the deep semantic relationships between images and text. To address these challenges, we propose a multi-label chest X-ray recognition generalized ZSL framework based on placeholder learning, termed PLZero. Specifically, we first introduce a jointed embedding space learning module (JESL) to encourage the model to better capture the diversity among different labels. Secondly, we propose a hallucinated class generation module (HCG), which generates hallucinated classes by feature diffusion and feature fusion based on the visual and semantic features of seen classes, using these hallucinated classes as placeholders for unseen classes. 
Finally, we propose a hallucinated class-based prototype learning module (HCPL), which leverages contrastive learning to control the distribution of hallucinated classes around seen classes without significant deviation from the original data, encouraging high dispersion of class prototypes for seen classes to create sufficient space for inserting unseen class samples. Extensive experiments demonstrate that our method exhibits sufficient generalization and achieves the best performance across three classic and challenging chest X-ray datasets: NIH Chest X-ray 14, CheXpert, and ChestX-Det10. Notably, our method outperforms others even when the number of unseen classes exceeds the experimental settings of other methods. The codes are available at: https://github.com/jinqiwen/PLZero.</p>\",\"PeriodicalId\":10524,\"journal\":{\"name\":\"Complex & Intelligent Systems\",\"volume\":\"27 1\",\"pages\":\"\"},\"PeriodicalIF\":5.0000,\"publicationDate\":\"2025-01-02\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Complex & Intelligent Systems\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1007/s40747-024-01717-4\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Complex & Intelligent Systems","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1007/s40747-024-01717-4","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
Citations: 0

Source journal: Complex & Intelligent Systems (Computer Science, Artificial Intelligence)
CiteScore: 9.60 · Self-citation rate: 10.30% · Annual articles: 297
About the journal: Complex & Intelligent Systems aims to provide a forum for presenting and discussing novel approaches, tools and techniques meant for attaining a cross-fertilization between the broad fields of complex systems, computational simulation, and intelligent analytics and visualization. The transdisciplinary research that the journal focuses on will expand the boundaries of our understanding by investigating the principles and processes that underlie many of the most profound problems facing society today.