Study-level cross-modal retrieval of chest x-ray images and reports with adapter-based fine-tuning.

Physics in Medicine and Biology · Pub Date: 2025-02-13 · DOI: 10.1088/1361-6560/adaf05 · IF 3.4 · CAS Zone 3 (Medicine) · JCR Q2 (Engineering, Biomedical)
Yingjie Chen, Weihua Ou, Zhifan Gao, Lingge Lai, Yang Wu, Qianqian Chen
Published in Physics in Medicine and Biology, volume 70, issue 4. Publication type: Journal Article.

Abstract

Cross-modal retrieval is crucial for improving clinical decision-making and report generation. However, current methods mainly link single images with reports and ignore the need, in real clinical practice, to examine multiple images together. In addition, differences in imaging equipment, scanning parameters, geographic regions, and reporting styles produce inconsistent data distributions across chest x-ray images and reports, which challenges model reliability and generalization. To address these challenges, we propose a study-level cross-modal retrieval task for chest x-rays and reports that better meets clinical needs. The study-level approach performs cross-modal retrieval between the multiple images and the reports of a patient exam: given a set of study-level images or reports, our method retrieves the relevant reports or images from a database, which reflects clinical scenarios more realistically than traditional methods that link single images with reports. Furthermore, we introduce an adapter-based pre-training and fine-tuning method to enhance model generalization across diverse data distributions. Comprehensive experiments demonstrate the advantages of our method in both stages. In the pre-training stage, we compare our method with the latest techniques and show the effectiveness of integrating study-level image features with a vision transformer and aligning them with report features. In the fine-tuning stage, we compare adapter-based fine-tuning with the latest full-parameter fine-tuning methods and conduct ablation studies against common head-based and full-parameter fine-tuning, demonstrating our method's efficiency and its potential for practical clinical applications. In summary, this study proposes a study-level cross-modal retrieval task for matching chest x-ray images and reports and, by employing a pre-training and fine-tuning strategy with adapter modules, addresses data distribution inconsistency and improves retrieval performance.
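
The abstract does not give implementation details, so the following is only a minimal sketch of the two ideas it names: pooling several images of one study into a single query vector for report retrieval, and passing features through a small adapter. Everything in it is an assumption made for illustration. Mean pooling stands in for the paper's vision-transformer fusion, the residual bottleneck adapter with a zero-initialized up-projection is a common adapter pattern rather than the authors' confirmed design, and all function names and toy vectors are hypothetical.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def pool_study(image_embeddings):
    """Mean-pool per-image embeddings into one study-level vector.

    Assumption: simple mean pooling replaces the vision-transformer
    fusion described in the abstract.
    """
    n = len(image_embeddings)
    dim = len(image_embeddings[0])
    return [sum(e[i] for e in image_embeddings) / n for i in range(dim)]


def adapter(h, w_down, w_up):
    """Residual bottleneck adapter: h + W_up * relu(W_down * h).

    Assumption: a generic adapter layout, not the paper's exact module.
    """
    z = [max(0.0, sum(w * x for w, x in zip(row, h))) for row in w_down]
    delta = [sum(w * x for w, x in zip(row, z)) for row in w_up]
    return [a + b for a, b in zip(h, delta)]


def rank_reports(study_vec, report_vecs):
    """Return report indices sorted by cosine similarity, best match first."""
    q = l2_normalize(study_vec)
    scored = []
    for idx, report in enumerate(report_vecs):
        r = l2_normalize(report)
        scored.append((sum(a * b for a, b in zip(q, r)), idx))
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [idx for _, idx in scored]


# Toy data (hypothetical): one study with two views, three candidate reports.
study_images = [[1.0, 0.0, 0.2], [0.8, 0.2, 0.0]]
reports = [[0.0, 1.0, 0.0], [0.9, 0.1, 0.1], [0.1, 0.1, 1.0]]

# Adapter weights: 3 -> 1 -> 3, with the up-projection zero-initialized so
# the adapter starts as an identity mapping before any fine-tuning.
w_down = [[0.5, 0.5, 0.0]]
w_up = [[0.0], [0.0], [0.0]]

study_vec = adapter(pool_study(study_images), w_down, w_up)
ranking = rank_reports(study_vec, reports)
print(ranking)
```

In an adapter-based fine-tuning setup of this kind, only the small `w_down` and `w_up` matrices would be updated on the target dataset while the pre-trained encoders stay frozen, which is what makes the approach cheaper than full-parameter fine-tuning. Report-to-image retrieval would follow the same ranking step with the roles of query and candidates swapped.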

Source journal
Physics in Medicine and Biology (Medicine; Engineering, Biomedical)
CiteScore: 6.50 · Self-citation rate: 14.30% · Articles published: 409 · Review time: 2 months
Journal description: The development and application of theoretical, computational and experimental physics to medicine, physiology and biology. Topics covered are: therapy physics (including ionizing and non-ionizing radiation); biomedical imaging (e.g. x-ray, magnetic resonance, ultrasound, optical and nuclear imaging); image-guided interventions; image reconstruction and analysis (including kinetic modelling); artificial intelligence in biomedical physics and analysis; nanoparticles in imaging and therapy; radiobiology; radiation protection and patient dose monitoring; radiation dosimetry.