Study-level cross-modal retrieval of chest x-ray images and reports with adapter-based fine-tuning.

Physics in Medicine and Biology, Volume 70, Issue 4. Pub Date: 2025-02-13. DOI: 10.1088/1361-6560/adaf05. Impact factor: 3.3; Zone 3 (Medicine); JCR Q2 (Engineering, Biomedical).
Yingjie Chen, Weihua Ou, Zhifan Gao, Lingge Lai, Yang Wu, Qianqian Chen
Citations: 0

Abstract

Cross-modal retrieval is crucial for improving clinical decision-making and report generation. However, current techniques mainly link single images with reports and ignore the need to examine multiple images together in real clinical environments. In addition, differences in imaging equipment, scanning parameters, geographic regions, and reporting styles produce inconsistent data distributions across chest x-ray images and reports, which challenges model reliability and generalization. To address these challenges, we propose a study-level cross-modal retrieval task for chest x-ray images and reports that better meets clinical needs. Our study-level approach performs cross-modal retrieval between multiple images and reports from patient exams. Given a set of study-level images or reports, our method retrieves the relevant reports or images from a database, which reflects clinical scenarios more realistically than traditional methods that link single images with reports. Furthermore, we introduce an adapter-based pre-training and fine-tuning method to improve model generalization across diverse data distributions. Comprehensive experiments demonstrate the advantages of our method in both pre-training and fine-tuning. In the pre-training stage, we compare our method with the latest techniques and show the effectiveness of integrating study-level image features using a vision transformer and aligning them with report features. In the fine-tuning stage, we compare adapter-based fine-tuning with the latest full-parameter fine-tuning methods and conduct ablation studies against common head-based and full-parameter fine-tuning methods, demonstrating our method's efficiency and significant potential for practical clinical applications. In summary, this study proposes a study-level cross-modal retrieval task for matching chest x-ray images and reports and, by employing a pre-training and fine-tuning strategy with adapter modules, addresses data distribution inconsistency and improves retrieval performance.
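
The abstract does not give implementation details, so the following is only a minimal, dependency-free sketch of the general idea, under stated assumptions. Mean pooling stands in for the paper's vision-transformer fusion of study-level image features, and the adapter is a generic residual bottleneck layer, not necessarily the authors' design. All function names (`pool_study`, `adapter`, `rank_reports`, `recall_at_k`) and the toy embeddings are hypothetical.

```python
import math
from typing import List, Sequence

Vector = List[float]


def l2_normalize(v: Sequence[float]) -> Vector:
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def pool_study(image_embeddings: Sequence[Sequence[float]]) -> Vector:
    """Fuse the per-image embeddings of one study into a single vector.

    ASSUMPTION: mean pooling is a simple stand-in for a learned fusion
    module; it is not the method described in the paper.
    """
    n = len(image_embeddings)
    dim = len(image_embeddings[0])
    mean = [sum(e[i] for e in image_embeddings) / n for i in range(dim)]
    return l2_normalize(mean)


def adapter(h: Sequence[float],
            w_down: Sequence[Sequence[float]],
            w_up: Sequence[Sequence[float]]) -> Vector:
    """Generic bottleneck adapter: h + W_up @ relu(W_down @ h).

    Only w_down and w_up would be trained; the backbone stays frozen.
    """
    z = [max(0.0, sum(w * x for w, x in zip(row, h))) for row in w_down]
    delta = [sum(w * x for w, x in zip(row, z)) for row in w_up]
    return [x + d for x, d in zip(h, delta)]


def rank_reports(study_vec: Sequence[float],
                 report_vecs: Sequence[Sequence[float]]) -> List[int]:
    """Return report indices sorted by descending cosine similarity."""
    q = l2_normalize(study_vec)
    scores = [sum(a * b for a, b in zip(q, l2_normalize(r)))
              for r in report_vecs]
    return sorted(range(len(report_vecs)), key=lambda i: scores[i],
                  reverse=True)


def recall_at_k(rankings: Sequence[Sequence[int]],
                targets: Sequence[int], k: int) -> float:
    """Fraction of queries whose true match appears in the top k results."""
    hits = sum(1 for ranked, t in zip(rankings, targets) if t in ranked[:k])
    return hits / len(targets)


# Toy example: one study with two views, three candidate reports.
study_images = [[1.0, 0.0, 0.0], [0.8, 0.2, 0.0]]
reports = [[0.0, 1.0, 0.0], [1.0, 0.1, 0.0], [0.0, 0.0, 1.0]]

# Zero-initialised up-projection: the adapter starts as the identity map.
w_down = [[0.5, 0.0, 0.0]]        # 3 -> 1 bottleneck
w_up = [[0.0], [0.0], [0.0]]      # 1 -> 3

study_vec = adapter(pool_study(study_images), w_down, w_up)
ranking = rank_reports(study_vec, reports)
print(ranking)                               # [1, 0, 2]
print(recall_at_k([ranking], [1], k=1))      # 1.0
```

Initialising the up-projection to zero is a common choice for adapters because the adapted model then starts out identical to the pre-trained one, and fine-tuning only has to learn a small correction for the new data distribution.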
