DACG: Dual Attention and Context Guidance model for radiology report generation

IF 10.7 · Zone 1, Medicine · Q1 Computer Science, Artificial Intelligence · Medical Image Analysis · Pub Date: 2024-10-23 · DOI: 10.1016/j.media.2024.103377
Citations: 0

Abstract

Medical images are the essential basis on which radiologists write radiology reports, and they greatly help subsequent clinical treatment. Automatic radiology report generation aims to relieve clinicians of the burden of writing reports; it has received increasing attention this year and become an active research topic. However, the medical field poses two serious problems: visual and textual data bias, and long-text generation. First, abnormal areas account for only a small portion of a radiological image, and most radiology reports describe only normal findings. Second, generating longer and more accurate descriptive text remains a significant challenge in radiology report generation. In this paper, we propose a new Dual Attention and Context Guidance (DACG) model to alleviate visual and textual data bias and to promote the generation of long texts. A Dual Attention Module, comprising a Position Attention Block and a Channel Attention Block, extracts finer position and channel features from medical images, strengthening the encoder's image feature extraction. A Context Guidance Module integrates contextual information into the decoder and supervises the generation of long texts. Experimental results show that the proposed model achieves state-of-the-art performance on the most commonly used IU X-ray and MIMIC-CXR datasets. Further analysis also shows that the model improves reports through more accurate anomaly detection and more detailed descriptions. The source code is available at https://github.com/LangWY/DACG.
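
The abstract names a Position Attention Block and a Channel Attention Block but does not give their equations. As a rough illustration only, the sketch below shows one common way such blocks are formulated: self-attention over the spatial positions of a feature map, and self-attention over its channels, each with a residual connection. It is not taken from the DACG repository. The function names, the projection matrices `w_q`, `w_k`, `w_v` (standing in for learned 1x1 convolutions), the scalars `gamma` and `beta`, and the choice to sum the two branches are all assumptions made for this example.

```python
import numpy as np


def softmax(z, axis=-1):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)


def position_attention(x, w_q, w_k, w_v, gamma=1.0):
    """Self-attention over spatial positions (hypothetical formulation).

    x: feature map flattened to shape (C, N), where N = H * W.
    w_q, w_k: (C_r, C) projections; w_v: (C, C) projection.
    """
    q = w_q @ x                        # (C_r, N)
    k = w_k @ x                        # (C_r, N)
    v = w_v @ x                        # (C, N)
    attn = softmax(q.T @ k, axis=-1)   # (N, N): row i weights every position j
    return gamma * (v @ attn.T) + x    # residual connection keeps the input


def channel_attention(x, beta=1.0):
    """Self-attention over channels (hypothetical formulation); x is (C, N)."""
    attn = softmax(x @ x.T, axis=-1)   # (C, C): row i weights every channel j
    return beta * (attn @ x) + x       # residual connection keeps the input


def dual_attention(x, w_q, w_k, w_v, gamma=1.0, beta=1.0):
    """Fuse both branches by summation (an assumption, not the paper's rule)."""
    return position_attention(x, w_q, w_k, w_v, gamma) + channel_attention(x, beta)


# Usage with random data standing in for encoder features.
rng = np.random.default_rng(0)
C, H, W = 8, 4, 4
x = rng.standard_normal((C, H * W))
w_q = 0.1 * rng.standard_normal((C // 2, C))
w_k = 0.1 * rng.standard_normal((C // 2, C))
w_v = 0.1 * rng.standard_normal((C, C))
out = dual_attention(x, w_q, w_k, w_v)
print(out.shape)  # (8, 16)
```

The position branch lets each spatial location aggregate features from every other location, which is one plausible way to emphasize small abnormal regions; the channel branch reweights feature channels against each other. With `gamma` and `beta` set to zero, both branches reduce to the identity, so the module can start from the unmodified encoder features.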
Source journal
Medical Image Analysis (Engineering & Technology - Engineering: Biomedical)
CiteScore: 22.10
Self-citation rate: 6.40%
Articles published: 309
Review time: 6.6 months
Journal description: Medical Image Analysis serves as a platform for sharing new research findings in the realm of medical and biological image analysis, with a focus on applications of computer vision, virtual reality, and robotics to biomedical imaging challenges. The journal prioritizes the publication of high-quality, original papers contributing to the fundamental science of processing, analyzing, and utilizing medical and biological images. It welcomes approaches utilizing biomedical image datasets across all spatial scales, from molecular/cellular imaging to tissue/organ imaging.
Latest articles from this journal
Beyond strong labels: Weakly-supervised learning based on Gaussian pseudo labels for the segmentation of ellipse-like vascular structures in non-contrast CTs
A cross-attention-based deep learning approach for predicting functional stroke outcomes using 4D CTP imaging and clinical metadata
DACG: Dual Attention and Context Guidance model for radiology report generation
Simulation-free prediction of atrial fibrillation inducibility with the fibrotic kernel signature
An objective comparison of methods for augmented reality in laparoscopic liver resection by preoperative-to-intraoperative image fusion from the MICCAI2022 challenge