Large Language Model-Augmented Auto-Delineation of Treatment Target Volume in Radiation Therapy

Praveenbalaji Rajendran, Yong Yang, Thomas R. Niedermayr, Michael Gensheimer, Beth Beadle, Quynh-Thu Le, Lei Xing, Xianjin Dai
arXiv - PHYS - Medical Physics · Published 2024-07-10 · DOI: arxiv-2407.07296 · https://doi.org/arxiv-2407.07296

Abstract

Radiation therapy (RT) is one of the most effective treatments for cancer, and its success relies on accurate delineation of targets. However, target delineation is a comprehensive medical decision that currently relies purely on manual processes by human experts. Manual delineation is time-consuming, laborious, and subject to interobserver variation. Although advances in artificial intelligence (AI) techniques have significantly improved the auto-contouring of normal tissues, accurate delineation of RT target volumes remains a challenge. In this study, we propose a visual language model-based RT target volume auto-delineation network termed Radformer. Radformer uses a hierarchical vision transformer as its backbone and incorporates large language models to extract text-rich features from clinical data. We introduce a visual language attention module (VLAM) that integrates visual and linguistic features for language-aware visual encoding (LAVE). Radformer was evaluated on a dataset comprising 2985 patients with head-and-neck cancer who underwent RT. The Dice similarity coefficient (DSC), intersection over union (IOU), and 95th percentile Hausdorff distance (HD95) were used to evaluate model performance quantitatively. Our results demonstrate that Radformer achieves superior segmentation performance compared to other state-of-the-art models, validating its potential for adoption in RT practice.
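
The abstract names the three evaluation metrics but does not spell out their formulas. The sketch below uses the standard definitions of DSC and IOU on binary masks, written with plain Python sets of voxel indices so that each formula stays visible. The HD95 convention is an assumption, not taken from the paper: surface voxels are found by 6-connectivity, the percentile uses the nearest-rank rule, and the result is the larger of the two directed 95th-percentile distances. All function names (`dice`, `iou`, `surface`, `hd95`, `cube`) are hypothetical, and the paper's own implementation may differ.

```python
import math

def dice(pred, ref):
    """DSC = 2|A ∩ B| / (|A| + |B|); taken as 1.0 when both masks are empty."""
    total = len(pred) + len(ref)
    return 2 * len(pred & ref) / total if total else 1.0

def iou(pred, ref):
    """IOU = |A ∩ B| / |A ∪ B|; taken as 1.0 when both masks are empty."""
    union = len(pred | ref)
    return len(pred & ref) / union if union else 1.0

def surface(mask):
    """Voxels with at least one of their 6 face-neighbours outside the mask."""
    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    return {
        v for v in mask
        if any((v[0] + dx, v[1] + dy, v[2] + dz) not in mask for dx, dy, dz in steps)
    }

def percentile_nearest_rank(values, q):
    """q-th percentile by the nearest-rank rule (one of several conventions)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]

def hd95(pred, ref, spacing=(1.0, 1.0, 1.0)):
    """Assumed convention: max of the two directed 95th-percentile surface distances."""
    sp, sr = surface(pred), surface(ref)
    if not sp or not sr:
        raise ValueError("HD95 is undefined when either mask is empty")

    def dist(a, b):
        # Euclidean distance in physical units, given the voxel spacing per axis.
        return math.sqrt(sum(((a[i] - b[i]) * spacing[i]) ** 2 for i in range(3)))

    def directed(src, dst):
        # Brute-force nearest-surface distance; fine for small illustrative masks.
        return [min(dist(a, b) for b in dst) for a in src]

    return max(
        percentile_nearest_rank(directed(sp, sr), 95),
        percentile_nearest_rank(directed(sr, sp), 95),
    )

def cube(origin, size):
    """Hypothetical helper: a solid cube of voxels, used only for the toy example."""
    ox, oy, oz = origin
    return {
        (x, y, z)
        for x in range(ox, ox + size)
        for y in range(oy, oy + size)
        for z in range(oz, oz + size)
    }

# Toy example: a 4x4x4 reference cube and a prediction shifted by one voxel in x.
ref = cube((0, 0, 0), 4)
pred = cube((1, 0, 0), 4)
print(dice(pred, ref))  # 0.75
print(iou(pred, ref))   # 0.6
print(hd95(pred, ref))  # 1.0
```

In the toy example the one-voxel shift leaves 48 of 64 voxels overlapping, which gives DSC = 96/128 and IOU = 48/80, while the surface-based HD95 reports the one-voxel displacement directly. The overlap metrics and the distance metric answer different questions, which is presumably why the abstract reports both kinds. The brute-force distance search is quadratic in the number of surface voxels; a real pipeline would typically use an array library with a distance transform.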