IQAGPT: computed tomography image quality assessment with vision-language and ChatGPT models.

IF 3.2 | Zone 4 (Computer Science) | JCR Q2 (COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS) | Visual Computing for Industry Biomedicine and Art | Pub Date: 2024-08-05 | DOI: 10.1186/s42492-024-00171-w
Zhihao Chen, Bin Hu, Chuang Niu, Tao Chen, Yuxin Li, Hongming Shan, Ge Wang
Citations: 0

Abstract

Large language models (LLMs), such as ChatGPT, have demonstrated impressive capabilities in various tasks and attracted increasing interest as a natural language interface across many domains. Recently, large vision-language models (VLMs) that learn rich vision-language correlations from image-text pairs, such as BLIP-2 and GPT-4, have been intensively investigated. Despite these developments, the application of LLMs and VLMs to image quality assessment (IQA), particularly in medical imaging, remains unexplored. Such an application would be valuable for objective performance evaluation and could supplement or even replace radiologists' opinions. To this end, this study introduces IQAGPT, an innovative computed tomography (CT) IQA system that integrates an image-quality captioning VLM with ChatGPT to generate quality scores and textual reports. First, a CT-IQA dataset comprising 1,000 CT slices with diverse quality levels is professionally annotated and compiled for training and evaluation. To better leverage the capabilities of LLMs, the annotated quality scores are converted into semantically rich text descriptions using a prompt template. Second, the image-quality captioning VLM is fine-tuned on the CT-IQA dataset to generate quality descriptions. The captioning model fuses image and text features through cross-modal attention. Third, based on the quality descriptions, users verbally request ChatGPT to rate image-quality scores or produce radiological quality reports. The results demonstrate the feasibility of assessing image quality using LLMs. The proposed IQAGPT outperformed GPT-4 and CLIP-IQA, as well as multitask classification and regression models that rely solely on images.
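
The step of converting annotated scores into text with a prompt template, and then asking a chat model for a score or report, can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the criterion names (`noise`, `artifacts`), the 0 to 3 scale, the wording of the template, and the helper names `scores_to_caption` and `build_request`. The sketch only builds strings; it does not call any model.

```python
from typing import Dict

# Assumed 0-3 scale and wording; the abstract does not specify either.
LEVEL_WORDS = {0: "poor", 1: "fair", 2: "good", 3: "excellent"}


def scores_to_caption(scores: Dict[str, int]) -> str:
    """Fill a fixed template with one phrase per annotated criterion."""
    phrases = [f"{name} rated {LEVEL_WORDS[level]}" for name, level in scores.items()]
    return "CT image quality: " + ", ".join(phrases) + "."


def build_request(caption: str, task: str) -> str:
    """Compose the text a user might send to a chat model for one task."""
    instructions = {
        "score": "Rate the overall image quality as a single integer from 0 to 3.",
        "report": "Write a short radiological quality report.",
    }
    if task not in instructions:
        raise ValueError(f"unknown task: {task!r}")
    return f"Quality description: {caption}\n{instructions[task]}"


if __name__ == "__main__":
    # Hypothetical annotation for one CT slice.
    caption = scores_to_caption({"noise": 1, "artifacts": 3})
    print(caption)
    print(build_request(caption, "score"))
```

In a pipeline of the kind the abstract describes, captions like these would serve as training targets for the captioning VLM, and the caption generated for a new slice would be passed to the chat model in place of the hand-built one.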

Source journal: CiteScore 5.60; self-citation rate 0.00%; articles published 0