One novel transfer learning-based CLIP model combined with self-attention mechanism for differentiating the tumor-stroma ratio in pancreatic ductal adenocarcinoma
Authors: Hongfan Liao, Jiang Yuan, Chunhua Liu, Jiao Zhang, Yaying Yang, Hongwei Liang, Haotian Liu, Shanxiong Chen, Yongmei Li
Journal: Radiologia Medica (IF 9.7; JCR Q1, Radiology, Nuclear Medicine & Medical Imaging)
DOI: 10.1007/s11547-024-01902-y (https://doi.org/10.1007/s11547-024-01902-y)
Publication date: 2024-11-01 (Epub 2024-10-16)
Publication type: Journal Article
Citations: 0
Abstract
One novel transfer learning-based CLIP model combined with self-attention mechanism for differentiating the tumor-stroma ratio in pancreatic ductal adenocarcinoma.
Purpose: To develop a contrastive language-image pretraining (CLIP) model, based on transfer learning and combined with a self-attention mechanism, that predicts the tumor-stroma ratio (TSR) in pancreatic ductal adenocarcinoma (PDAC) from preoperative enhanced CT images, in order to characterize tumor biology for risk stratification and to guide feature fusion in artificial intelligence-based model representation.
Material and methods: This retrospective study included 207 patients with PDAC from three hospitals. Pathologists assessed TSR on surgical specimens, and cases were divided into high-TSR and low-TSR groups. The study developed a novel CLIP-adapter model that integrates the CLIP paradigm with a self-attention mechanism to make better use of features from multi-phase imaging, with the aim of improving the accuracy and reliability of TSR predictions. For comparison, a clinical-variable model, a traditional radiomics model, and deep learning models (ResNet50, ResNet101, ViT_Base_32, ViT_Base_16) were also constructed.
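The abstract does not give the fusion architecture, so the following is only a minimal sketch of how self-attention can combine per-phase feature vectors. It assumes each CT phase has already been encoded to a fixed-length vector, uses the vectors themselves as queries, keys, and values (no learned projections), and mean-pools the result. The function name `self_attention_fuse` and the toy feature values are hypothetical and are not taken from the study.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention_fuse(tokens):
    """Fuse equal-length feature vectors with scaled dot-product self-attention.

    Assumption: queries, keys and values are the tokens themselves; a trained
    model would normally apply learned projection matrices first.
    """
    d = len(tokens[0])
    scale = math.sqrt(d)
    attended = []
    for q in tokens:
        # Similarity of this token to every token, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in tokens]
        weights = softmax(scores)
        # Attention output: weighted average of the value vectors.
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, tokens)) for j in range(d)]
        )
    # Mean-pool the attended tokens into a single fused vector.
    n = len(attended)
    return [sum(tok[j] for tok in attended) / n for j in range(d)]


# Toy (made-up) feature vectors standing in for the two CT phases.
arterial = [0.2, 0.9, -0.4, 0.1]
venous = [0.3, 0.7, -0.1, 0.5]
fused = self_attention_fuse([arterial, venous])
print(fused)
```

Because every attention output is a weighted average of the inputs, each fused component stays between the corresponding arterial and venous values; a learned adapter would relax this by adding projections and a classification head.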
Results: The models showed significant efficacy in predicting TSR in PDAC. The CLIP-adapter model based on multi-phase feature fusion outperformed the same model based on either single phase (arterial or venous). The CLIP-adapter model also outperformed the traditional radiomics and deep learning models; CLIP-adapter_ViT_Base_32 performed best, achieving the highest AUC (0.978) and accuracy (0.921) in the test set. Kaplan-Meier survival analysis showed longer overall survival in patients with low TSR than in those with high TSR.
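For readers unfamiliar with the two reported metrics, here is a small sketch of how AUC and accuracy are commonly computed for a binary classifier. The AUC uses the pairwise (Mann-Whitney) definition, and accuracy assumes a 0.5 decision threshold, which the abstract does not state. The labels and scores are invented toy data, not results from the study.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly at the given score threshold."""
    preds = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Toy data: 1 = high TSR, 0 = low TSR (labels and scores are made up).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
auc = roc_auc(labels, scores)
acc = accuracy(labels, scores)
print(f"AUC = {auc:.3f}, accuracy = {acc:.3f}")
```

AUC is threshold-free and measures ranking quality, while accuracy depends on the chosen cut-off, which is why the two values can differ noticeably on the same predictions.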
Conclusion: The CLIP-adapter model designed in this study provides a safe and accurate method for predicting the TSR in PDAC. The feature fusion module, which combines multi-modal (image and text) and multi-phase (arterial and venous) inputs, significantly improves model performance.
About the journal:
Felice Perussia founded La radiologia medica in 1914. It is a peer-reviewed journal and serves as the official journal of the Italian Society of Medical and Interventional Radiology (SIRM). The primary purpose of the journal is to disseminate information related to Radiology, especially advancements in diagnostic imaging and related disciplines. La radiologia medica welcomes original research on both fundamental and clinical aspects of modern radiology, with a particular focus on diagnostic and interventional imaging techniques. It also covers topics such as radiotherapy, nuclear medicine, radiobiology, health physics, and artificial intelligence in the context of clinical implications. The journal includes various types of contributions such as original articles, review articles, editorials, short reports, and letters to the editor. With an esteemed Editorial Board and a selection of insightful reports, the journal is an indispensable resource for radiologists and professionals in related fields. Ultimately, La radiologia medica aims to serve as a platform for international collaboration and knowledge sharing within the radiological community.