Pub Date: 2024-09-23 DOI: 10.1186/s42492-024-00175-6
Dennis Hein, Staffan Holmin, Timothy Szczykutowicz, Jonathan S Maltz, Mats Danielsson, Ge Wang, Mats Persson
Deep learning (DL) has proven to be important for computed tomography (CT) image denoising. However, such models are usually trained under supervision, requiring paired data that may be difficult to obtain in practice. Diffusion models offer unsupervised means of solving a wide range of inverse problems via posterior sampling. In particular, using the estimated unconditional score function of the prior distribution, obtained via unsupervised learning, one can sample from the desired posterior via hijacking and regularization. However, due to the iterative solvers used, the number of function evaluations (NFE) required may be orders of magnitude larger than for single-step samplers. In this paper, we present a novel image denoising technique for photon-counting CT by extending the unsupervised approach to inverse problem solving to the case of Poisson flow generative models (PFGM)++. By hijacking and regularizing the sampling process, we obtain a single-step sampler, that is, NFE = 1. Our proposed method incorporates posterior sampling using diffusion models as a special case. We demonstrate that the added robustness afforded by the PFGM++ framework yields significant performance gains. Our results indicate competitive performance compared to popular supervised techniques (including consistency models, which are state-of-the-art diffusion-style models with NFE = 1), unsupervised techniques, and non-DL-based image denoising techniques, on clinical low-dose CT data and clinical images from a prototype photon-counting CT system developed by GE HealthCare.
Title: Noise suppression in photon-counting computed tomography using unsupervised Poisson flow generative models. Journal: Visual Computing for Industry Biomedicine and Art. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11420411/pdf/
Pub Date: 2024-09-23 DOI: 10.1186/s42492-024-00174-7
Natalie Hube, Melissa Reinelt, Kresimir Vidackovic, Michael Sedlmair
Avatars play a key role in how people interact within virtual environments, acting as their digital selves. There are many types of avatars, each serving the purpose of representing users or others in these immersive spaces. However, the optimal approach for these avatars remains unclear. Although consumer applications often use cartoon-like avatars, this trend is not as common in work settings. To gain a better understanding of the kinds of avatars people prefer, three studies were conducted involving both screen-based and virtual reality setups, looking into how social settings might affect the way people choose their avatars. Personalized avatars were created for 91 participants, including 71 employees in the automotive field and 20 participants not affiliated with the company. The research shows that work-type situations influence the chosen avatar. At the same time, no correlation was found between avatar choice and either the display medium used to show the avatar or the person's personality. Based on the findings, recommendations are made for future avatar representations in work environments, and implications and research questions are derived to guide future research.
Title: A study on the influence of situations on personal avatar characteristics. Journal: Visual Computing for Industry Biomedicine and Art. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11420416/pdf/
Fetal macrosomia is associated with maternal and newborn complications due to incorrect fetal weight estimation or an inappropriate choice of delivery mode. The early screening and evaluation of macrosomia in the third trimester can improve delivery outcomes and reduce complications. However, traditional clinical and ultrasound examinations face difficulties in obtaining accurate fetal measurements during the third trimester of pregnancy. This study aims to develop a comprehensive predictive model for detecting macrosomia using machine learning (ML) algorithms. The accuracy of macrosomia prediction using logistic regression, k-nearest neighbors, support vector machine, random forest (RF), XGBoost, and LightGBM algorithms was explored. Each approach was trained and validated using data from 3244 pregnant women at a hospital in southern China. The information gain method was employed to identify deterministic features associated with the occurrence of macrosomia. The performance of the six ML algorithms was compared using the recall and area under the curve evaluation metrics. To develop an efficient prediction model, two sets of experiments were conducted, based on ultrasound examination records from 1-7 days and 8-14 days prior to delivery. The ensemble model, comprising the RF, XGBoost, and LightGBM algorithms, showed encouraging results. For each experimental group, the proposed ensemble model outperformed other ML approaches and the traditional Hadlock formula. The experimental results indicate that, with the most risk-relevant features, the ML algorithms presented in this study can predict macrosomia and assist obstetricians in selecting more appropriate delivery modes.
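The information gain criterion mentioned in this abstract can be illustrated with a minimal sketch. It assumes discrete feature values and uses made-up toy data; the function names are hypothetical, and the study's actual feature-selection pipeline is not described in the abstract. Information gain is the entropy of the labels minus the weighted entropy that remains after splitting on a feature.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(feature_values, labels):
    """Label entropy minus the weighted label entropy after splitting on a discrete feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


# Toy, made-up data: 1 = macrosomia, 0 = not macrosomia.
labels = [1, 1, 0, 0]
informative = ["high", "high", "low", "low"]  # separates the classes perfectly
uninformative = ["a", "b", "a", "b"]          # tells us nothing about the label

print(information_gain(informative, labels))
print(information_gain(uninformative, labels))
```

A feature that separates the classes perfectly recovers the full label entropy (1 bit here), while an unrelated feature yields zero gain; ranking features by this value is one common way to pick the most risk-relevant ones.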
Title: Machine learning approach for the prediction of macrosomia. Authors: Xiaochen Gu, Ping Huang, Xiaohua Xu, Zhicheng Zheng, Kaiju Luo, Yujie Xu, Yizhen Jia, Yongjin Zhou. Pub Date: 2024-08-27 DOI: 10.1186/s42492-024-00172-9. Journal: Visual Computing for Industry Biomedicine and Art. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11349957/pdf/
Pub Date: 2024-08-21 DOI: 10.1186/s42492-024-00173-8
Qiushi Nie, Xiaoqing Zhang, Yan Hu, Mingdao Gong, Jiang Liu
Medical image registration is vital for disease diagnosis and treatment because it merges the diverse information in images that may be captured at different times, from different angles, or with different modalities. Although several surveys have reviewed the development of medical image registration, they have not systematically summarized the existing medical image registration methods. To this end, a comprehensive review of these methods is provided from traditional and deep-learning-based perspectives, aiming to help audiences quickly understand the development of medical image registration. In particular, we review recent advances in retinal image registration, which has not attracted much attention. In addition, current challenges in retinal image registration are discussed, and insights and prospects for future research are provided.
Title: Medical image registration and its application in retinal images: a review. Journal: Visual Computing for Industry Biomedicine and Art. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11339199/pdf/
Pub Date: 2024-08-05 DOI: 10.1186/s42492-024-00171-w
Zhihao Chen, Bin Hu, Chuang Niu, Tao Chen, Yuxin Li, Hongming Shan, Ge Wang
Large language models (LLMs), such as ChatGPT, have demonstrated impressive capabilities in various tasks and attracted increasing interest as a natural language interface across many domains. Recently, large vision-language models (VLMs) that learn rich vision-language correlation from image-text pairs, like BLIP-2 and GPT-4, have been intensively investigated. However, despite these developments, the application of LLMs and VLMs in image quality assessment (IQA), particularly in medical imaging, remains unexplored. This is valuable for objective performance evaluation and potential supplement or even replacement of radiologists' opinions. To this end, this study introduces IQAGPT, an innovative computed tomography (CT) IQA system that integrates image-quality captioning VLM with ChatGPT to generate quality scores and textual reports. First, a CT-IQA dataset comprising 1,000 CT slices with diverse quality levels is professionally annotated and compiled for training and evaluation. To better leverage the capabilities of LLMs, the annotated quality scores are converted into semantically rich text descriptions using a prompt template. Second, the image-quality captioning VLM is fine-tuned on the CT-IQA dataset to generate quality descriptions. The captioning model fuses image and text features through cross-modal attention. Third, based on the quality descriptions, users verbally request ChatGPT to rate image-quality scores or produce radiological quality reports. Results demonstrate the feasibility of assessing image quality using LLMs. The proposed IQAGPT outperformed GPT-4 and CLIP-IQA, as well as multitask classification and regression models that solely rely on images.
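The step of converting annotated quality scores into text descriptions with a prompt template can be pictured roughly as follows. This is only an illustrative sketch: the 0-4 scale, the wording of the levels, and the function name are all hypothetical, since the abstract does not give the actual template or scoring scheme.

```python
# Hypothetical 0-4 quality scale and wording; not taken from the paper.
LEVELS = ["very poor", "poor", "fair", "good", "excellent"]


def score_to_description(score: int) -> str:
    """Turn an integer quality score into a short caption-style sentence."""
    if not 0 <= score < len(LEVELS):
        raise ValueError(f"score must be in 0..{len(LEVELS) - 1}, got {score}")
    return f"The quality of this CT image is {LEVELS[score]}."


print(score_to_description(3))
```

Text targets of this kind give a captioning model something language-like to learn, instead of a bare number.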
Title: IQAGPT: computed tomography image quality assessment with vision-language and ChatGPT models. Journal: Visual Computing for Industry Biomedicine and Art. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11300764/pdf/
Pub Date: 2024-07-17 DOI: 10.1186/s42492-024-00169-4
Muhammad Ramzan, Jinfang Sheng, Muhammad Usman Saeed, Bin Wang, Faisal Z Duraihem
This study addresses the critical issue of anemia detection using machine learning (ML) techniques. Although a widespread blood disorder with significant health implications, anemia often remains undetected. This necessitates timely and efficient diagnostic methods, as traditional approaches that rely on manual assessment are time-consuming and subjective. The present study explored the application of ML - particularly classification models, such as logistic regression, decision trees, random forest, support vector machines, Naïve Bayes, and k-nearest neighbors - in conjunction with innovative models incorporating attention modules and spatial attention to detect anemia. The proposed models demonstrated promising results, achieving high accuracy, precision, recall, and F1 scores for both textual and image datasets. In addition, an integrated approach that combines textual and image data was found to outperform the individual modalities. Specifically, the proposed AlexNet Multiple Spatial Attention model achieved an exceptional accuracy of 99.58%, emphasizing its potential to revolutionize automated anemia detection. The results of ablation studies confirm the significance of key components - including the blue-green-red, multiple, and spatial attentions - in enhancing model performance. Overall, this study presents a comprehensive and innovative framework for noninvasive anemia detection, contributing valuable insights to the field.
Title: Revolutionizing anemia detection: integrative machine learning models and advanced attention mechanisms. Journal: Visual Computing for Industry Biomedicine and Art. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11255163/pdf/
Pub Date: 2024-07-08 DOI: 10.1186/s42492-024-00168-5
Yufei Li, Yufei Xin, Xinni Li, Yinrui Zhang, Cheng Liu, Zhengwen Cao, Shaoyi Du, Lin Wang
Pneumonia is a serious disease that can be fatal, particularly among children and the elderly. The accuracy of pneumonia diagnosis can be improved by combining artificial-intelligence technology with X-ray imaging. This study proposes X-ODFCANet, which addresses the issues of low accuracy and excessive parameters in existing deep-learning-based pneumonia-classification methods. This network incorporates a feature coordination attention module and an omni-dimensional dynamic convolution (ODConv) module, leveraging the residual module for feature extraction from X-ray images. The feature coordination attention module utilizes two one-dimensional feature encoding processes to aggregate feature information from different spatial directions. Additionally, the ODConv module extracts and fuses feature information in four dimensions: the spatial dimension of the convolution kernel, the number of input channels, the number of output channels, and the number of convolution kernels. The experimental results demonstrate that the proposed method effectively improves the accuracy of pneumonia classification, which is 3.77% higher than that of ResNet18. The model has 4.45M parameters, a reduction by a factor of approximately 2.5. The code is available at https://github.com/limuni/X-ODFCANET .
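The idea of two one-dimensional encodings along different spatial directions can be sketched for a single channel as below. The sketch assumes the encoding is plain average pooling along each axis, as in common coordinate-attention designs; the abstract does not confirm this, and the function name and toy feature map are hypothetical. The authors' actual module is in their repository.

```python
def directional_pool(feature_map):
    """Average one channel along each spatial axis.

    feature_map is a list of rows (height x width). Returns
    (row_means, col_means): one value per row and one value per column.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    row_means = [sum(row) / width for row in feature_map]
    col_means = [
        sum(feature_map[r][c] for r in range(height)) / height
        for c in range(width)
    ]
    return row_means, col_means


# Toy 2 x 3 feature map with made-up values.
fmap = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]
row_means, col_means = directional_pool(fmap)
print(row_means)  # one descriptor per row (vertical position)
print(col_means)  # one descriptor per column (horizontal position)
```

Each descriptor keeps position along one axis while summarizing the other, which is what lets an attention module weight features by location in both directions.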
Title: Omni-dimensional dynamic convolution feature coordinate attention network for pneumonia classification. Journal: Visual Computing for Industry Biomedicine and Art. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11231110/pdf/
Pub Date: 2024-07-05 DOI: 10.1186/s42492-024-00167-6
Yuwei Liu, Litao Zhao, Jie Bao, Jian Hou, Zhaozhao Jing, Songlu Liu, Xuanhao Li, Zibing Cao, Boyu Yang, Junkang Shen, Ji Zhang, Libiao Ji, Zhen Kang, Chunhong Hu, Liang Wang, Jiangang Liu
Active surveillance (AS) is the primary strategy for managing patients with low or favorable-intermediate risk prostate cancer (PCa). Identifying patients who may benefit from AS relies on unpleasant prostate biopsies, which entail the risk of bleeding and infection. In the current study, we aimed to develop a radiomics model based on prostate magnetic resonance images to identify AS candidates non-invasively. A total of 956 PCa patients with complete biopsy reports from six hospitals were included in the current multicenter retrospective study. The National Comprehensive Cancer Network (NCCN) guidelines were used as the reference standard to determine AS candidacy. To discriminate between AS and non-AS candidates, five radiomics models were developed, one for each of five classifiers: eXtreme Gradient Boosting (XGBoost; the XGB-AS model), logistic regression (LR), random forest (RF), adaptive boosting (AdaBoost), and decision tree (DT). The models were externally validated using three-fold cross-center validation. Area under the receiver operating characteristic curve (AUC), accuracy (ACC), sensitivity (SEN), and specificity (SPE) were calculated to evaluate the performance of these models. XGB-AS exhibited an average AUC of 0.803, ACC of 0.693, SEN of 0.668, and SPE of 0.841, showing better overall performance than the other radiomics models. Additionally, the XGB-AS model showed promising performance in identifying AS candidates among the intermediate-risk cases and among the ambiguous cases with diagnostic discordance between the NCCN guidelines and the Prostate Imaging-Reporting and Data System assessment. 
These results suggest that the XGB-AS model has the potential to help identify patients who are suitable for AS and allow non-invasive monitoring of patients on AS, thereby reducing the number of annual biopsies and the associated risks of bleeding and infection.
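The four evaluation metrics named in this abstract can be computed from labels and scores as in the following minimal sketch. The data are made up, the 0.5 decision threshold is an assumption, and the function names are hypothetical; none of this reproduces the study's numbers. AUC is computed here as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as half.

```python
def classification_metrics(y_true, y_pred):
    """Return (accuracy, sensitivity, specificity) for binary labels, with 1 as positive."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    acc = (tp + tn) / len(y_true)
    sen = tp / (tp + fn)  # true positive rate
    spe = tn / (tn + fp)  # true negative rate
    return acc, sen, spe


def auc(y_true, scores):
    """AUC via pairwise ranking of positives against negatives (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, made-up data: 1 = AS candidate, 0 = non-AS candidate.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5

acc, sen, spe = classification_metrics(y_true, y_pred)
print(acc, sen, spe, auc(y_true, scores))
```

ACC, SEN, and SPE depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why a model can have a higher AUC than accuracy.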
Title: Non-invasively identifying candidates of active surveillance for prostate cancer using magnetic resonance imaging radiomics. Journal: Visual Computing for Industry Biomedicine and Art. Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11226574/pdf/
Pub Date : 2024-06-17 DOI: 10.1186/s42492-024-00166-7
Taofik Ahmed Suleiman, Daniel Tweneboah Anyimadu, Andrew Dwi Permana, Hsham Abdalgny Abdalwhab Ngim, Alessandra Scotto di Freca
Skin lesion classification plays a crucial role in the early detection and diagnosis of various skin conditions. Recent advances in computer-aided diagnostic techniques have been instrumental in timely intervention, thereby improving patient outcomes, particularly in rural communities lacking specialized expertise. Despite the widespread adoption of convolutional neural networks (CNNs) in skin disease detection, their effectiveness has been hindered by the limited size and data imbalance of publicly accessible skin lesion datasets. In this context, a two-step hierarchical binary classification approach is proposed utilizing hybrid machine and deep learning (DL) techniques. Experiments conducted on the International Skin Imaging Collaboration (ISIC 2017) dataset demonstrate the effectiveness of the hierarchical approach in handling large class imbalances. Specifically, employing DenseNet121 (DNET) as a feature extractor and random forest (RF) as a classifier yielded the most promising results, achieving a balanced multiclass accuracy (BMA) of 91.07% compared to the pure deep-learning model (end-to-end DNET) with a BMA of 88.66%. The RF ensemble exhibited significantly greater efficiency than other machine-learning classifiers in aiding DL to address the challenge of learning with limited data. Furthermore, the implemented predictive hybrid hierarchical model demonstrated enhanced performance while significantly reducing computational time, indicating its potential efficiency in real-world applications for the classification of skin lesions.
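The abstract above does not spell out how the two binary steps are arranged or how BMA is computed, so the following is only a minimal sketch under stated assumptions: BMA is taken to be the mean of per-class recalls, the first step separates one class from the rest, and the second step splits the remainder. The labels `class_A`, `class_B`, `class_C` and the threshold rules standing in for trained classifiers are hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from typing import Callable, Hashable, Sequence, Tuple


def balanced_multiclass_accuracy(
    y_true: Sequence[Hashable], y_pred: Sequence[Hashable]
) -> float:
    """Mean of per-class recalls (assumed definition of BMA).

    Every class contributes equally, however few samples it has.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    total = defaultdict(int)    # samples per true class
    correct = defaultdict(int)  # correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    return sum(correct[c] / total[c] for c in total) / len(total)


def hierarchical_predict(
    sample: Tuple[float, float],
    step1: Callable[[Tuple[float, float]], bool],
    step2: Callable[[Tuple[float, float]], bool],
) -> str:
    """Two-step binary decision (hypothetical arrangement).

    step1 decides class_A versus the rest; step2 splits the rest
    into class_B and class_C.
    """
    if step1(sample):
        return "class_A"
    return "class_B" if step2(sample) else "class_C"


# Hypothetical stand-ins for two trained binary classifiers.
def is_a(s: Tuple[float, float]) -> bool:
    return s[0] > 0.5


def is_b(s: Tuple[float, float]) -> bool:
    return s[1] > 0.5


samples = [(0.9, 0.1), (0.2, 0.8), (0.1, 0.3)]
predictions = [hierarchical_predict(s, is_a, is_b) for s in samples]

# Imbalanced toy labels: always predicting the majority class gives
# 80% plain accuracy but only 50% balanced accuracy.
y_true = ["maj"] * 8 + ["min"] * 2
y_pred = ["maj"] * 10
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bma = balanced_multiclass_accuracy(y_true, y_pred)
```

The toy labels at the end show why a balanced metric suits imbalanced data: a classifier that ignores the minority class is penalised under BMA but not under plain accuracy.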
{"title":"Two-step hierarchical binary classification of cancerous skin lesions using transfer learning and the random forest algorithm.","authors":"Taofik Ahmed Suleiman, Daniel Tweneboah Anyimadu, Andrew Dwi Permana, Hsham Abdalgny Abdalwhab Ngim, Alessandra Scotto di Freca","doi":"10.1186/s42492-024-00166-7","DOIUrl":"10.1186/s42492-024-00166-7","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art"},"PeriodicalIF":2.8,"publicationDate":"2024-06-17","publicationTypes":"Journal Article","isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11183002/pdf/","platform":"Semanticscholar","paperid":"141331925","RegionNum":4,"RegionCategory":"Computer Science","PMCID":"OA","Total":0}