
Latest publications from Visual Computing for Industry Biomedicine and Art

Explainable machine learning framework for cataracts recognition using visual features.
IF 3.2 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Vol. 8(1): 3 | Pub Date: 2025-01-17 | DOI: 10.1186/s42492-024-00183-6
Xiao Wu, Lingxi Hu, Zunjie Xiao, Xiaoqing Zhang, Risa Higashita, Jiang Liu

Cataract is the leading cause of blindness and visual impairment globally. Deep neural networks (DNNs) have achieved promising cataract recognition performance on anterior segment optical coherence tomography (AS-OCT) images; however, their poor explainability limits their clinical application. In contrast, visual features extracted from original AS-OCT images and their transformed forms (e.g., AS-OCT-based histograms) are readily explainable but have not been fully exploited. Motivated by these observations, an explainable machine learning framework for automatically recognizing cataract severity levels from AS-OCT images was proposed, consisting of three stages: visual feature extraction; feature importance explanation and selection; and recognition. First, intensity-histogram and intensity-based statistical methods are applied to extract visual features from the original AS-OCT images and AS-OCT-based histograms. Subsequently, the SHapley Additive exPlanations (SHAP) and Pearson correlation coefficient methods are applied to analyze feature importance and select significant visual features. Finally, an ensemble multi-class ridge regression method is applied to recognize cataract severity levels based on the selected visual features. Experiments on a clinical AS-OCT-NC dataset demonstrate that the proposed framework not only achieves performance competitive with DNNs but also offers good explainability, meeting the requirements of clinical diagnostic practice.
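To make the three stages concrete, the sketch below runs the same flow on synthetic data: histogram-plus-statistics features, importance-based selection, and an ensemble of ridge classifiers. The synthetic images, 32-bin histogram, top-10 cutoff, and the use of scikit-learn's permutation importance as a stand-in for SHAP are assumptions, not the authors' configuration.

```python
import numpy as np
from scipy.stats import skew, kurtosis
from sklearn.ensemble import BaggingClassifier
from sklearn.inspection import permutation_importance
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(120, 64, 64))   # stand-in AS-OCT crops
y = rng.integers(0, 4, size=120)                    # stand-in severity levels

def visual_features(img):
    """Stage 1: intensity histogram plus intensity-based statistics."""
    hist, _ = np.histogram(img, bins=32, range=(0, 256), density=True)
    stats = [img.mean(), img.std(), skew(img.ravel()), kurtosis(img.ravel())]
    return np.concatenate([hist, stats])

X = np.stack([visual_features(im) for im in images])
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Stage 2: rank features (permutation importance as a stand-in for SHAP)
# and keep the top 10; a Pearson-correlation filter could prune them further.
probe = RidgeClassifier().fit(X_tr, y_tr)
imp = permutation_importance(probe, X_tr, y_tr, n_repeats=5, random_state=0)
top = np.argsort(imp.importances_mean)[-10:]

# Stage 3: ensemble multi-class ridge recognition on the selected features.
clf = BaggingClassifier(RidgeClassifier(), n_estimators=10, random_state=0)
clf.fit(X_tr[:, top], y_tr)
print("holdout accuracy:", clf.score(X_te[:, top], y_te))
```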

{"title":"Explainable machine learning framework for cataracts recognition using visual features.","authors":"Xiao Wu, Lingxi Hu, Zunjie Xiao, Xiaoqing Zhang, Risa Higashita, Jiang Liu","doi":"10.1186/s42492-024-00183-6","DOIUrl":"10.1186/s42492-024-00183-6","url":null,"abstract":"<p><p>Cataract is the leading ocular disease of blindness and visual impairment globally. Deep neural networks (DNNs) have achieved promising cataracts recognition performance based on anterior segment optical coherence tomography (AS-OCT) images; however, they have poor explanations, limiting their clinical applications. In contrast, visual features extracted from original AS-OCT images and their transform forms (e.g., AS-OCT-based histograms) have good explanations but have not been fully exploited. Motivated by these observations, an explainable machine learning framework to recognize cataracts severity levels automatically using AS-OCT images was proposed, consisting of three stages: visual feature extraction, feature importance explanation and selection, and recognition. First, the intensity histogram and intensity-based statistical methods are applied to extract visual features from original AS-OCT images and AS-OCT-based histograms. Subsequently, the SHapley Additive exPlanations and Pearson correlation coefficient methods are applied to analyze the feature importance and select significant visual features. Finally, an ensemble multi-class ridge regression method is applied to recognize the cataracts severity levels based on the selected visual features. Experiments on a clinical AS-OCT-NC dataset demonstrate that the proposed framework not only achieves competitive performance through comparisons with DNNs, but also has a good explanation ability, meeting the requirements of clinical diagnostic practice.</p>","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art","volume":"8 1","pages":"3"},"PeriodicalIF":3.2,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11748710/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143012990","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Harmonized technical standard test methods for quality evaluation of medical fluorescence endoscopic imaging systems.
IF 3.2 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Vol. 8(1): 2 | Pub Date: 2025-01-10 | DOI: 10.1186/s42492-024-00184-5
Bodong Liu, Zhaojun Guo, Pengfei Yang, Jian'an Ye, Kunshan He, Shen Gao, Chongwei Chi, Yu An, Jie Tian

Fluorescence endoscopy technology utilizes a light source of a specific wavelength to excite the fluorescence signals of biological tissues. This capability is extremely valuable for the early detection and precise diagnosis of pathological changes. Identifying a suitable experimental approach and metric for objectively and quantitatively assessing the imaging quality of fluorescence endoscopy is imperative to enhance the image evaluation criteria of fluorescence imaging technology. In this study, we propose a new set of standards for fluorescence endoscopy technology to evaluate the optical performance and image quality of fluorescence imaging objectively and quantitatively. This comprehensive set of standards encompasses fluorescence test models and imaging quality assessment protocols to ensure that the performance of fluorescence endoscopy systems meets the required standards. In addition, it aims to enhance the accuracy and uniformity of the results by standardizing testing procedures. The formulation of pivotal metrics and testing methodologies is anticipated to facilitate direct quantitative comparisons of the performance of fluorescence endoscopy devices. This advancement is expected to foster the harmonization of clinical and preclinical evaluations using fluorescence endoscopy imaging systems, thereby improving diagnostic precision and efficiency.
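The concrete test models and metrics are defined in the paper itself; as an illustration of the kind of objective, quantitative check such standards formalize, the sketch below computes a contrast-to-noise ratio (CNR) between a fluorescent test-target region and the background. The synthetic image and ROI coordinates are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.normal(100, 5, size=(256, 256))   # noisy background of a test image
img[96:160, 96:160] += 40                   # fluorescent test-target region

target = img[96:160, 96:160]                # assumed target ROI
background = img[:64, :64]                  # assumed background ROI
cnr = (target.mean() - background.mean()) / background.std()
print(f"CNR = {cnr:.1f}")
```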

{"title":"Harmonized technical standard test methods for quality evaluation of medical fluorescence endoscopic imaging systems.","authors":"Bodong Liu, Zhaojun Guo, Pengfei Yang, Jian'an Ye, Kunshan He, Shen Gao, Chongwei Chi, Yu An, Jie Tian","doi":"10.1186/s42492-024-00184-5","DOIUrl":"10.1186/s42492-024-00184-5","url":null,"abstract":"<p><p>Fluorescence endoscopy technology utilizes a light source of a specific wavelength to excite the fluorescence signals of biological tissues. This capability is extremely valuable for the early detection and precise diagnosis of pathological changes. Identifying a suitable experimental approach and metric for objectively and quantitatively assessing the imaging quality of fluorescence endoscopy is imperative to enhance the image evaluation criteria of fluorescence imaging technology. In this study, we propose a new set of standards for fluorescence endoscopy technology to evaluate the optical performance and image quality of fluorescence imaging objectively and quantitatively. This comprehensive set of standards encompasses fluorescence test models and imaging quality assessment protocols to ensure that the performance of fluorescence endoscopy systems meets the required standards. In addition, it aims to enhance the accuracy and uniformity of the results by standardizing testing procedures. The formulation of pivotal metrics and testing methodologies is anticipated to facilitate direct quantitative comparisons of the performance of fluorescence endoscopy devices. This advancement is expected to foster the harmonization of clinical and preclinical evaluations using fluorescence endoscopy imaging systems, thereby improving diagnostic precision and efficiency.</p>","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art","volume":"8 1","pages":"2"},"PeriodicalIF":3.2,"publicationDate":"2025-01-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11723869/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142956034","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Advancing breast cancer diagnosis: token vision transformers for faster and accurate classification of histopathology images.
IF 3.2 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Vol. 8(1): 1 | Pub Date: 2025-01-08 | DOI: 10.1186/s42492-024-00181-8
Mouhamed Laid Abimouloud, Khaled Bensid, Mohamed Elleuch, Mohamed Ben Ammar, Monji Kherallah

The vision transformer (ViT) architecture, with its multi-head attention layers, has been widely adopted in computer-aided diagnosis tasks owing to its effectiveness in processing medical image information. However, ViTs are complex models that require high-performance GPUs or CPUs for efficient training and deployment in real-world medical diagnostic devices, rendering them more intricate than convolutional neural networks (CNNs). This difficulty is compounded in histopathology image analysis, where images are both limited in number and complex. In response to these challenges, this study proposes TokenMixer, a hybrid architecture that combines the strengths of CNNs and ViTs. It aims to enhance feature extraction and classification accuracy with shorter training time and fewer parameters by minimizing the number of input patches used during training, while tokenizing input patches with convolutional layers and processing them with encoder transformer layers across all network layers for fast and accurate breast cancer tumor subtype classification. The TokenMixer mechanism is inspired by the ConvMixer and TokenLearner models. First, the ConvMixer component dynamically generates spatial attention maps using convolutional layers, enabling the extraction of patches from input images and minimizing the number of input patches used in training. Second, the TokenLearner component extracts relevant regions from the selected input patches, tokenizes them to improve feature extraction, and trains all tokenized patches in an encoder transformer network. We evaluated the TokenMixer model on the public BreakHis dataset, comparing it with ViT-based and other state-of-the-art methods. Our approach achieved strong results for both binary and multi-class classification of breast cancer subtypes across magnification levels (40×, 100×, 200×, 400×), with accuracies of 97.02% for binary and 93.29% for multi-class classification and decision times of 391.71 and 1173.56 s, respectively. These results highlight the potential of our hybrid deep ViT-CNN architecture for advancing tumor classification in histopathological images. The source code is available at https://github.com/abimouloud/TokenMixer .
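As a rough structural illustration of such a hybrid, the PyTorch sketch below chains a ConvMixer-style patch embedding and mixing block with a TokenLearner-style module that reduces the feature map to a few learned tokens before a small transformer encoder. All layer sizes, the token count, the single mixing block, and the classifier head are illustrative assumptions, not the authors' configuration.

```python
import torch
import torch.nn as nn

class TokenLearner(nn.Module):
    """Reduce a feature-map grid to a small set of learned tokens."""
    def __init__(self, dim, num_tokens=8):
        super().__init__()
        self.attn = nn.Conv2d(dim, num_tokens, kernel_size=1)  # one spatial map per token

    def forward(self, x):                                # x: (B, C, H, W)
        maps = self.attn(x).flatten(2).softmax(dim=-1)   # (B, T, H*W) attention maps
        feats = x.flatten(2)                             # (B, C, H*W)
        return torch.einsum("bth,bch->btc", maps, feats) # (B, T, C) learned tokens

class TokenMixerSketch(nn.Module):
    def __init__(self, dim=128, num_classes=8):
        super().__init__()
        self.patch = nn.Conv2d(3, dim, kernel_size=7, stride=7)  # ConvMixer-style patch embedding
        self.mix = nn.Sequential(  # one depthwise/pointwise ConvMixer block
            nn.Conv2d(dim, dim, 5, padding=2, groups=dim), nn.GELU(), nn.BatchNorm2d(dim),
            nn.Conv2d(dim, dim, 1), nn.GELU(), nn.BatchNorm2d(dim),
        )
        self.tokens = TokenLearner(dim)
        enc = nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc, num_layers=2)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, x):
        x = self.mix(self.patch(x))         # (B, dim, 32, 32) for 224x224 input
        t = self.tokens(x)                  # (B, 8, dim) tokenized patches
        return self.head(self.encoder(t).mean(dim=1))

logits = TokenMixerSketch()(torch.randn(2, 3, 224, 224))
print(logits.shape)  # torch.Size([2, 8])
```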

{"title":"Advancing breast cancer diagnosis: token vision transformers for faster and accurate classification of histopathology images.","authors":"Mouhamed Laid Abimouloud, Khaled Bensid, Mohamed Elleuch, Mohamed Ben Ammar, Monji Kherallah","doi":"10.1186/s42492-024-00181-8","DOIUrl":"10.1186/s42492-024-00181-8","url":null,"abstract":"<p><p>The vision transformer (ViT) architecture, with its attention mechanism based on multi-head attention layers, has been widely adopted in various computer-aided diagnosis tasks due to its effectiveness in processing medical image information. ViTs are notably recognized for their complex architecture, which requires high-performance GPUs or CPUs for efficient model training and deployment in real-world medical diagnostic devices. This renders them more intricate than convolutional neural networks (CNNs). This difficulty is also challenging in the context of histopathology image analysis, where the images are both limited and complex. In response to these challenges, this study proposes a TokenMixer hybrid-architecture that combines the strengths of CNNs and ViTs. This hybrid architecture aims to enhance feature extraction and classification accuracy with shorter training time and fewer parameters by minimizing the number of input patches employed during training, while incorporating tokenization of input patches using convolutional layers and encoder transformer layers to process patches across all network layers for fast and accurate breast cancer tumor subtype classification. The TokenMixer mechanism is inspired by the ConvMixer and TokenLearner models. First, the ConvMixer model dynamically generates spatial attention maps using convolutional layers, enabling the extraction of patches from input images to minimize the number of input patches used in training. Second, the TokenLearner model extracts relevant regions from the selected input patches, tokenizes them to improve feature extraction, and trains all tokenized patches in an encoder transformer network. We evaluated the TokenMixer model on the BreakHis public dataset, comparing it with ViT-based and other state-of-the-art methods. Our approach achieved impressive results for both binary and multi-classification of breast cancer subtypes across various magnification levels (40×, 100×, 200×, 400×). The model demonstrated accuracies of 97.02% for binary classification and 93.29% for multi-classification, with decision times of 391.71 and 1173.56 s, respectively. These results highlight the potential of our hybrid deep ViT-CNN architecture for advancing tumor classification in histopathological images. The source code is accessible: https://github.com/abimouloud/TokenMixer .</p>","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art","volume":"8 1","pages":"1"},"PeriodicalIF":3.2,"publicationDate":"2025-01-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11711433/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142956033","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Semi-supervised contour-driven broad learning system for autonomous segmentation of concealed prohibited baggage items.
IF 3.2 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Vol. 7(1): 30 | Pub Date: 2024-12-24 | DOI: 10.1186/s42492-024-00182-7
Divya Velayudhan, Abdelfatah Ahmed, Taimur Hassan, Muhammad Owais, Neha Gour, Mohammed Bennamoun, Ernesto Damiani, Naoufel Werghi

With the exponential rise in global air traffic, ensuring swift passenger processing while countering potential security threats has become a paramount concern for aviation security. Although X-ray baggage monitoring is now standard, manual screening has several limitations, including a propensity for errors, and raises concerns about passenger privacy. To address these drawbacks, researchers have leveraged recent advances in deep learning to design threat-segmentation frameworks. However, these models require extensive training data with labour-intensive dense pixel-wise annotations and must be fine-tuned separately for each dataset to account for inter-dataset discrepancies. Hence, this study proposes a semi-supervised contour-driven broad learning system (BLS) for X-ray baggage security threat instance segmentation, referred to as C-BLX. The methodology enhances representation learning and achieves faster training, tackling severe occlusion and class imbalance with a single training routine on limited baggage scans. The proposed framework is trained with minimal supervision, using resource-efficient image-level labels to localize illegal items in multi-vendor baggage scans. More specifically, the framework generates candidate region segments from the input X-ray scans based on local intensity-transition cues, effectively identifying concealed prohibited items without requiring entire baggage scans. The multi-convolutional BLS exploits the rich complementary features extracted from these region segments to predict object categories, covering both threat and benign classes. The contours corresponding to region segments predicted as threats are then used to yield the segmentation results. The proposed C-BLX system was thoroughly evaluated on three highly imbalanced public datasets and surpassed other competitive approaches in baggage-threat segmentation, yielding mIoU scores of 90.04%, 78.92%, and 59.44% on GDXray, SIXray, and Compass-XP, respectively. Furthermore, the limitations of the proposed system in extracting precise region segments in intricate, noisy settings were explored, along with potential post-processing strategies for overcoming them (source code will be available at https://github.com/Divs1159/CNN_BLS ).
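The recognition stage rests on a broad learning system: a flat network with random feature and enhancement nodes and a closed-form ridge readout. The sketch below shows a minimal BLS classifier of that kind, assuming each candidate region segment has already been reduced to a fixed-length feature vector; the node counts, ridge penalty, and synthetic data are illustrative, not the C-BLX configuration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 64))          # stand-in region-segment feature vectors
y = rng.integers(0, 3, size=200)        # stand-in labels (threat classes + benign)
Y = np.eye(3)[y]                        # one-hot targets

def bls_fit(X, Y, n_feature=40, n_enhance=60, lam=1e-2):
    Wf = rng.normal(size=(X.shape[1], n_feature))   # random feature-node weights
    Z = np.tanh(X @ Wf)                             # mapped feature nodes
    We = rng.normal(size=(n_feature, n_enhance))
    H = np.tanh(Z @ We)                             # enhancement nodes
    A = np.hstack([Z, H])
    # Closed-form ridge readout: W = (A^T A + lam I)^-1 A^T Y
    W = np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ Y)
    return Wf, We, W

def bls_predict(model, X):
    Wf, We, W = model
    Z = np.tanh(X @ Wf)
    A = np.hstack([Z, np.tanh(Z @ We)])
    return (A @ W).argmax(axis=1)

model = bls_fit(X, Y)
print("training accuracy:", (bls_predict(model, X) == y).mean())
```

The closed-form readout is what makes BLS training fast relative to backpropagated deep networks, which is the property the abstract emphasizes.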

{"title":"Semi-supervised contour-driven broad learning system for autonomous segmentation of concealed prohibited baggage items.","authors":"Divya Velayudhan, Abdelfatah Ahmed, Taimur Hassan, Muhammad Owais, Neha Gour, Mohammed Bennamoun, Ernesto Damiani, Naoufel Werghi","doi":"10.1186/s42492-024-00182-7","DOIUrl":"10.1186/s42492-024-00182-7","url":null,"abstract":"<p><p>With the exponential rise in global air traffic, ensuring swift passenger processing while countering potential security threats has become a paramount concern for aviation security. Although X-ray baggage monitoring is now standard, manual screening has several limitations, including the propensity for errors, and raises concerns about passenger privacy. To address these drawbacks, researchers have leveraged recent advances in deep learning to design threat-segmentation frameworks. However, these models require extensive training data and labour-intensive dense pixel-wise annotations and are finetuned separately for each dataset to account for inter-dataset discrepancies. Hence, this study proposes a semi-supervised contour-driven broad learning system (BLS) for X-ray baggage security threat instance segmentation referred to as C-BLX. The research methodology involved enhancing representation learning and achieving faster training capability to tackle severe occlusion and class imbalance using a single training routine with limited baggage scans. The proposed framework was trained with minimal supervision using resource-efficient image-level labels to localize illegal items in multi-vendor baggage scans. More specifically, the framework generated candidate region segments from the input X-ray scans based on local intensity transition cues, effectively identifying concealed prohibited items without entire baggage scans. The multi-convolutional BLS exploits the rich complementary features extracted from these region segments to predict object categories, including threat and benign classes. The contours corresponding to the region segments predicted as threats were then utilized to yield the segmentation results. The proposed C-BLX system was thoroughly evaluated on three highly imbalanced public datasets and surpassed other competitive approaches in baggage-threat segmentation, yielding 90.04%, 78.92%, and 59.44% in terms of mIoU on GDXray, SIXray, and Compass-XP, respectively. Furthermore, the limitations of the proposed system in extracting precise region segments in intricate noisy settings and potential strategies for overcoming them through post-processing techniques were explored (source code will be available at https://github.com/Divs1159/CNN_BLS .).</p>","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art","volume":"7 1","pages":"30"},"PeriodicalIF":3.2,"publicationDate":"2024-12-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11666859/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142883119","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Energy consumption forecasting for laser manufacturing of large artifacts based on fusionable transfer learning.
IF 3.2 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Vol. 7(1): 29 | Pub Date: 2024-12-02 | DOI: 10.1186/s42492-024-00178-3
Linxuan Wang, Jinghua Xu, Shuyou Zhang, Jianrong Tan, Shaomei Fei, Xuezhi Shi, Jihong Pang, Sheng Luo

This study presents an energy consumption (EC) forecasting method for laser melting manufacturing of metal artifacts based on fusionable transfer learning (FTL). To predict the EC of manufactured products, particularly from scale-down to scale-up, a general paradigm was first developed by categorizing the overall process into three main sub-steps. The operating electrical power was formulated as a combinatorial function, and an operator learning network was adopted to fit the nonlinear relations between the fabrication parameters and EC. Parallel-arranged networks were constructed to investigate the impacts of fabrication variables and devices on power. Considering the interconnections among these factors, the outputs of the neural networks were blended and fused to jointly predict the electrical power. Most innovatively, inspired by large language models, large artifacts can be decomposed into time-dependent laser-scanning trajectories, which can in turn be transformed into fusionable information via neural networks. Accordingly, transfer learning can handle either scale-down or scale-up forecasting, that is, FTL with scalability across artifact structures. The effectiveness of the proposed FTL was verified through physical fabrication experiments using laser powder bed fusion. The relative error of the average and overall EC predictions based on FTL remained below 0.83%. The melting fusion quality was examined using metallographic diagrams. The proposed FTL framework can forecast the EC of scaled structures, which is particularly helpful for price estimation and quotation of large metal products in the context of carbon peaking and carbon neutrality.
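As a structural illustration of the parallel-arranged networks described above, the PyTorch sketch below fuses one branch for fabrication variables with one for device descriptors to regress operating power, then integrates power over the scan time to obtain EC. All dimensions, the concatenation-based fusion, and the time step are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class FusedPowerNet(nn.Module):
    def __init__(self, n_fab=6, n_dev=4, hidden=32):
        super().__init__()
        self.fab = nn.Sequential(nn.Linear(n_fab, hidden), nn.ReLU())  # fabrication branch
        self.dev = nn.Sequential(nn.Linear(n_dev, hidden), nn.ReLU())  # device branch
        self.head = nn.Linear(2 * hidden, 1)                           # blend the two branches

    def forward(self, fab_vars, dev_vars):
        z = torch.cat([self.fab(fab_vars), self.dev(dev_vars)], dim=-1)
        return self.head(z).squeeze(-1)      # predicted electrical power per step

net = FusedPowerNet()
fab = torch.randn(8, 6)                      # e.g., laser power, scan speed, layer height, ...
dev = torch.randn(8, 4)                      # e.g., machine/state descriptors
power = net(fab, dev)                        # power along 8 trajectory steps
dt = 0.5                                     # assumed seconds per trajectory step
energy = (power * dt).sum()                  # EC as a discrete time integral of power
print(power.shape, float(energy))
```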

{"title":"Energy consumption forecasting for laser manufacturing of large artifacts based on fusionable transfer learning.","authors":"Linxuan Wang, Jinghua Xu, Shuyou Zhang, Jianrong Tan, Shaomei Fei, Xuezhi Shi, Jihong Pang, Sheng Luo","doi":"10.1186/s42492-024-00178-3","DOIUrl":"10.1186/s42492-024-00178-3","url":null,"abstract":"<p><p>This study presents an energy consumption (EC) forecasting method for laser melting manufacturing of metal artifacts based on fusionable transfer learning (FTL). To predict the EC of manufacturing products, particularly from scale-down to scale-up, a general paradigm was first developed by categorizing the overall process into three main sub-steps. The operating electrical power was further formulated as a combinatorial function, based on which an operator learning network was adopted to fit the nonlinear relations between the fabricating arguments and EC. Parallel-arranged networks were constructed to investigate the impacts of fabrication variables and devices on power. Considering the interconnections among these factors, the outputs of the neural networks were blended and fused to jointly predict the electrical power. Most innovatively, large artifacts can be decomposed into time-dependent laser-scanning trajectories, which can be further transformed into fusionable information via neural networks, inspired by large language model. Accordingly, transfer learning can deal with either scale-down or scale-up forecasting, namely, FTL with scalability within artifact structures. The effectiveness of the proposed FTL was verified through physical fabrication experiments via laser powder bed fusion. The relative error of the average and overall EC predictions based on FTL was maintained below 0.83%. The melting fusion quality was examined using metallographic diagrams. The proposed FTL framework can forecast the EC of scaled structures, which is particularly helpful in price estimation and quotation of large metal products towards carbon peaking and carbon neutrality.</p>","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art","volume":"7 1","pages":"29"},"PeriodicalIF":3.2,"publicationDate":"2024-12-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11612079/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142772951","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Computational analysis of variability and uncertainty in the clinical reference on magnetic resonance imaging radiomics: modelling and performance.
IF 3.2 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Vol. 7(1): 28 | Pub Date: 2024-11-19 | DOI: 10.1186/s42492-024-00180-9
Cindy Xue, Jing Yuan, Gladys G Lo, Darren M C Poon, Winnie Cw Chu

To conduct a computational investigation exploring the influence of clinical reference uncertainty on magnetic resonance imaging (MRI) radiomics feature selection, modelling, and performance. This study used two sets of publicly available prostate cancer MRI radiomics data (Dataset 1: n = 260; Dataset 2: n = 100) with Gleason-score clinical references. Each dataset was divided into training and hold-out testing sets at a ratio of 7:3 and analysed independently. The clinical references of the training set were permuted at different levels (in increments of 5%), with each level repeated 20 times. Four feature selection algorithms and two classifiers were used to construct the models. Cross-validation was employed for training, while the separate hold-out testing set was used for evaluation. The Jaccard similarity coefficient was used to evaluate feature selection, while the area under the curve (AUC) and accuracy were used to assess model performance. An analysis-of-variance test with Bonferroni correction was conducted to compare the metrics of each model. The consistency of the feature selection performance decreased substantially with the clinical reference permutation. AUCs of models trained on permuted references, particularly beyond 20% permutation, were significantly lower (Dataset 1 (≥ 20% permutation): 0.67; Dataset 2 (≥ 20% permutation): 0.74) than those of models without permutation (Dataset 1: 0.94; Dataset 2: 0.97). Model performance was also associated with larger uncertainties as the number of permuted clinical references increased. Clinical reference uncertainty can substantially influence MRI radiomic feature selection and modelling. Highly accurate clinical references are helpful for building reliable and robust radiomic models. Careful interpretation of model performance is necessary, particularly for high-dimensional data.
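The permutation protocol translates directly into code. The sketch below reproduces its shape, with labels permuted in 5% increments and 20 repeats per level, feature-selection stability scored by the Jaccard coefficient, and models by AUC; the synthetic data, top-k ANOVA selector, and logistic regression are stand-ins for the study's datasets, selectors, and classifiers.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=260, n_features=100, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
rng = np.random.default_rng(0)
base_idx = set(SelectKBest(f_classif, k=10).fit(X_tr, y_tr).get_support(indices=True))

for level in np.arange(0.05, 0.45, 0.05):            # fraction of references permuted
    jac, auc = [], []
    for _ in range(20):                              # 20 repeats per level
        y_perm = y_tr.copy()
        flip = rng.choice(len(y_perm), int(level * len(y_perm)), replace=False)
        y_perm[flip] = rng.permutation(y_perm[flip]) # shuffle the chosen references
        sel = SelectKBest(f_classif, k=10).fit(X_tr, y_perm)
        idx = set(sel.get_support(indices=True))
        jac.append(len(idx & base_idx) / len(idx | base_idx))  # selection stability
        clf = LogisticRegression(max_iter=1000).fit(sel.transform(X_tr), y_perm)
        auc.append(roc_auc_score(y_te, clf.decision_function(sel.transform(X_te))))
    print(f"{level:.0%}: Jaccard={np.mean(jac):.2f}  AUC={np.mean(auc):.2f}")
```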

{"title":"Computational analysis of variability and uncertainty in the clinical reference on magnetic resonance imaging radiomics: modelling and performance.","authors":"Cindy Xue, Jing Yuan, Gladys G Lo, Darren M C Poon, Winnie Cw Chu","doi":"10.1186/s42492-024-00180-9","DOIUrl":"10.1186/s42492-024-00180-9","url":null,"abstract":"<p><p>To conduct a computational investigation to explore the influence of clinical reference uncertainty on magnetic resonance imaging (MRI) radiomics feature selection, modelling, and performance. This study used two sets of publicly available prostate cancer MRI = radiomics data (Dataset 1: n = 260; Dataset 2: n = 100) with Gleason score clinical references. Each dataset was divided into training and holdout testing datasets at a ratio of 7:3 and analysed independently. The clinical references of the training set were permuted at different levels (increments of 5%) and repeated 20 times. Four feature selection algorithms and two classifiers were used to construct the models. Cross-validation was employed for training, while a separate hold-out testing set was used for evaluation. The Jaccard similarity coefficient was used to evaluate feature selection, while the area under the curve (AUC) and accuracy were used to assess model performance. An analysis of variance test with Bonferroni correction was conducted to compare the metrics of each model. The consistency of the feature selection performance decreased substantially with the clinical reference permutation. AUCs of the trained models with permutation particularly after 20% were significantly lower (Dataset 1 (with ≥ 20% permutation): 0.67, and Dataset 2 (≥ 20% permutation): 0.74), compared to the AUC of models without permutation (Dataset 1: 0.94, Dataset 2: 0.97). The performances of the models were also associated with larger uncertainties and an increasing number of permuted clinical references. Clinical reference uncertainty can substantially influence MRI radiomic feature selection and modelling. The high accuracy of clinical references should be helpful in building reliable and robust radiomic models. Careful interpretation of the model performance is necessary, particularly for high-dimensional data.</p>","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art","volume":"7 1","pages":"28"},"PeriodicalIF":3.2,"publicationDate":"2024-11-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11573982/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142669232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Survey of real-time brainmedia in artistic exploration.
IF 3.2 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Vol. 7(1): 27 | Pub Date: 2024-11-18 | DOI: 10.1186/s42492-024-00179-2
Rem RunGu Lin, Kang Zhang

This survey examines the evolution and impact of real-time brainmedia on artistic exploration, contextualizing developments within a historical framework. To enhance knowledge on the entanglement between the brain, mind, and body in an increasingly mediated world, this work defines a clear scope at the intersection of bio art and interactive art, concentrating on real-time brainmedia artworks developed in the 21st century. It proposes a set of criteria and a taxonomy based on historical notions, interaction dynamics, and media art representations. The goal is to provide a comprehensive overview of real-time brainmedia, setting the stage for future explorations of new paradigms in communication between humans, machines, and the environment.

{"title":"Survey of real-time brainmedia in artistic exploration.","authors":"Rem RunGu Lin, Kang Zhang","doi":"10.1186/s42492-024-00179-2","DOIUrl":"10.1186/s42492-024-00179-2","url":null,"abstract":"<p><p>This survey examines the evolution and impact of real-time brainmedia on artistic exploration, contextualizing developments within a historical framework. To enhance knowledge on the entanglement between the brain, mind, and body in an increasingly mediated world, this work defines a clear scope at the intersection of bio art and interactive art, concentrating on real-time brainmedia artworks developed in the 21st century. It proposes a set of criteria and a taxonomy based on historical notions, interaction dynamics, and media art representations. The goal is to provide a comprehensive overview of real-time brainmedia, setting the stage for future explorations of new paradigms in communication between humans, machines, and the environment.</p>","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art","volume":"7 1","pages":"27"},"PeriodicalIF":3.2,"publicationDate":"2024-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11570570/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142649143","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Achieving view-distance and -angle invariance in motion prediction using a simple network.
IF 3.2 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Vol. 7(1): 26 | Pub Date: 2024-10-28 | DOI: 10.1186/s42492-024-00176-5
Haichuan Zhao, Xudong Ru, Peng Du, Shaolong Liu, Na Liu, Xingce Wang, Zhongke Wu

Recently, human motion prediction has gained significant attention and achieved notable success. However, current methods primarily rely on training and testing with ideal datasets, overlooking variations in viewing distance and viewing angle that are common in practical scenarios. In this study, we address model invariance, ensuring robust performance despite such variations. To achieve this, we employ Riemannian geometry methods to constrain the learning process of the neural network, enabling invariant prediction with a simple network and broadening the applicability of motion prediction. Our framework uses Riemannian geometry to encode motion into a novel motion space, achieving prediction that is invariant to viewing distance and angle with a simple network. Specifically, a specified path transport square-root velocity function is proposed to help remove the view-angle equivalence class and encode motion sequences into a flattened space. Encoding motion geometrically linearizes the optimization problem posed in the non-flattened space and effectively extracts motion information, allowing the proposed method to achieve competitive performance with a simple network. Experimental results on Human3.6M and CMU MoCap demonstrate that the proposed framework achieves competitive performance and invariance to viewing distance and viewing angle.
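At the core of such encodings is the square-root velocity function (SRVF), q(t) = v(t)/√‖v(t)‖, which maps curves into a flattened space where rotations act linearly. The numpy sketch below computes the SRVF of a toy trajectory and checks its equivariance under a global rotation; the specified path transport step from the paper is not reproduced, and the toy curve is an assumption.

```python
import numpy as np

def srvf(traj, dt=1.0):
    """traj: (T, D) joint trajectory; returns its (T-1, D) SRVF."""
    v = np.diff(traj, axis=0) / dt                     # finite-difference velocity
    speed = np.linalg.norm(v, axis=1, keepdims=True)
    return v / np.sqrt(np.maximum(speed, 1e-8))        # guard against zero velocity

t = np.linspace(0, 1, 50)[:, None]
curve = np.hstack([np.cos(2 * np.pi * t), np.sin(2 * np.pi * t), t])  # toy 3D motion

q = srvf(curve)
# Under a global rotation R, the SRVF rotates the same way, so view-angle
# effects can be removed by alignment in q-space:
R = np.linalg.qr(np.random.default_rng(0).normal(size=(3, 3)))[0]  # random orthogonal R
print(np.allclose(srvf(curve @ R.T), q @ R.T))                     # True
```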

{"title":"Achieving view-distance and -angle invariance in motion prediction using a simple network.","authors":"Haichuan Zhao, Xudong Ru, Peng Du, Shaolong Liu, Na Liu, Xingce Wang, Zhongke Wu","doi":"10.1186/s42492-024-00176-5","DOIUrl":"10.1186/s42492-024-00176-5","url":null,"abstract":"<p><p>Recently, human motion prediction has gained significant attention and achieved notable success. However, current methods primarily rely on training and testing with ideal datasets, overlooking the impact of variations in the viewing distance and viewing angle, which are commonly encountered in practical scenarios. In this study, we address the issue of model invariance by ensuring robust performance despite variations in view distances and angles. To achieve this, we employed Riemannian geometry methods to constrain the learning process of neural networks, enabling the prediction of invariances using a simple network. Furthermore, this enhances the application of motion prediction in various scenarios. Our framework uses Riemannian geometry to encode motion into a novel motion space to achieve prediction with an invariant viewing distance and angle using a simple network. Specifically, the specified path transport square-root velocity function is proposed to aid in removing the view-angle equivalence class and encode motion sequences into a flattened space. Motion coding by the geometry method linearizes the optimization problem in a non-flattened space and effectively extracts motion information, allowing the proposed method to achieve competitive performance using a simple network. Experimental results on Human 3.6M and CMU MoCap demonstrate that the proposed framework has competitive performance and invariance to the viewing distance and viewing angle.</p>","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art","volume":"7 1","pages":"26"},"PeriodicalIF":3.2,"publicationDate":"2024-10-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11519277/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142523255","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Real-time volume rendering for three-dimensional fetal ultrasound using volumetric photon mapping.
IF 3.2 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Vol. 7(1): 25 | Pub Date: 2024-10-25 | DOI: 10.1186/s42492-024-00177-4
Jing Zou, Jing Qin

Three-dimensional (3D) fetal ultrasound is widely used in prenatal examinations. Realistic, real-time volume rendering of ultrasound data can enhance diagnostic effectiveness and help obstetricians and pregnant mothers communicate. However, this remains a challenging task because (1) ultrasound images contain a large amount of speckle noise and (2) they usually have low contrast, making it difficult to distinguish different tissues and organs. Traditional local-illumination-based methods do not achieve satisfactory results, and the real-time requirement makes the task even more challenging. This study presents a novel real-time volume-rendering method equipped with a global illumination model for 3D fetal ultrasound visualization. The method renders direct and indirect illumination separately by calculating single-scattering and multiple-scattering radiances, respectively. The indirect illumination effect is simulated using volumetric photon mapping, and each photon's brightness is calculated using a novel screen-space density estimation, which avoids complicated storage structures and accelerates computation. A high-dynamic-range approach is also proposed to address fetal skin whose dynamic range exceeds that of the display device. Experiments show that, compared with conventional methodologies, our technique generates realistic rendering results with far more depth information.
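For a sense of how volumetric photon mapping yields the multiple-scattering radiance, the sketch below random-walks photons through a unit-cube scattering volume, deposits their energy at scattering events, and estimates radiance at a query point by k-nearest-neighbour density estimation. The screen-space density estimation and high-dynamic-range mapping from the paper are omitted, and all coefficients are toy values.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma_s, sigma_t = 0.6, 1.0          # scattering / extinction coefficients (toy)
deposits, powers = [], []

for _ in range(2000):                # trace photons from a directional light
    pos = np.array([0.5, 0.5, 0.0])
    dirn = np.array([0.0, 0.0, 1.0])
    power = 1.0
    for _ in range(8):               # bounded number of scattering events
        d = -np.log(rng.random()) / sigma_t          # sample free-flight distance
        pos = pos + d * dirn
        if not np.all((0 <= pos) & (pos <= 1)):      # photon left the volume
            break
        deposits.append(pos)
        powers.append(power)
        power *= sigma_s / sigma_t                   # survival weight per scatter
        dirn = rng.normal(size=3)                    # isotropic phase function
        dirn /= np.linalg.norm(dirn)

deposits, powers = np.array(deposits), np.array(powers)

def radiance_estimate(x, k=50):
    """kNN density estimate of in-scattered radiance at point x."""
    r2 = ((deposits - x) ** 2).sum(axis=1)
    idx = np.argsort(r2)[:k]
    r = np.sqrt(r2[idx].max())                       # radius of the k-th neighbour
    return powers[idx].sum() / ((4 / 3) * np.pi * r ** 3)

print(radiance_estimate(np.array([0.5, 0.5, 0.3])))
```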

{"title":"Real-time volume rendering for three-dimensional fetal ultrasound using volumetric photon mapping.","authors":"Jing Zou, Jing Qin","doi":"10.1186/s42492-024-00177-4","DOIUrl":"https://doi.org/10.1186/s42492-024-00177-4","url":null,"abstract":"<p><p>Three-dimensional (3D) fetal ultrasound has been widely used in prenatal examinations. Realistic and real-time volumetric ultrasound volume rendering can enhance the effectiveness of diagnoses and assist obstetricians and pregnant mothers in communicating. However, this remains a challenging task because (1) there is a large amount of speckle noise in ultrasound images and (2) ultrasound images usually have low contrasts, making it difficult to distinguish different tissues and organs. However, traditional local-illumination-based methods do not achieve satisfactory results. This real-time requirement makes the task increasingly challenging. This study presents a novel real-time volume-rendering method equipped with a global illumination model for 3D fetal ultrasound visualization. This method can render direct illumination and indirect illumination separately by calculating single scattering and multiple scattering radiances, respectively. The indirect illumination effect was simulated using volumetric photon mapping. Calculating each photon's brightness is proposed using a novel screen-space destiny estimation to avoid complicated storage structures and accelerate computation. This study proposes a high dynamic range approach to address the issue of fetal skin with a dynamic range exceeding that of the display device. Experiments show that our technology, compared to conventional methodologies, can generate realistic rendering results with far more depth information.</p>","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art","volume":"7 1","pages":"25"},"PeriodicalIF":3.2,"publicationDate":"2024-10-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11511803/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142509383","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Noise suppression in photon-counting computed tomography using unsupervised Poisson flow generative models.
IF 3.2 | CAS Tier 4, Computer Science | Q2 COMPUTER SCIENCE, INTERDISCIPLINARY APPLICATIONS | Vol. 7(1): 24 | Pub Date: 2024-09-23 | DOI: 10.1186/s42492-024-00175-6
Dennis Hein, Staffan Holmin, Timothy Szczykutowicz, Jonathan S Maltz, Mats Danielsson, Ge Wang, Mats Persson

Deep learning (DL) has proven important for computed tomography (CT) image denoising. However, such models are usually trained under supervision, requiring paired data that may be difficult to obtain in practice. Diffusion models offer an unsupervised means of solving a wide range of inverse problems via posterior sampling. In particular, using the estimated unconditional score function of the prior distribution, obtained via unsupervised learning, one can sample from the desired posterior via hijacking and regularization. However, owing to the iterative solvers used, the number of function evaluations (NFE) required may be orders of magnitude larger than for single-step samplers. In this paper, we present a novel image-denoising technique for photon-counting CT by extending the unsupervised approach to inverse problem solving to the case of Poisson flow generative models (PFGM++). By hijacking and regularizing the sampling process, we obtain a single-step sampler, that is, NFE = 1. Our proposed method incorporates posterior sampling using diffusion models as a special case. We demonstrate that the added robustness afforded by the PFGM++ framework yields significant performance gains. Our results indicate performance competitive with popular supervised techniques (including state-of-the-art diffusion-style models with NFE = 1, i.e., consistency models), unsupervised techniques, and non-DL-based image denoising techniques on clinical low-dose CT data and clinical images from a prototype photon-counting CT system developed by GE HealthCare.
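A heavily simplified sketch of the hijack-and-regularize idea: rather than starting the sampler from pure noise, start from the noisy image itself at a matched noise scale and take a single denoising step (NFE = 1), blended back toward the measurement for data consistency. The denoiser below is a trivial stand-in; in practice it would be the trained PFGM++ network, and both `sigma_hijack` and `lam` are assumed knobs, not values from the paper.

```python
import torch

def denoiser(x, sigma):
    # Stand-in for a trained score/consistency network D(x; sigma).
    return x / (1.0 + sigma ** 2)

def one_step_posterior_sample(y_noisy, sigma_hijack=0.5, lam=0.3):
    """Single-NFE sampler: hijack the trajectory at sigma_hijack, then
    regularize the estimate toward the measurement y_noisy."""
    x0 = denoiser(y_noisy, sigma_hijack)      # one network evaluation (NFE = 1)
    return (1 - lam) * x0 + lam * y_noisy     # data-consistency blend

y = torch.randn(1, 1, 64, 64)                 # stand-in low-dose CT slice
x_hat = one_step_posterior_sample(y)
print(x_hat.shape)                            # torch.Size([1, 1, 64, 64])
```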

{"title":"Noise suppression in photon-counting computed tomography using unsupervised Poisson flow generative models.","authors":"Dennis Hein, Staffan Holmin, Timothy Szczykutowicz, Jonathan S Maltz, Mats Danielsson, Ge Wang, Mats Persson","doi":"10.1186/s42492-024-00175-6","DOIUrl":"10.1186/s42492-024-00175-6","url":null,"abstract":"<p><p>Deep learning (DL) has proven to be important for computed tomography (CT) image denoising. However, such models are usually trained under supervision, requiring paired data that may be difficult to obtain in practice. Diffusion models offer unsupervised means of solving a wide range of inverse problems via posterior sampling. In particular, using the estimated unconditional score function of the prior distribution, obtained via unsupervised learning, one can sample from the desired posterior via hijacking and regularization. However, due to the iterative solvers used, the number of function evaluations (NFE) required may be orders of magnitudes larger than for single-step samplers. In this paper, we present a novel image denoising technique for photon-counting CT by extending the unsupervised approach to inverse problem solving to the case of Poisson flow generative models (PFGM)++. By hijacking and regularizing the sampling process we obtain a single-step sampler, that is NFE = 1. Our proposed method incorporates posterior sampling using diffusion models as a special case. We demonstrate that the added robustness afforded by the PFGM++ framework yields significant performance gains. Our results indicate competitive performance compared to popular supervised, including state-of-the-art diffusion-style models with NFE = 1 (consistency models), unsupervised, and non-DL-based image denoising techniques, on clinical low-dose CT data and clinical images from a prototype photon-counting CT system developed by GE HealthCare.</p>","PeriodicalId":29931,"journal":{"name":"Visual Computing for Industry Biomedicine and Art","volume":"7 1","pages":"24"},"PeriodicalIF":3.2,"publicationDate":"2024-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11420411/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142297060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0