Relationship Between Display Pixel Structure and Gloss Perception
Pub Date: 2026-02-09 | DOI: 10.3390/jimaging12020071
Kosei Aketagawa, Midori Tanaka, Takahiko Horiuchi
The demand for accurate representation of gloss perception, which significantly contributes to the impression and evaluation of objects, is increasing owing to recent advancements in display technology enabling high-definition visual reproduction. This study experimentally analyzes the influence of display pixel structure on gloss perception. In a visual evaluation experiment using natural images, gloss perception was assessed across six types of stimuli: three subpixel arrays (RGB, RGBW, and PenTile RGBG) combined with two pixel-aperture ratios (100% and 50%). The experimental results statistically confirmed that regardless of pixel-aperture ratio, the RGB subpixel array was perceived as exhibiting the strongest gloss. Furthermore, cluster analysis of observers revealed individual differences in the effect of pixel structure on gloss perception. Additionally, gloss classification and image feature analysis suggested that the magnitude of pixel structure influence varies depending on the frequency components contained in the images. Moreover, analysis using a generalized linear mixed model supported the superiority of the RGB subpixel array even when accounting for variability across observers and natural images.
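The mixed-model analysis can be illustrated with a short sketch. Below is a minimal fit in Python with statsmodels, treating observers and images as crossed random effects via variance components; the column names (gloss, array, aperture, observer, image), the data file, and the linear (Gaussian) approximation of the paper's GLMM are all assumptions for illustration.

```python
# Sketch of a mixed-model analysis like the one described above, using
# statsmodels. Column names and the CSV file are hypothetical; the paper's
# exact GLMM family/link is not specified here, so a linear model stands in.
import pandas as pd
import statsmodels.formula.api as smf

data = pd.read_csv("gloss_ratings.csv")  # hypothetical: one row per rating

# Crossed random effects (observer and image) via variance components,
# using the standard single-group construction in statsmodels.
data["all"] = 1
vcf = {"observer": "0 + C(observer)", "image": "0 + C(image)"}

model = smf.mixedlm(
    "gloss ~ C(array) * C(aperture)",  # fixed effects: subpixel array x aperture
    data,
    groups="all",
    vc_formula=vcf,
)
result = model.fit()
print(result.summary())  # a positive RGB contrast would mirror the reported finding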
{"title":"Relationship Between Display Pixel Structure and Gloss Perception.","authors":"Kosei Aketagawa, Midori Tanaka, Takahiko Horiuchi","doi":"10.3390/jimaging12020071","DOIUrl":"10.3390/jimaging12020071","url":null,"abstract":"<p><p>The demand for accurate representation of gloss perception, which significantly contributes to the impression and evaluation of objects, is increasing owing to recent advancements in display technology enabling high-definition visual reproduction. This study experimentally analyzes the influence of display pixel structure on gloss perception. In a visual evaluation experiment using natural images, gloss perception was assessed across six types of stimuli: three subpixel arrays (RGB, RGBW, and PenTile RGBG) combined with two pixel-aperture ratios (100% and 50%). The experimental results statistically confirmed that regardless of pixel-aperture ratio, the RGB subpixel array was perceived as exhibiting the strongest gloss. Furthermore, cluster analysis of observers revealed individual differences in the effect of pixel structure on gloss perception. Additionally, gloss classification and image feature analysis suggested that the magnitude of pixel structure influence varies depending on the frequency components contained in the images. Moreover, analysis using a generalized linear mixed model supported the superiority of the RGB subpixel array even when accounting for variability across observers and natural images.</p>","PeriodicalId":37035,"journal":{"name":"Journal of Imaging","volume":"12 2","pages":""},"PeriodicalIF":2.7,"publicationDate":"2026-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12942264/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147291171","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Topic-Modeling Guided Semantic Clustering for Enhancing CNN-Based Image Classification Using Scale-Invariant Feature Transform and Block Gabor Filtering
Pub Date: 2026-02-09 | DOI: 10.3390/jimaging12020070
Natthaphong Suthamno, Jessada Tanthanuch
This study proposes a topic-modeling-guided framework that enhances image classification by introducing semantic clustering prior to CNN training. Images are processed through two key-point extraction pipelines, Scale-Invariant Feature Transform (SIFT) with Sobel edge detection and Block Gabor Filtering (BGF), to obtain local feature descriptors. These descriptors are clustered using K-means to build a visual vocabulary, and Bag-of-Words histograms then represent each image as a visual document. Latent Dirichlet Allocation is applied to uncover latent semantic topics, generating coherent image clusters. Cluster-specific CNN models, including AlexNet, GoogLeNet, and several ResNet variants, are trained under identical conditions to identify the most suitable architecture for each cluster. Two topic-guided integration strategies, the Maximum Proportion Topic (MPT) and the Weight Proportion Topic (WPT), are then used to assign test images to the corresponding specialized model. Experimental results show that both the SIFT-based and BGF-based pipelines outperform non-clustered CNN models and a baseline method using Incremental PCA, K-means, Same-Cluster Prediction, and unweighted Ensemble Voting. The SIFT pipeline achieves the highest accuracy of 95.24% with the MPT strategy, while the BGF pipeline achieves 93.76% with the WPT strategy. These findings confirm that the semantic structure introduced through topic modeling substantially improves CNN classification performance.
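As a rough sketch of the clustering stage, the following Python snippet builds a SIFT visual vocabulary with K-means, forms Bag-of-Words histograms, and derives LDA topic proportions supporting both MPT (argmax topic) and WPT (topic weights) assignment. The vocabulary size, topic count, and the omission of the Sobel and BGF preprocessing are assumptions, not the paper's exact settings.

```python
# Minimal sketch of the SIFT -> K-means vocabulary -> Bag-of-Words -> LDA
# clustering stage described above. k and n_topics are placeholders.
import cv2
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import LatentDirichletAllocation

def lda_clusters(gray_images, k=200, n_topics=10):
    sift = cv2.SIFT_create()
    per_image = []
    for img in gray_images:
        _, desc = sift.detectAndCompute(img, None)
        per_image.append(desc if desc is not None else np.zeros((0, 128), np.float32))

    # Visual vocabulary from all descriptors pooled across images.
    kmeans = KMeans(n_clusters=k, n_init=10).fit(np.vstack(per_image))

    # One BoW histogram ("visual document") per image.
    bow = np.array([
        np.bincount(kmeans.predict(d), minlength=k) if len(d) else np.zeros(k)
        for d in per_image
    ])

    # Latent topics over visual words; MPT assigns each image its top topic.
    lda = LatentDirichletAllocation(n_components=n_topics).fit(bow)
    topic_mix = lda.transform(bow)               # per-image topic proportions (WPT weights)
    return topic_mix.argmax(axis=1), topic_mix   # MPT labels, WPT proportions
```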
{"title":"Topic-Modeling Guided Semantic Clustering for Enhancing CNN-Based Image Classification Using Scale-Invariant Feature Transform and Block Gabor Filtering.","authors":"Natthaphong Suthamno, Jessada Tanthanuch","doi":"10.3390/jimaging12020070","DOIUrl":"10.3390/jimaging12020070","url":null,"abstract":"<p><p>This study proposes a topic-modeling guided framework that enhances image classification by introducing semantic clustering prior to CNN training. Images are processed through two key-point extraction pipelines: Scale-Invariant Feature Transform (SIFT) with Sobel edge detection and Block Gabor Filtering (BGF), to obtain local feature descriptors. These descriptors are clustered using K-means to build a visual vocabulary. Bag of Words histograms then represent each image as a visual document. Latent Dirichlet Allocation is applied to uncover latent semantic topics, generating coherent image clusters. Cluster-specific CNN models, including AlexNet, GoogLeNet, and several ResNet variants, are trained under identical conditions to identify the most suitable architecture for each cluster. Two topic guided integration strategies, the Maximum Proportion Topic (MPT) and the Weight Proportion Topic (WPT), are then used to assign test images to the corresponding specialized model. Experimental results show that both the SIFT-based and BGF-based pipelines outperform non-clustered CNN models and a baseline method using Incremental PCA, K-means, Same-Cluster Prediction, and unweighted Ensemble Voting. The SIFT pipeline achieves the highest accuracy of 95.24% with the MPT strategy, while the BGF pipeline achieves 93.76% with the WPT strategy. These findings confirm that semantic structure introduced through topic modeling substantially improves CNN classification performance.</p>","PeriodicalId":37035,"journal":{"name":"Journal of Imaging","volume":"12 2","pages":""},"PeriodicalIF":2.7,"publicationDate":"2026-02-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12941444/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147291316","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
YOLO11s-UAV: An Advanced Algorithm for Small Object Detection in UAV Aerial Imagery
Pub Date: 2026-02-06 | DOI: 10.3390/jimaging12020069
Qi Mi, Jianshu Chao, Anqi Chen, Kaiyuan Zhang, Jiahua Lai
Unmanned aerial vehicles (UAVs) are now widely used in applications including agriculture, urban traffic management, and search and rescue operations. However, several challenges arise, including small objects that occupy only a few pixels in an image, complex backgrounds in aerial footage, and limited onboard computational resources. To address these issues, this paper proposes an improved UAV-based small object detection algorithm, YOLO11s-UAV, specifically designed for aerial imagery. Firstly, we introduce a novel FPN, called the Content-Aware Reassembly and Interaction Feature Pyramid Network (CARIFPN), which significantly enhances small object feature detection while reducing redundant network structures. Secondly, we apply a new downsampling convolution for small object feature extraction, called Space-to-Depth for Dilation-wise Residual Convolution (S2DResConv), in the model's backbone. This module eliminates the information loss caused by strided convolution or pooling operations and facilitates the capture of multi-scale context. Finally, we integrate a simple, parameter-free attention module (SimAM) with C3k2 to form Flexible SimAM (FlexSimAM), which is applied throughout the entire model. This improved module not only reduces the model's complexity but also enables efficient enhancement of small object features in complex scenarios. Experimental results demonstrate that on the VisDrone-DET2019 dataset, our model improves mAP@0.5 by 7.8% on the validation set (reaching 46.0%) and by 5.9% on the test set (increasing to 37.3%) compared to the baseline YOLO11s, while reducing model parameters by 55.3%. Similarly, it achieves a 7.2% improvement on the TinyPerson dataset and a 3.0% increase on UAVDT-DET. Deployment on the NVIDIA Jetson Orin NX SUPER platform shows that our model achieves 33 FPS, 21.4% lower than YOLO11s, confirming its feasibility for real-time onboard UAV applications.
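The space-to-depth idea behind S2DResConv can be sketched as a PyTorch module that rearranges 2x2 neighborhoods into channels (so downsampling discards no pixels) before channel mixing and a dilated residual branch. The layer sizes and the simplified residual design below are my assumptions, not the paper's exact module.

```python
# Sketch of the space-to-depth downsampling idea: rearrange 2x2 spatial
# neighborhoods into channels (lossless, unlike strided conv or pooling),
# then mix channels. The dilated residual branch is a simplified stand-in
# for the paper's dilation-wise residual design.
import torch
import torch.nn as nn

class S2DDown(nn.Module):
    def __init__(self, c_in, c_out):
        super().__init__()
        self.s2d = nn.PixelUnshuffle(2)           # (B, C, H, W) -> (B, 4C, H/2, W/2)
        self.mix = nn.Conv2d(4 * c_in, c_out, 1)  # channel mixing, no spatial loss
        self.res = nn.Sequential(                 # multi-scale context via dilation
            nn.Conv2d(c_out, c_out, 3, padding=2, dilation=2),
            nn.BatchNorm2d(c_out),
            nn.SiLU(),
        )

    def forward(self, x):
        y = self.mix(self.s2d(x))
        return y + self.res(y)

x = torch.randn(1, 64, 80, 80)
print(S2DDown(64, 128)(x).shape)  # torch.Size([1, 128, 40, 40])
```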
{"title":"YOLO11s-UAV: An Advanced Algorithm for Small Object Detection in UAV Aerial Imagery.","authors":"Qi Mi, Jianshu Chao, Anqi Chen, Kaiyuan Zhang, Jiahua Lai","doi":"10.3390/jimaging12020069","DOIUrl":"10.3390/jimaging12020069","url":null,"abstract":"<p><p>Unmanned aerial vehicles (UAVs) are now widely used in various applications, including agriculture, urban traffic management, and search and rescue operations. However, several challenges arise, including the small size of objects occupying only a sparse number of pixels in images, complex backgrounds in aerial footage, and limited computational resources onboard. To address these issues, this paper proposes an improved UAV-based small object detection algorithm, YOLO11s-UAV, specifically designed for aerial imagery. Firstly, we introduce a novel FPN, called Content-Aware Reassembly and Interaction Feature Pyramid Network (CARIFPN), which significantly enhances small object feature detection while reducing redundant network structures. Secondly, we apply a new downsampling convolution for small object feature extraction, called Space-to-Depth for Dilation-wise Residual Convolution (S2DResConv), in the model's backbone. This module effectively eliminates information loss caused by strided convolution or pooling operations and facilitates the capture of multi-scale context. Finally, we integrate a simple, parameter-free attention module (SimAM) with C3k2 to form Flexible SimAM (FlexSimAM), which is applied throughout the entire model. This improved module not only reduces the model's complexity but also enables efficient enhancement of small object features in complex scenarios. Experimental results demonstrate that on the VisDrone-DET2019 dataset, our model improves mAP@0.5 by 7.8% on the validation set (reaching 46.0%) and by 5.9% on the test set (increasing to 37.3%) compared to the baseline YOLO11s, while reducing model parameters by 55.3%. Similarly, it achieves a 7.2% improvement on the TinyPerson dataset and a 3.0% increase on UAVDT-DET. Deployment on the NVIDIA Jetson Orin NX SUPER platform shows that our model achieves 33 FPS, which is 21.4% lower than YOLO11s, confirming its feasibility for real-time onboard UAV applications.</p>","PeriodicalId":37035,"journal":{"name":"Journal of Imaging","volume":"12 2","pages":""},"PeriodicalIF":2.7,"publicationDate":"2026-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12942582/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147291405","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automated Radiological Report Generation from Breast Ultrasound Images Using Vision and Language Transformers
Pub Date: 2026-02-06 | DOI: 10.3390/jimaging12020068
Shaheen Khatoon, Azhar Mahmood
Breast ultrasound imaging is widely used for the detection and characterization of breast abnormalities; however, generating detailed and consistent radiological reports remains a labor-intensive and subjective process. Recent advances in deep learning have demonstrated the potential of automated report generation systems to support clinical workflows, yet most existing approaches focus on chest X-ray imaging and rely on convolutional-recurrent architectures with limited capacity to model long-range dependencies and complex clinical semantics. In this work, we propose a multimodal Transformer-based framework for automatic breast ultrasound report generation that integrates visual and textual information through cross-attention mechanisms. The proposed architecture employs a Vision Transformer (ViT) to extract rich spatial and morphological features from ultrasound images. For textual embedding, pretrained language models (BERT, BioBERT, and GPT-2) are implemented in various encoder-decoder configurations to leverage both general linguistic knowledge and domain-specific biomedical semantics. A multimodal Transformer decoder is implemented to autoregressively generate diagnostic reports by jointly attending to visual features and contextualized textual embeddings. We conducted an extensive quantitative evaluation using standard report generation metrics, including BLEU, ROUGE-L, METEOR, and CIDEr, to assess lexical accuracy, semantic alignment, and clinical relevance. Experimental results demonstrate that BioBERT-based models consistently outperform general domain counterparts in clinical specificity, while GPT-2-based decoders improve linguistic fluency.
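A minimal sketch of the decoding step is given below: a PyTorch Transformer decoder whose cross-attention memory is the sequence of ViT patch embeddings, predicting report tokens autoregressively under a causal mask. Dimensions, vocabulary size, and layer counts are placeholders rather than the paper's configuration.

```python
# Minimal sketch of the multimodal decoding step: a Transformer decoder whose
# cross-attention "memory" is the ViT patch-embedding sequence. Dimensions,
# vocab size, and the single forward pass are illustrative placeholders.
import torch
import torch.nn as nn

class ReportDecoder(nn.Module):
    def __init__(self, vocab=30522, d=768, heads=8, layers=4):
        super().__init__()
        self.embed = nn.Embedding(vocab, d)
        layer = nn.TransformerDecoderLayer(d, heads, batch_first=True)
        self.decoder = nn.TransformerDecoder(layer, layers)
        self.lm_head = nn.Linear(d, vocab)

    def forward(self, tokens, vit_feats):
        # Causal mask: each position attends only to earlier report tokens.
        t = tokens.size(1)
        mask = torch.triu(torch.full((t, t), float("-inf")), diagonal=1)
        h = self.decoder(self.embed(tokens), vit_feats, tgt_mask=mask)
        return self.lm_head(h)  # next-token logits

vit_feats = torch.randn(2, 197, 768)        # e.g., ViT-Base patch embeddings
tokens = torch.randint(0, 30522, (2, 32))   # partial report token ids
print(ReportDecoder()(tokens, vit_feats).shape)  # torch.Size([2, 32, 30522])
```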
{"title":"Automated Radiological Report Generation from Breast Ultrasound Images Using Vision and Language Transformers.","authors":"Shaheen Khatoon, Azhar Mahmood","doi":"10.3390/jimaging12020068","DOIUrl":"10.3390/jimaging12020068","url":null,"abstract":"<p><p>Breast ultrasound imaging is widely used for the detection and characterization of breast abnormalities; however, generating detailed and consistent radiological reports remains a labor-intensive and subjective process. Recent advances in deep learning have demonstrated the potential of automated report generation systems to support clinical workflows, yet most existing approaches focus on chest X-ray imaging and rely on convolutional-recurrent architectures with limited capacity to model long-range dependencies and complex clinical semantics. In this work, we propose a multimodal Transformer-based framework for automatic breast ultrasound report generation that integrates visual and textual information through cross-attention mechanisms. The proposed architecture employs a Vision Transformer (ViT) to extract rich spatial and morphological features from ultrasound images. For textual embedding, pretrained language models (BERT, BioBERT, and GPT-2) are implemented in various encoder-decoder configurations to leverage both general linguistic knowledge and domain-specific biomedical semantics. A multimodal Transformer decoder is implemented to autoregressively generate diagnostic reports by jointly attending to visual features and contextualized textual embeddings. We conducted an extensive quantitative evaluation using standard report generation metrics, including BLEU, ROUGE-L, METEOR, and CIDEr, to assess lexical accuracy, semantic alignment, and clinical relevance. Experimental results demonstrate that BioBERT-based models consistently outperform general domain counterparts in clinical specificity, while GPT-2-based decoders improve linguistic fluency.</p>","PeriodicalId":37035,"journal":{"name":"Journal of Imaging","volume":"12 2","pages":""},"PeriodicalIF":2.7,"publicationDate":"2026-02-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12941839/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147291320","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Predicting Nutritional and Morphological Attributes of Fresh Commercial Opuntia Cladodes Using Machine Learning and Imaging
Pub Date: 2026-02-05 | DOI: 10.3390/jimaging12020067
Juan Arredondo Valdez, Josué Israel García López, Héctor Flores Breceda, Ajay Kumar, Ricardo David Valdez Cepeda, Alejandro Isabel Luna Maldonado
Opuntia ficus-indica L. is a prominent crop in Mexico, requiring advanced non-destructive technologies for the real-time monitoring and quality control of fresh commercial cladodes. The primary research objective of this study was to develop and validate high-precision mathematical models that correlate hyperspectral signatures (400-1000 nm) with the specific nutritional, morphological, and antioxidant attributes of fresh cladodes (cultivar Villanueva) at their peak commercial maturity. By combining hyperspectral imaging (HSI) with machine learning algorithms, including K-Means clustering for image preprocessing and Partial Least Squares Regression (PLSR) for predictive modeling, this study successfully predicted the concentrations of 10 minerals (N, P, K, Ca, Mg, Fe, B, Mn, Zn, and Cu), chlorophylls (a, b, and Total), and antioxidant capacities (ABTS, FRAP, and DPPH). The innovative nature of this work lies in the simultaneous non-destructive quantification of 17 distinct variables from a single scan, achieving coefficients of determination (R²) as high as 0.988 for Phosphorus and Chlorophyll b. The practical applicability of this research provides a viable replacement for time-consuming and destructive laboratory acid digestion, enabling producers to implement automated, high-throughput sorting lines for quality assurance. Furthermore, this study establishes a framework for interdisciplinary collaborations between agricultural engineers, data scientists for algorithm optimization, and food scientists to enhance the functional value chain of Opuntia products.
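The PLSR modeling step can be sketched in a few lines of scikit-learn; the file names, target variable, and number of latent components below are hypothetical placeholders.

```python
# Sketch of the PLSR calibration step: predict a lab-measured attribute
# (e.g., phosphorus) from per-sample cladode reflectance spectra. File names
# and the number of latent components are hypothetical.
import numpy as np
from sklearn.cross_decomposition import PLSRegression
from sklearn.model_selection import cross_val_score

X = np.load("spectra.npy")      # (n_samples, n_bands), 400-1000 nm reflectance
y = np.load("phosphorus.npy")   # (n_samples,) reference values from lab assay

pls = PLSRegression(n_components=12)
r2 = cross_val_score(pls, X, y, cv=5, scoring="r2")
print(f"cross-validated R^2: {r2.mean():.3f}")

pls.fit(X, y)
y_new = pls.predict(X[:1])      # non-destructive prediction for a new scan
```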
{"title":"Predicting Nutritional and Morphological Attributes of Fresh Commercial <i>Opuntia</i> Cladodes Using Machine Learning and Imaging.","authors":"Juan Arredondo Valdez, Josué Israel García López, Héctor Flores Breceda, Ajay Kumar, Ricardo David Valdez Cepeda, Alejandro Isabel Luna Maldonado","doi":"10.3390/jimaging12020067","DOIUrl":"10.3390/jimaging12020067","url":null,"abstract":"<p><p><i>Opuntia ficus-indica</i> L. is a prominent crop in Mexico, requiring advanced non-destructive technologies for the real-time monitoring and quality control of fresh commercial cladodes. The primary research objective of this study was to develop and validate high-precision mathematical models that correlate hyperspectral signatures (400-1000 nm) with the specific nutritional, morphological, and antioxidant attributes of fresh cladodes (cultivar Villanueva) at their peak commercial maturity. By combining hyperspectral imaging (HSI) with machine learning algorithms, including K-Means clustering for image preprocessing and Partial Least Squares Regression (PLSR) for predictive modeling, this study successfully predicted the concentrations of 10 minerals (N, P, K, Ca, Mg, Fe, B, Mn, Zn, and Cu), chlorophylls (a, b, and Total), and antioxidant capacities (ABTS, FRAP, and DPPH). The innovative nature of this work lies in the simultaneous non-destructive quantification of 17 distinct variables from a single scan, achieving coefficients of determination (R<sup>2</sup>) as high as 0.988 for Phosphorus and Chlorophyll b. The practical applicability of this research provides a viable replacement for time-consuming and destructive laboratory acid digestion, enabling producers to implement automated, high-throughput sorting lines for quality assurance. Furthermore, this study establishes a framework for interdisciplinary collaborations between agricultural engineers, data scientists for algorithm optimization, and food scientists to enhance the functional value chain of <i>Opuntia</i> products.</p>","PeriodicalId":37035,"journal":{"name":"Journal of Imaging","volume":"12 2","pages":""},"PeriodicalIF":2.7,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12941559/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147291048","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ciphertext-Only Attack on Grayscale-Based EtC Image Encryption via Component Separation and Regularized Single-Channel Compatibility
Pub Date: 2026-02-05 | DOI: 10.3390/jimaging12020065
Ruifeng Li, Masaaki Fujiyoshi
Grayscale-based Encryption-then-Compression (EtC) systems transform RGB images into the YCbCr color space, concatenate the components into a single grayscale image, and apply block permutation, block rotation/flipping, and block-wise negative-positive inversion. Because this pipeline separates color components and disrupts inter-channel statistics, existing extended jigsaw puzzle solvers (JPSs) have been regarded as ineffective, and grayscale-based EtC systems have been considered resistant to ciphertext-only visual reconstruction. In this paper, we present a practical ciphertext-only attack against grayscale-based EtC. The proposed attack introduces three key components: (i) Texture-Based Component Classification (TBCC) to distinguish luminance (Y) and chrominance (Cb/Cr) blocks and focus reconstruction on structure-rich regions; (ii) Regularized Single-Channel Edge Compatibility (R-SCEC), which applies Tikhonov regularization to a single-channel variant of the Mahalanobis Gradient Compatibility (MGC) measure to alleviate covariance rank-deficiency while maintaining robustness under inversion and geometric transforms; and (iii) Adaptive Pruning based on the TBCC-reduced search space that skips redundant boundary matching computations to further improve reconstruction efficiency. Experiments show that, in settings where existing extended JPS solvers fail, our method can still recover visually recognizable semantic content, revealing a potential vulnerability in grayscale-based EtC and calling for a re-evaluation of its security.
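The R-SCEC idea can be sketched as follows: score a candidate horizontal pairing of two grayscale blocks by the Mahalanobis distance between the gradients observed across the seam and the gradient statistics inside the left block's edge, with a Tikhonov term stabilizing the covariance. The two-feature gradient model and the regularization weight are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch of the R-SCEC idea: score how well block B fits to the right of
# block A via the Mahalanobis distance between seam gradients and A's
# right-edge gradient statistics, with (S + lam*I) guarding against
# rank-deficient covariances. Feature choice and lam are illustrative.
import numpy as np

def r_scec(A, B, lam=1e-3):
    # Per-row gradient features inside A's right edge: 1st and 2nd differences.
    G = np.stack([A[:, -1] - A[:, -2], A[:, -2] - A[:, -3]], axis=1).astype(float)
    mu = G.mean(axis=0)
    S = np.cov(G, rowvar=False) + lam * np.eye(2)   # Tikhonov regularization

    # Gradients actually observed across the seam if B sits right of A.
    seam = np.stack([B[:, 0] - A[:, -1], A[:, -1] - A[:, -2]], axis=1).astype(float)
    d = seam - mu
    return float(np.einsum("ij,jk,ik->", d, np.linalg.inv(S), d))  # lower = more compatible

A = np.random.randint(0, 256, (32, 32))
B = np.random.randint(0, 256, (32, 32))
print(r_scec(A, B))
```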
{"title":"Ciphertext-Only Attack on Grayscale-Based EtC Image Encryption via Component Separation and Regularized Single-Channel Compatibility.","authors":"Ruifeng Li, Masaaki Fujiyoshi","doi":"10.3390/jimaging12020065","DOIUrl":"10.3390/jimaging12020065","url":null,"abstract":"<p><p>Grayscale-based Encryption-then-Compression (EtC) systems transform RGB images into the YCbCr color space, concatenate the components into a single grayscale image, and apply block permutation, block rotation/flipping, and block-wise negative-positive inversion. Because this pipeline separates color components and disrupts inter-channel statistics, existing extended jigsaw puzzle solvers (JPSs) have been regarded as ineffective, and grayscale-based EtC systems have been considered resistant to ciphertext-only visual reconstruction. In this paper, we present a practical ciphertext-only attack against grayscale-based EtC. The proposed attack introduces three key components: (i) Texture-Based Component Classification (TBCC) to distinguish luminance (Y) and chrominance (Cb/Cr) blocks and focus reconstruction on structure-rich regions; (ii) Regularized Single-Channel Edge Compatibility (R-SCEC), which applies Tikhonov regularization to a single-channel variant of the Mahalanobis Gradient Compatibility (MGC) measure to alleviate covariance rank-deficiency while maintaining robustness under inversion and geometric transforms; and (iii) Adaptive Pruning based on the TBCC-reduced search space that skips redundant boundary matching computations to further improve reconstruction efficiency. Experiments show that, in settings where existing extended JPS solvers fail, our method can still recover visually recognizable semantic content, revealing a potential vulnerability in grayscale-based EtC and calling for a re-evaluation of its security.</p>","PeriodicalId":37035,"journal":{"name":"Journal of Imaging","volume":"12 2","pages":""},"PeriodicalIF":2.7,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12941909/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147291274","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Survey of Crop Disease Recognition Methods Based on Spectral and RGB Images
Pub Date: 2026-02-05 | DOI: 10.3390/jimaging12020066
Haoze Zheng, Heran Wang, Hualong Dong, Yurong Qian
Major crops worldwide are affected by various diseases yearly, leading to crop losses in different regions. The primary methods for addressing crop disease losses include manual inspection and chemical control. However, traditional manual inspection methods are time-consuming, labor-intensive, and require specialized knowledge. The preemptive use of chemicals also poses a risk of soil pollution, which may cause irreversible damage. With the advancement of computer hardware, photographic technology, and artificial intelligence, crop disease recognition methods based on spectral and red-green-blue (RGB) images not only recognize diseases without damaging the crops but also offer high accuracy and speed of recognition, essentially solving the problems associated with manual inspection and chemical control. This paper summarizes the research on disease recognition methods based on spectral and RGB images, with the literature spanning from 2020 through early 2025. Unlike previous surveys, this paper reviews recent advances involving emerging paradigms such as State Space Models (e.g., Mamba) and Generative AI in the context of crop disease recognition. In addition, it introduces public datasets and commonly used evaluation metrics for crop disease identification. Finally, the paper discusses potential issues and solutions encountered during research, including the use of diffusion models for data augmentation. Hopefully, this survey will help readers understand the current methods and effectiveness of crop disease detection, inspiring the development of more effective methods to assist farmers in identifying crop diseases.
{"title":"A Survey of Crop Disease Recognition Methods Based on Spectral and RGB Images.","authors":"Haoze Zheng, Heran Wang, Hualong Dong, Yurong Qian","doi":"10.3390/jimaging12020066","DOIUrl":"10.3390/jimaging12020066","url":null,"abstract":"<p><p>Major crops worldwide are affected by various diseases yearly, leading to crop losses in different regions. The primary methods for addressing crop disease losses include manual inspection and chemical control. However, traditional manual inspection methods are time-consuming, labor-intensive, and require specialized knowledge. The preemptive use of chemicals also poses a risk of soil pollution, which may cause irreversible damage. With the advancement of computer hardware, photographic technology, and artificial intelligence, crop disease recognition methods based on spectral and red-green-blue (RGB) images not only recognize diseases without damaging the crops but also offer high accuracy and speed of recognition, essentially solving the problems associated with manual inspection and chemical control. This paper summarizes the research on disease recognition methods based on spectral and RGB images, with the literature spanning from 2020 through early 2025. Unlike previous surveys, this paper reviews recent advances involving emerging paradigms such as State Space Models (e.g., Mamba) and Generative AI in the context of crop disease recognition. In addition, it introduces public datasets and commonly used evaluation metrics for crop disease identification. Finally, the paper discusses potential issues and solutions encountered during research, including the use of diffusion models for data augmentation. Hopefully, this survey will help readers understand the current methods and effectiveness of crop disease detection, inspiring the development of more effective methods to assist farmers in identifying crop diseases.</p>","PeriodicalId":37035,"journal":{"name":"Journal of Imaging","volume":"12 2","pages":""},"PeriodicalIF":2.7,"publicationDate":"2026-02-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12942047/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147291323","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SIFT-SNN for Traffic-Flow Infrastructure Safety: A Real-Time Context-Aware Anomaly Detection Framework
Pub Date: 2026-01-31 | DOI: 10.3390/jimaging12020064
Munish Rathee, Boris Bačić, Maryam Doborjeh
Automated anomaly detection in transportation infrastructure is essential for enhancing safety and reducing the operational costs associated with manual inspection protocols. This study presents an improved neuromorphic vision system, which extends the prior SIFT-SNN (scale-invariant feature transform-spiking neural network) proof-of-concept by incorporating temporal feature aggregation for context-aware and sequence-stable detection. Analysis of classical stitching-based pipelines exposed sensitivity to motion and lighting variations, motivating the proposed temporally smoothed neuromorphic design. SIFT keypoints are encoded into latency-based spike trains and classified using a leaky integrate-and-fire (LIF) spiking neural network implemented in PyTorch. Evaluated across three hardware configurations (an NVIDIA RTX 4060 GPU, an Intel i7 CPU, and a simulated Jetson Nano), the system achieved 92.3% accuracy and a macro F1 score of 91.0% under five-fold cross-validation. Inference latencies were measured at 9.5 ms, 26.1 ms, and ~48.3 ms per frame, respectively. Memory footprints were under 290 MB, and power consumption was estimated to be between 5 and 65 W. The classifier distinguishes between safe, partially dislodged, and fully dislodged barrier pins, which are critical failure modes for the Auckland Harbour Bridge's Movable Concrete Barrier (MCB) system. Temporal smoothing further improves recall for ambiguous cases. By achieving a compact model size (2.9 MB), low-latency inference, and minimal power demands, the proposed framework offers a deployable, interpretable, and energy-efficient alternative to conventional CNN-based inspection tools. Future work will focus on exploring the generalisability and transferability of the work presented, additional input sources, and human-computer interaction paradigms for various deployment infrastructures and advancements.
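A simplified reading of the encoding and classification stages is sketched below: feature strengths are converted to time-to-first-spike (latency) codes and passed through a hand-rolled leaky integrate-and-fire layer in PyTorch. The time window, decay factor, threshold, and three-class readout are placeholder assumptions.

```python
# Sketch of latency coding plus a leaky integrate-and-fire (LIF) layer in
# PyTorch: stronger feature responses spike earlier. T, beta, and the
# threshold are placeholder hyperparameters, not the paper's values.
import torch

def latency_encode(x, T=20):
    # Normalize features to [0, 1]; strong features fire at early time steps.
    x = (x - x.min()) / (x.max() - x.min() + 1e-8)
    t_fire = ((1.0 - x) * (T - 1)).long()               # per-input spike times
    spikes = torch.zeros(T, x.numel())
    spikes[t_fire, torch.arange(x.numel())] = 1.0       # one spike per input
    return spikes                                       # (T, N) spike trains

def lif_forward(spikes, w, beta=0.9, v_th=1.0):
    v = torch.zeros(w.size(0))
    out = []
    for s_t in spikes:                                  # iterate over time steps
        v = beta * v + w @ s_t                          # leaky integration
        fired = (v >= v_th).float()
        v = v * (1.0 - fired)                           # reset after a spike
        out.append(fired)
    return torch.stack(out)                             # (T, n_neurons) spikes

feats = torch.rand(128)           # e.g., aggregated SIFT descriptor strengths
w = torch.randn(3, 128) * 0.1     # 3 classes: safe / partial / dislodged
print(lif_forward(latency_encode(feats), w).sum(0))     # spike counts per class
```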
{"title":"SIFT-SNN for Traffic-Flow Infrastructure Safety: A Real-Time Context-Aware Anomaly Detection Framework.","authors":"Munish Rathee, Boris Bačić, Maryam Doborjeh","doi":"10.3390/jimaging12020064","DOIUrl":"10.3390/jimaging12020064","url":null,"abstract":"<p><p>Automated anomaly detection in transportation infrastructure is essential for enhancing safety and reducing the operational costs associated with manual inspection protocols. This study presents an improved neuromorphic vision system, which extends the prior SIFT-SNN (scale-invariant feature transform-spiking neural network) proof-of-concept by incorporating temporal feature aggregation for context-aware and sequence-stable detection. Analysis of classical stitching-based pipelines exposed sensitivity to motion and lighting variations, motivating the proposed temporally smoothed neuromorphic design. SIFT keypoints are encoded into latency-based spike trains and classified using a leaky integrate-and-fire (LIF) spiking neural network implemented in PyTorch. Evaluated across three hardware configurations-an NVIDIA RTX 4060 GPU, an Intel i7 CPU, and a simulated Jetson Nano-the system achieved 92.3% accuracy and a macro F1 score of 91.0% under five-fold cross-validation. Inference latencies were measured at 9.5 ms, 26.1 ms, and ~48.3 ms per frame, respectively. Memory footprints were under 290 MB, and power consumption was estimated to be between 5 and 65 W. The classifier distinguishes between safe, partially dislodged, and fully dislodged barrier pins, which are critical failure modes for the Auckland Harbour Bridge's Movable Concrete Barrier (MCB) system. Temporal smoothing further improves recall for ambiguous cases. By achieving a compact model size (2.9 MB), low-latency inference, and minimal power demands, the proposed framework offers a deployable, interpretable, and energy-efficient alternative to conventional CNN-based inspection tools. Future work will focus on exploring the generalisability and transferability of the work presented, additional input sources, and human-computer interaction paradigms for various deployment infrastructures and advancements.</p>","PeriodicalId":37035,"journal":{"name":"Journal of Imaging","volume":"12 2","pages":""},"PeriodicalIF":2.7,"publicationDate":"2026-01-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12942226/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147291261","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Cross-Domain Benchmark of Intrinsic and Post Hoc Explainability for 3D Deep Learning Models
Pub Date: 2026-01-30 | DOI: 10.3390/jimaging12020063
Asmita Chakraborty, Gizem Karagoz, Nirvana Meratnia
Deep learning models for three-dimensional (3D) data are increasingly used in domains such as medical imaging, object recognition, and robotics. As their adoption grows, their black-box nature has made the need for explainability ever more pressing. However, the lack of standardized and quantitative benchmarks for explainable artificial intelligence (XAI) in 3D data limits the reliable comparison of explanation quality. In this paper, we present a unified benchmarking framework to evaluate both intrinsic and post hoc XAI methods across three representative 3D datasets: volumetric CT scans (MosMed), voxelized CAD models (ModelNet40), and real-world point clouds (ScanObjectNN). The evaluated methods include Grad-CAM, Integrated Gradients, Saliency, Occlusion, and the intrinsic ResAttNet-3D model. We quantitatively assess explanations using the Correctness (AOPC), Completeness (AUPC), and Compactness metrics, applied consistently across all datasets. Our results show that explanation quality varies significantly across methods and domains: Grad-CAM and intrinsic attention performed best on medical CT scans, while gradient-based methods excelled on voxelized and point-based data. Statistical tests (Kruskal-Wallis and Mann-Whitney U) confirmed significant performance differences between methods. No single approach achieved superior results across all domains, highlighting the importance of multi-metric evaluation. This work provides a reproducible framework for standardized assessment of 3D explainability and comparative insights to guide future XAI method selection.
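As an illustration of the Correctness metric, the sketch below computes an AOPC-style score: voxels are occluded in order of decreasing attributed relevance and the average drop in the target-class probability is recorded. The per-voxel occlusion, zero baseline, and step sizes are simplifying assumptions.

```python
# Sketch of the Correctness (AOPC) metric: occlude input regions in order of
# decreasing attributed relevance and average the resulting drop in class
# probability. Flattened per-voxel occlusion is a simplification; real use
# would perturb patches of a 3D volume.
import torch

def aopc(model, x, attribution, target, steps=20, frac=0.01):
    model.eval()
    with torch.no_grad():
        p0 = torch.softmax(model(x), dim=1)[0, target]
        order = attribution.flatten().argsort(descending=True)  # most relevant first
        k = max(1, int(frac * order.numel()))                   # voxels removed per step

        x_pert, drops = x.clone(), []
        for i in range(steps):
            idx = order[i * k:(i + 1) * k]
            x_pert.view(-1)[idx] = 0.0                          # occlude with baseline 0
            p = torch.softmax(model(x_pert), dim=1)[0, target]
            drops.append((p0 - p).item())
        return sum(drops) / steps   # higher AOPC = more faithful explanation
```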
{"title":"A Cross-Domain Benchmark of Intrinsic and Post Hoc Explainability for 3D Deep Learning Models.","authors":"Asmita Chakraborty, Gizem Karagoz, Nirvana Meratnia","doi":"10.3390/jimaging12020063","DOIUrl":"10.3390/jimaging12020063","url":null,"abstract":"<p><p>Deep learning models for three-dimensional (3D) data are increasingly used in domains such as medical imaging, object recognition, and robotics. At the same time, the use of AI in these domains is increasing, while, due to their black-box nature, the need for explainability has grown significantly. However, the lack of standardized and quantitative benchmarks for explainable artificial intelligence (XAI) in 3D data limits the reliable comparison of explanation quality. In this paper, we present a unified benchmarking framework to evaluate both intrinsic and post hoc XAI methods across three representative 3D datasets: volumetric CT scans (MosMed), voxelized CAD models (ModelNet40), and real-world point clouds (ScanObjectNN). The evaluated methods include Grad-CAM, Integrated Gradients, Saliency, Occlusion, and the intrinsic ResAttNet-3D model. We quantitatively assess explanations using the Correctness (AOPC), Completeness (AUPC), and Compactness metrics, consistently applied across all datasets. Our results show that explanation quality significantly varies across methods and domains, demonstrating that Grad-CAM and intrinsic attention performed best on medical CT scans, while gradient-based methods excelled on voxelized and point-based data. Statistical tests (Kruskal-Wallis and Mann-Whitney U) confirmed significant performance differences between methods. No single approach achieved superior results across all domains, highlighting the importance of multi-metric evaluation. This work provides a reproducible framework for standardized assessment of 3D explainability and comparative insights to guide future XAI method selection.</p>","PeriodicalId":37035,"journal":{"name":"Journal of Imaging","volume":"12 2","pages":""},"PeriodicalIF":2.7,"publicationDate":"2026-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12941976/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147291336","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AACNN-ViT: Adaptive Attention-Augmented Convolutional and Vision Transformer Fusion for Lung Cancer Detection
Pub Date: 2026-01-30 | DOI: 10.3390/jimaging12020062
Mohammad Ishtiaque Rahman, Amrina Rahman
Lung cancer remains a leading cause of cancer-related mortality. Although reliable multiclass classification of lung lesions from CT imaging is essential for early diagnosis, it remains challenging due to subtle inter-class differences, limited sample sizes, and class imbalance. We propose an Adaptive Attention-Augmented Convolutional Neural Network with Vision Transformer (AACNN-ViT), a hybrid framework that integrates local convolutional representations with global transformer embeddings through an adaptive attention-based fusion module. The CNN branch captures fine-grained spatial patterns, the ViT branch encodes long-range contextual dependencies, and the adaptive fusion mechanism learns to weight cross-representation interactions to improve discriminability. To reduce the impact of imbalance, a hybrid objective that combines focal loss with categorical cross-entropy is incorporated during training. Experiments on the IQ-OTH/NCCD dataset (benign, malignant, and normal) show consistent performance progression in an ablation-style evaluation: CNN-only, ViT-only, CNN-ViT concatenation, and AACNN-ViT. The proposed AACNN-ViT achieved 96.97% accuracy on the validation set with macro-averaged precision/recall/F1 of 0.9588/0.9352/0.9458 and weighted F1 of 0.9693, substantially improving minority-class recognition (Benign recall 0.8333) compared with CNN-ViT (accuracy 89.09%, macro-F1 0.7680). One-vs.-rest ROC analysis further indicates strong separability across all classes (micro-average AUC 0.992). These results suggest that adaptive attention-based fusion offers a robust and clinically relevant approach for computer-aided lung cancer screening and decision support.
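The hybrid objective can be sketched directly in PyTorch: a focal term, which down-weights easy examples, blended with standard categorical cross-entropy. The mixing weight alpha and focusing parameter gamma below are placeholders, not the paper's tuned values.

```python
# Sketch of the hybrid objective: categorical cross-entropy blended with focal
# loss, which down-weights easy examples so minority classes (e.g., Benign)
# contribute more to the gradient. alpha and gamma are placeholder values.
import torch
import torch.nn.functional as F

def hybrid_loss(logits, targets, alpha=0.5, gamma=2.0):
    ce = F.cross_entropy(logits, targets, reduction="none")  # per-sample CE
    pt = torch.exp(-ce)                                      # prob of the true class
    focal = (1.0 - pt) ** gamma * ce                         # focal modulation
    return (alpha * focal + (1.0 - alpha) * ce).mean()

logits = torch.randn(8, 3)                 # benign / malignant / normal
targets = torch.randint(0, 3, (8,))
print(hybrid_loss(logits, targets))
```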
{"title":"AACNN-ViT: Adaptive Attention-Augmented Convolutional and Vision Transformer Fusion for Lung Cancer Detection.","authors":"Mohammad Ishtiaque Rahman, Amrina Rahman","doi":"10.3390/jimaging12020062","DOIUrl":"10.3390/jimaging12020062","url":null,"abstract":"<p><p>Lung cancer remains a leading cause of cancer-related mortality. Although reliable multiclass classification of lung lesions from CT imaging is essential for early diagnosis, it remains challenging due to subtle inter-class differences, limited sample sizes, and class imbalance. We propose an Adaptive Attention-Augmented Convolutional Neural Network with Vision Transformer (AACNN-ViT), a hybrid framework that integrates local convolutional representations with global transformer embeddings through an adaptive attention-based fusion module. The CNN branch captures fine-grained spatial patterns, the ViT branch encodes long-range contextual dependencies, and the adaptive fusion mechanism learns to weight cross-representation interactions to improve discriminability. To reduce the impact of imbalance, a hybrid objective that combines focal loss with categorical cross-entropy is incorporated during training. Experiments on the IQ-OTH/NCCD dataset (benign, malignant, and normal) show consistent performance progression in an ablation-style evaluation: CNN-only, ViT-only, CNN-ViT concatenation, and AACNN-ViT. The proposed AACNN-ViT achieved 96.97% accuracy on the validation set with macro-averaged precision/recall/F1 of 0.9588/0.9352/0.9458 and weighted F1 of 0.9693, substantially improving minority-class recognition (Benign recall 0.8333) compared with CNN-ViT (accuracy 89.09%, macro-F1 0.7680). One-vs.-rest ROC analysis further indicates strong separability across all classes (micro-average AUC 0.992). These results suggest that adaptive attention-based fusion offers a robust and clinically relevant approach for computer-aided lung cancer screening and decision support.</p>","PeriodicalId":37035,"journal":{"name":"Journal of Imaging","volume":"12 2","pages":""},"PeriodicalIF":2.7,"publicationDate":"2026-01-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12941408/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147291269","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}