Prediction of midpalatal suture maturation stage based on transfer learning and enhanced vision transformer.

BMC Medical Informatics and Decision Making (IF 3.3; Zone 3, Medicine; JCR Q2, Medical Informatics). Pub Date: 2024-08-22. DOI: 10.1186/s12911-024-02598-w

Haomin Tang, Shu Liu, Weijie Tan, Lingling Fu, Ming Yan, Hongchao Feng

Abstract

Background: Maxillary expansion is an important treatment for maxillary transverse hypoplasia. The choice of expansion method depends on the maturation stage of the midpalatal suture, which orthodontists currently assess from palatal-plane cone beam computed tomography (CBCT) images. This manual assessment is inefficient and highly subjective. This study develops and evaluates an enhanced vision transformer (ViT) that automatically classifies CBCT images of midpalatal sutures at different maturation stages.

Methods: In recent years, convolutional neural networks (CNNs) have been used to classify images of the midpalatal suture at different maturation stages, which has helped clinicians choose a maxillary expansion method. However, CNNs cannot adequately learn the long-distance dependencies between image features that are needed for global recognition of midpalatal suture CBCT images. The self-attention mechanism of ViT captures relationships between distant pixels in an image, but ViT lacks the inductive bias of CNNs and requires more training data. To address this, a CNN-enhanced ViT model based on transfer learning is proposed to classify midpalatal suture CBCT images. A total of 2518 palatal-plane CBCT images were collected and divided into a training set of 1259 images, a validation set of 506 images, and a test set of 753 images. After the training images were preprocessed, the CNN-enhanced ViT model was trained and tuned, and its generalization ability was evaluated on the test set.
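The three subset sizes reported in the Methods add up to the full collection and correspond to roughly a 50/20/30 split. The following is a minimal sketch of such a partition, not the authors' procedure: the abstract does not say how images were assigned to subsets (for example randomly, per patient, or stratified by stage), so the seeded shuffle and the helper name `split_indices` are assumptions made purely for illustration.

```python
import random


def split_indices(n_total, n_train, n_val, n_test, seed=0):
    """Partition image indices 0..n_total-1 into three disjoint subsets.

    Hypothetical helper: a plain seeded shuffle is assumed here because the
    abstract does not describe the actual assignment strategy.
    """
    if n_train + n_val + n_test != n_total:
        raise ValueError("subset sizes must sum to the total number of images")
    indices = list(range(n_total))
    random.Random(seed).shuffle(indices)  # reproducible shuffle
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


# Counts taken from the abstract: 2518 images in total.
train, val, test = split_indices(2518, 1259, 506, 753)

# Share of the full data set held by each subset, rounded to three decimals.
fractions = [round(len(part) / 2518, 3) for part in (train, val, test)]
print(fractions)  # [0.5, 0.201, 0.299]
```

In a clinical imaging setting, a split grouped by patient is often preferable so that images from one patient do not appear in both training and test sets; whether that applies here cannot be determined from the abstract.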

Results: On our test set, the proposed ViT model achieved a classification accuracy of 95.75%, with a macro-averaged area under the receiver operating characteristic curve (AUC) of 97.89% and a micro-averaged AUC of 98.36%. On the same test set, the best-performing CNN model, EfficientnetV2_S, achieved an accuracy of 93.76%, and the clinician achieved an accuracy of 89.10%.
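The Results report both a macro-averaged and a micro-averaged AUC. As a hedged illustration of how the two differ, the sketch below computes one-vs-rest AUCs per class and averages them (macro), then pools every (image, class) pair into a single binary problem (micro). The function names, the three-class setup, and the toy scores are all hypothetical; the abstract does not state the number of stages or the evaluation code used.

```python
def binary_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_micro_auc(y_true, y_score, n_classes):
    """Return (macro AUC, micro AUC) for multi-class scores."""
    # One-vs-rest indicator matrix: one row per sample, one column per class.
    onehot = [[1 if y == k else 0 for k in range(n_classes)] for y in y_true]
    # Macro: unweighted mean of the per-class one-vs-rest AUCs.
    per_class = [
        binary_auc([row[k] for row in onehot], [row[k] for row in y_score])
        for k in range(n_classes)
    ]
    macro = sum(per_class) / n_classes
    # Micro: flatten all (sample, class) pairs into one binary problem.
    flat_labels = [v for row in onehot for v in row]
    flat_scores = [v for row in y_score for v in row]
    micro = binary_auc(flat_labels, flat_scores)
    return macro, micro


# Toy, made-up predictions for three hypothetical classes.
toy_true = [0, 1, 2, 1, 0, 2]
toy_score = [
    [0.7, 0.2, 0.1],
    [0.3, 0.5, 0.2],
    [0.2, 0.2, 0.6],
    [0.5, 0.4, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.5, 0.4],
]
macro, micro = macro_micro_auc(toy_true, toy_score, n_classes=3)
print(f"macro AUC = {macro:.4f}, micro AUC = {micro:.4f}")
```

Macro averaging weights every class equally, while micro averaging weights every prediction equally, so the two values can diverge when class sizes are unbalanced.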

Conclusions: The experimental results show that the proposed method can effectively classify CBCT images by midpalatal suture maturation stage and that it outperforms a clinician. The model can therefore provide a valuable reference for orthodontists and assist them in making a correct diagnosis.

Source journal

BMC Medical Informatics and Decision Making (Medicine: Medical Informatics)

CiteScore: 7.20
Self-citation rate: 5.70%
Articles published: 297
Review time: 1 month

About the journal: BMC Medical Informatics and Decision Making is an open access journal publishing original peer-reviewed research articles in relation to the design, development, implementation, use, and evaluation of health information technologies and decision-making for human health.