CephTransXnet: An attention enhanced feature fusion network leveraging neighborhood rough set approach for cephalometric landmark prediction
Authors: R. Neeraja, L. Jani Anbarasi
Journal: Computers in Biology and Medicine, volume 188, Article 109891
Publication date: 2025-02-25
Publication type: Journal Article
DOI: 10.1016/j.compbiomed.2025.109891
URL: https://www.sciencedirect.com/science/article/pii/S0010482525002422
Impact factor: 7.0; JCR: Q1 (Biology); region category: Medicine (region 2); open access: no
Citations: 0
Abstract
The convergence of medical imaging, computer vision, and orthodontics has made automatic cephalometric landmark detection a pivotal area of research. Accurate cephalometric analysis is crucial in orthodontics and in orthognathic and maxillofacial surgery for diagnosis, treatment planning, and monitoring craniofacial growth. In this study, a multi-branch fused feature extraction network titled CephTransXnet is proposed to automatically predict landmark coordinates from cephalometric radiographs. The initial sequential branch enhances discriminative local feature learning and feature extraction through parallel feature fusion by integrating Convolved Pooled Normalized (CPN_B) and Gradient Optimized Multi-Path Bottleneck (GMB_B) blocks with a Channel and Spatial Attention (CSAT_M) module. The Swin Transformer (ST_B) branch efficiently handles long-range dependencies and extracts global features in cephalometric radiographs. The multi-branch fused features, along with features from skip connections of the CPN_B and GMB_B blocks, are concatenated using a Coordinate Attention module (CoAT_M) to capture the positional relationships between various landmark features. A Landmark Discriminative Deviation Factor (LDDF) is determined by applying the Neighborhood Rough Set (NRS) approach to analyse the surrounding features of each landmark by considering spatial relationships or similarity measures between the landmarks and neighboring regions. The Spatial Pyramid Pooling (SPP_L) layer incorporated in the final phase of the CephTransXnet model extracts multi-scale features by pooling over sub-regions of varying sizes, enabling the network to capture both local and global context for precise cephalometric landmark identification. The CephTransXnet framework achieved average Successful Detection Rates (SDRs) of 88.71% and 79.05% within the 2 mm range on the 2015 International Symposium on Biomedical Imaging (ISBI) grand challenge dental X-ray analysis dataset. The CephTransXnet model was also evaluated on a private clinical dataset obtained from Solanki Dental Care Clinic in Sharjah, UAE, where it attained an average SDR of 74.38% within the 2 mm precision range.
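For readers unfamiliar with the Successful Detection Rate reported in the abstract, the sketch below shows one common way the metric is computed: the percentage of landmarks whose predicted position lies within a given radius (here 2 mm) of the ground-truth position. This is an illustration only, not the authors' evaluation code. The function name, the `mm_per_pixel` spacing of 0.1, and the coordinates are assumptions made for the example.

```python
import math


def successful_detection_rate(pred, truth, radius_mm=2.0, mm_per_pixel=0.1):
    """Return the percentage of landmarks with radial error <= radius_mm.

    pred, truth: equal-length lists of (x, y) pixel coordinates.
    mm_per_pixel: assumed pixel spacing used to convert pixels to millimetres.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and the same length")
    hits = 0
    for (px, py), (tx, ty) in zip(pred, truth):
        # Euclidean (radial) error between prediction and ground truth, in mm.
        error_mm = math.hypot(px - tx, py - ty) * mm_per_pixel
        if error_mm <= radius_mm:
            hits += 1
    return 100.0 * hits / len(pred)


# Hypothetical coordinates, invented only to exercise the function.
truth = [(100, 100), (200, 150), (300, 300), (400, 250)]
pred = [(103, 104), (200, 165), (330, 300), (401, 251)]

sdr_2mm = successful_detection_rate(pred, truth)
print(sdr_2mm)  # 75.0: three of the four landmarks fall within 2 mm
```

Averaging this value over all landmarks and test images gives a single figure comparable in form to the percentages quoted above. Other radii (for example 2.5, 3, or 4 mm) can be evaluated by changing `radius_mm`.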
About the journal:
Computers in Biology and Medicine is an international forum for sharing groundbreaking advancements in the use of computers in bioscience and medicine. This journal serves as a medium for communicating essential research, instruction, ideas, and information regarding the rapidly evolving field of computer applications in these domains. By encouraging the exchange of knowledge, we aim to facilitate progress and innovation in the utilization of computers in biology and medicine.