Automatic bone age assessment: a Turkish population study
Samet Öztürk, Murat Yüce, Gül Gizem Pamuk, Candan Varlık, Ahmet Tan Cimilli, Musa Atay
Diagnostic and Interventional Radiology. Published 2025-03-17. DOI: 10.4274/dir.2025.242999
Abstract
Purpose: Established methods for bone age assessment (BAA), such as the Greulich and Pyle atlas, suffer from variability due to population differences and observer discrepancies. Although automated BAA offers speed and consistency, limited research exists on its performance across different populations using deep learning. This study examines deep learning algorithms on the Turkish population to enhance bone age models by understanding demographic influences.
Methods: We searched reports in Bağcılar Hospital's Health Information Management System dated between April 2012 and September 2023 using the keyword "bone age". Patient images were re-evaluated by an experienced radiologist and anonymized. A total of 2,730 hand radiographs from Bağcılar Hospital (Turkish population), 12,572 from the Radiological Society of North America (RSNA), and 6,185 from the Radiological Hand Pose Estimation (RHPE) public datasets were collected, along with corresponding bone ages and gender information. A set of 546 radiographs (273 from Bağcılar, 273 from the public datasets) was randomly selected, stratified by bone age, as an internal test set; the remaining data were used for training and validation. Bone age estimates were generated using a modified InceptionV3 model on 500 × 500-pixel images, and the model with the lowest mean absolute error (MAE) on the validation set was selected.
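The stratified hold-out described in the Methods can be illustrated with a minimal sketch. The abstract only states that the test set was drawn at random with bone age stratification; the 12-month bin width, the proportional per-bin allocation, the fixed seed, and the function name `stratified_test_split` are all assumptions made here for illustration, not the authors' procedure.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple


def stratified_test_split(
    bone_ages_months: List[float],
    n_test: int,
    bin_width_months: int = 12,  # assumed bin width; not stated in the abstract
    seed: int = 0,
) -> Tuple[List[int], List[int]]:
    """Return (test_indices, train_indices), drawing test cases
    proportionally from each bone-age bin."""
    total = len(bone_ages_months)
    if not 0 <= n_test <= total:
        raise ValueError("n_test must be between 0 and the number of cases")

    rng = random.Random(seed)

    # Group case indices by bone-age bin.
    bins: Dict[int, List[int]] = defaultdict(list)
    for idx, age in enumerate(bone_ages_months):
        bins[int(age // bin_width_months)].append(idx)

    test: List[int] = []
    leftovers: List[int] = []
    for key in sorted(bins):
        members = bins[key]
        rng.shuffle(members)
        quota = (len(members) * n_test) // total  # floor of the proportional share
        test.extend(members[:quota])
        leftovers.extend(members[quota:])

    # Flooring can leave the test set short; top it up at random from the rest.
    rng.shuffle(leftovers)
    shortfall = n_test - len(test)
    test.extend(leftovers[:shortfall])
    train = leftovers[shortfall:]
    return sorted(test), sorted(train)


# Synthetic bone ages (months) standing in for 2,730 local radiographs.
toy_ages = [float(i % 216) for i in range(2730)]
test_idx, train_idx = stratified_test_split(toy_ages, n_test=273)
```

Applied once to the hospital data and once to the pooled public data with `n_test=273` each, a split of this kind would give a 546-image test set whose bone age distribution roughly mirrors that of each source.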
Results: Three models were trained and tested according to dataset origin: a Bağcılar (Turkish) model, a public (RSNA-RHPE) model, and a Combined model. On the internal test set, the Combined model estimated bone age with an error of less than 6, 12, 18, and 24 months in 44%, 73%, 87%, and 94% of cases, respectively. Its MAE was 9.2 months on the overall internal test set, 7 months on the public test set, and 11.5 months on the Bağcılar internal test data. The Bağcılar-only model had an MAE of 12.7 months on the Bağcılar internal test data. Although the Bağcılar-only model was trained on less data, there was no significant difference between the Combined and Bağcılar models on the Bağcılar dataset (P > 0.05). The public model showed an MAE of 16.5 months on the Bağcılar dataset, significantly worse than the other models (P < 0.05).
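The two summary metrics reported in the Results, the MAE and the share of cases whose error falls under a given number of months, can be computed as in the following sketch. The function names and the toy values are hypothetical; only the definitions of the metrics and the 6, 12, 18, and 24-month thresholds come from the abstract.

```python
from typing import Dict, Sequence


def mean_absolute_error(
    true_months: Sequence[float], pred_months: Sequence[float]
) -> float:
    """Mean absolute difference between reference and predicted bone age."""
    if len(true_months) != len(pred_months) or len(true_months) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(true_months, pred_months)) / len(true_months)


def within_threshold_rates(
    true_months: Sequence[float],
    pred_months: Sequence[float],
    thresholds: Sequence[int] = (6, 12, 18, 24),
) -> Dict[int, float]:
    """Fraction of cases whose absolute error is strictly below each threshold."""
    errors = [abs(t - p) for t, p in zip(true_months, pred_months)]
    return {th: sum(e < th for e in errors) / len(errors) for th in thresholds}


# Toy values in months, invented for illustration (not study data).
true_ages = [24.0, 60.0, 96.0, 132.0, 168.0]
pred_ages = [27.0, 52.0, 110.0, 131.0, 190.0]

mae = mean_absolute_error(true_ages, pred_ages)        # errors are 3, 8, 14, 1, 22
rates = within_threshold_rates(true_ages, pred_ages)
```

Reporting threshold rates alongside the MAE shows how the errors are distributed: a model can have a moderate MAE while still placing a small share of cases more than 24 months off.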
Conclusion: We developed an automatic BAA model that includes the Turkish population; this is one of the few such studies using deep learning. Despite challenges from population differences and data heterogeneity, these models can be effectively used in various clinical settings. Model accuracy can improve over time with cumulative data, and publicly available datasets may further refine the models. Our approach enables more accurate and efficient BAAs, supporting healthcare professionals where traditional methods are time-consuming and variable.
Clinical significance: The developed automated BAA model for the Turkish population offers a reliable and efficient alternative to traditional methods. By utilizing deep learning with diverse datasets from Bağcılar Hospital and publicly available sources, the model minimizes assessment time and reduces variability. This advancement enhances clinical decision-making, supports standardized BAA practices, and improves patient care in various healthcare settings.
About the journal:
Diagnostic and Interventional Radiology (Diagn Interv Radiol) is the open access, online-only official publication of the Turkish Society of Radiology. It is published bimonthly, and the journal’s publication language is English.
The journal is a medium for original articles, reviews, pictorial essays, and technical notes related to all fields of diagnostic and interventional radiology.