Segmentation-based quantitative measurements in renal CT imaging using deep learning.
Konstantinos Koukoutegos, Richard 's Heeren, Liesbeth De Wever, Frederik De Keyzer, Frederik Maes, Hilde Bosmans
European Radiology Experimental (JCR Q1, Radiology, Nuclear Medicine & Medical Imaging; impact factor 3.7), published 2024-10-09
DOI: https://doi.org/10.1186/s41747-024-00507-4
Full text (PDF): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11465135/pdf/
Abstract
Background: Renal quantitative measurements are important descriptors for assessing kidney function. We developed a deep learning-based method for automated kidney measurements from computed tomography (CT) images.
Methods: CT scans were retrospectively analyzed to train, validate, and test two networks for kidney segmentation and subsequent measurements. Dataset 1 comprised potential kidney donors (n = 88) with both contrast-enhanced (Dataset 1 CE) and noncontrast (Dataset 1 NC) CT scans. The additional test sets comprised contrast-enhanced cases (Test set 2, n = 18), cases from a photon-counting CT (PCCT) scanner reconstructed at 60 and 190 keV (Test set 3 PCCT, n = 15), and low-dose cases (Test set 4, n = 8). Segmentation performance was evaluated using the Dice similarity coefficient (DSC). Agreement between the automated quantitative measurements and manual annotations was assessed using the intraclass correlation coefficient (ICC).
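As an illustration of the segmentation metric named above, the DSC between a predicted and a reference binary mask is twice the number of overlapping voxels divided by the total number of foreground voxels in both masks. The following is a minimal sketch, not the authors' evaluation code; the function name, the flattened-list mask representation, and the convention for two empty masks are assumptions made for this example.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice similarity coefficient between two flattened binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labeled as kidney in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    # Foreground voxel counts of each mask, added together.
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


# Toy flattened masks for illustration only, not data from the study.
predicted = [0, 1, 1, 1, 0, 0]
reference = [0, 1, 1, 0, 1, 0]
score = dice_coefficient(predicted, reference)
print(f"DSC = {score:.3f}")
```

A 3D mask would be flattened to one sequence of 0/1 labels before being passed in; the score ranges from 0 (no overlap) to 1 (identical masks).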
Results: Both models demonstrated excellent reliability in renal segmentation. The contrast-enhanced model achieved DSC values of 0.95 (Test set 1 CE), 0.94 (Test set 2), and 0.92 (Test set 3 PCCT); the noncontrast model achieved 0.94 (Test set 1 NC), 0.92 (Test set 3 PCCT), and 0.93 (Test set 4). Volume estimation was accurate, with mean volume errors of 4%, 3%, and 6% on the contrast test sets and 4%, 5%, and 7% on the noncontrast test sets. Renal axis measurements (length, width, and thickness) had ICC values greater than 0.90 (p < 0.001) for all test sets, supported by narrow 95% confidence intervals.
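One plausible way to obtain a kidney volume and a percentage volume error from a segmentation mask is sketched below. This is a hedged example rather than the study's method: the abstract does not state how volume error was defined, so the absolute relative error used here, the function names, and the voxel spacing are all assumptions.

```python
from typing import Sequence, Tuple


def mask_volume_ml(mask: Sequence[int],
                   spacing_mm: Tuple[float, float, float]) -> float:
    """Volume in mL of a flattened binary mask, given voxel spacing in mm."""
    voxel_mm3 = spacing_mm[0] * spacing_mm[1] * spacing_mm[2]
    n_voxels = sum(1 for v in mask if v)
    return n_voxels * voxel_mm3 / 1000.0  # 1 mL = 1000 mm^3


def percent_volume_error(predicted_ml: float, reference_ml: float) -> float:
    """Absolute volume error as a percentage of the reference volume.

    Assumed definition; a signed error would be an equally valid choice.
    """
    if reference_ml <= 0:
        raise ValueError("reference volume must be positive")
    return abs(predicted_ml - reference_ml) / reference_ml * 100.0


# Hypothetical masks and spacing for illustration only.
spacing = (1.0, 1.0, 2.0)            # voxel size in mm (assumed)
auto_mask = [1] * 960 + [0] * 40     # automated segmentation
manual_mask = [1] * 1000             # manual annotation
auto_ml = mask_volume_ml(auto_mask, spacing)
manual_ml = mask_volume_ml(manual_mask, spacing)
error_pct = percent_volume_error(auto_ml, manual_ml)
print(f"automated {auto_ml:.2f} mL, manual {manual_ml:.2f} mL, "
      f"error {error_pct:.1f}%")
```

Averaging this per-case percentage over a test set would give a mean volume error comparable in form to the figures reported above.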
Conclusion: Two deep learning networks were shown to derive quantitative measurements from contrast-enhanced and noncontrast renal CT imaging at the human performance level.
Relevance statement: Deep learning-based networks can automatically obtain renal clinical descriptors from both noncontrast and contrast-enhanced CT images. When healthy subjects comprise the training cohort, careful consideration is required during model adaptation, especially in scenarios involving unhealthy kidneys. This creates an opportunity for improved clinical decision-making without labor-intensive manual effort.
Key points: Trained 3D UNet models quantify renal measurements from contrast and noncontrast CT. The models performed interchangeably with the manual annotator and with each other. The models can provide expert-level, quantitative, accurate, and rapid renal measurements.