Christopher J Carnabatu, David T Fetzer, Alexander Tessnow, Shelby Holt, Vivek R Sant
Avoidable biopsies? Validating artificial intelligence-based decision support software in indeterminate thyroid nodules.
Surgery (IF 3.2, JCR Q1), published 2024-10-12. DOI: 10.1016/j.surg.2024.07.074 (https://doi.org/10.1016/j.surg.2024.07.074)
Citations: 0
Abstract
Background: Multiple artificial intelligence (AI) systems have been approved to risk-stratify thyroid nodules through sonographic characterization. We sought to validate the ability of one such AI system, Koios DS (Koios Medical, Chicago, IL), to aid in improving risk stratification of indeterminate thyroid nodules.
Methods: A retrospective single-institution dataset was compiled of 28 cytologically indeterminate thyroid nodules having undergone molecular testing and surgical resection, with surgical pathology categorized as malignant or benign. Nodules were retrospectively evaluated with Koios DS. After nodule selection, automated and AI-adapter-derived Thyroid Imaging Reporting and Data System (TI-RADS) levels were recorded, and agreement with radiologist-derived levels was assessed using Cohen's κ statistic. The performance of malignancy classification was compared between the radiologist and AI-adapter. Biopsy thresholds were re-evaluated using the AI-adapter.
Results: In this cohort, 7 (25%) nodules were malignant on surgical pathology. The median nodule size was 2.4 cm (interquartile range: 1.8-2.9 cm). Median radiologist and automated TI-RADS levels were both 4, with κ 0.25 ("fair agreement"). Malignancy classification by the radiologist provided sensitivity 100%, specificity 33.3%, positive predictive value (PPV) 33.3%, and negative predictive value (NPV) 100%, compared with the AI-adapter's performance with sensitivity 85.7%, specificity 76.2%, PPV 54.5%, and NPV 94.1%. Using the AI-adapter, 14 of 28 biopsies would have been deferred, 13 of which were surgically benign.
Conclusion: Koios automated and radiologist-derived TI-RADS levels were in only fair agreement (κ 0.25) for indeterminate thyroid nodules. Malignancy reclassification with the AI-adapter improved PPV at minimal cost to NPV. Risk stratification with the addition of the AI-adapter may allow for more accurate patient counseling and the avoidance of biopsies in select cases that would otherwise be cytologically indeterminate.
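The performance figures in the Results can be checked with a short calculation. The sketch below is illustrative only: the abstract does not report the 2×2 tables, so the counts are back-calculated assumptions from the reported percentages and the stated 7 malignant and 21 benign nodules. The `Confusion` class and `pct` helper are hypothetical names introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """2x2 table for a binary malignancy call against surgical pathology."""
    tp: int  # malignant nodules called malignant
    fp: int  # benign nodules called malignant
    fn: int  # malignant nodules called benign
    tn: int  # benign nodules called benign

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


def pct(x: float) -> float:
    """Express a proportion as a percentage rounded to one decimal."""
    return round(100 * x, 1)


# ASSUMPTION: counts inferred from the reported percentages (n = 28,
# 7 malignant, 21 benign); they are not stated in the abstract.
radiologist = Confusion(tp=7, fp=14, fn=0, tn=7)
ai_adapter = Confusion(tp=6, fp=5, fn=1, tn=16)

for name, c in [("radiologist", radiologist), ("AI-adapter", ai_adapter)]:
    print(name, pct(c.sensitivity()), pct(c.specificity()),
          pct(c.ppv()), pct(c.npv()))
```

Under these assumed counts the four metrics match the reported values for both readers. Note that the 14 deferred biopsies described in the Results come from re-evaluated biopsy thresholds, which is a separate quantity from the 17 AI-adapter-negative calls implied by this table.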
About the journal:
For 66 years, Surgery has published practical, authoritative information about procedures, clinical advances, and major trends shaping general surgery. Each issue features original scientific contributions and clinical reports. Peer-reviewed articles cover topics in oncology, trauma, gastrointestinal, vascular, and transplantation surgery. The journal also publishes papers from the meetings of its sponsoring societies, the Society of University Surgeons, the Central Surgical Association, and the American Association of Endocrine Surgeons.