Nouf M. Alzahrani, Ann M. Henry, Bashar M. Al‐Qaisieh, Louise J. Murray, Michael G. Nix
{"title":"深度学习自动分割中的置信度估计,用于放射治疗的磁共振成像中的脑部风险器官","authors":"Nouf M. Alzahrani, Ann M. Henry, Bashar M. Al‐Qaisieh, Louise J. Murray, Michael G. Nix","doi":"10.1002/acm2.14513","DOIUrl":null,"url":null,"abstract":"PurposeWe have built a novel AI‐driven QA method called AutoConfidence (ACo), to estimate segmentation confidence on a per‐voxel basis without gold standard segmentations, enabling robust, efficient review of automated segmentation (AS). We have demonstrated this method in brain OAR AS on MRI, using internal and external (third‐party) AS models.MethodsThirty‐two retrospectives, MRI planned, glioma cases were randomly selected from a local clinical cohort for ACo training. A generator was trained adversarialy to produce internal autosegmentations (IAS) with a discriminator to estimate voxel‐wise IAS uncertainty, given the input MRI. Confidence maps for each proposed segmentation were produced for operator use in AS editing and were compared with “difference to gold‐standard” error maps. Nine cases were used for testing ACo performance on IAS and validation with two external deep learning segmentation model predictions [external model with low‐quality AS (EM‐LQ) and external model with high‐quality AS (EM‐HQ)]. Matthew's correlation coefficient (MCC), false‐positive rate (FPR), false‐negative rate (FNR), and visual assessment were used for evaluation. Edge removal and geometric distance corrections were applied to achieve more useful and clinically relevant confidence maps and performance metrics.ResultsACo showed generally excellent performance on both internal and external segmentations, across all OARs (except lenses). MCC was higher on IAS and low‐quality external segmentations (EM‐LQ) than high‐quality ones (EM‐HQ). On IAS and EM‐LQ, average MCC (excluding lenses) varied from 0.6 to 0.9, while average FPR and FNR were ≤0.13 and ≤0.21, respectively. For EM‐HQ, average MCC varied from 0.4 to 0.8, while average FPR and FNR were ≤0.37 and ≤0.22, respectively.ConclusionACo was a reliable predictor of uncertainty and errors on AS generated both internally and externally, demonstrating its potential as an independent, reference‐free QA tool, which could help operators deliver robust, efficient autosegmentation in the radiotherapy clinic.","PeriodicalId":2,"journal":{"name":"ACS Applied Bio Materials","volume":null,"pages":null},"PeriodicalIF":4.6000,"publicationDate":"2024-09-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"Automated confidence estimation in deep learning auto‐segmentation for brain organs at risk on MRI for radiotherapy\",\"authors\":\"Nouf M. Alzahrani, Ann M. Henry, Bashar M. Al‐Qaisieh, Louise J. Murray, Michael G. Nix\",\"doi\":\"10.1002/acm2.14513\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"PurposeWe have built a novel AI‐driven QA method called AutoConfidence (ACo), to estimate segmentation confidence on a per‐voxel basis without gold standard segmentations, enabling robust, efficient review of automated segmentation (AS). We have demonstrated this method in brain OAR AS on MRI, using internal and external (third‐party) AS models.MethodsThirty‐two retrospectives, MRI planned, glioma cases were randomly selected from a local clinical cohort for ACo training. A generator was trained adversarialy to produce internal autosegmentations (IAS) with a discriminator to estimate voxel‐wise IAS uncertainty, given the input MRI. 
Confidence maps for each proposed segmentation were produced for operator use in AS editing and were compared with “difference to gold‐standard” error maps. Nine cases were used for testing ACo performance on IAS and validation with two external deep learning segmentation model predictions [external model with low‐quality AS (EM‐LQ) and external model with high‐quality AS (EM‐HQ)]. Matthew's correlation coefficient (MCC), false‐positive rate (FPR), false‐negative rate (FNR), and visual assessment were used for evaluation. Edge removal and geometric distance corrections were applied to achieve more useful and clinically relevant confidence maps and performance metrics.ResultsACo showed generally excellent performance on both internal and external segmentations, across all OARs (except lenses). MCC was higher on IAS and low‐quality external segmentations (EM‐LQ) than high‐quality ones (EM‐HQ). On IAS and EM‐LQ, average MCC (excluding lenses) varied from 0.6 to 0.9, while average FPR and FNR were ≤0.13 and ≤0.21, respectively. For EM‐HQ, average MCC varied from 0.4 to 0.8, while average FPR and FNR were ≤0.37 and ≤0.22, respectively.ConclusionACo was a reliable predictor of uncertainty and errors on AS generated both internally and externally, demonstrating its potential as an independent, reference‐free QA tool, which could help operators deliver robust, efficient autosegmentation in the radiotherapy clinic.\",\"PeriodicalId\":2,\"journal\":{\"name\":\"ACS Applied Bio Materials\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":4.6000,\"publicationDate\":\"2024-09-17\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"ACS Applied Bio Materials\",\"FirstCategoryId\":\"3\",\"ListUrlMain\":\"https://doi.org/10.1002/acm2.14513\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q2\",\"JCRName\":\"MATERIALS SCIENCE, BIOMATERIALS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"ACS Applied Bio Materials","FirstCategoryId":"3","ListUrlMain":"https://doi.org/10.1002/acm2.14513","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"MATERIALS SCIENCE, BIOMATERIALS","Score":null,"Total":0}
Automated confidence estimation in deep learning auto‐segmentation for brain organs at risk on MRI for radiotherapy
Purpose: We have built a novel AI-driven quality assurance (QA) method called AutoConfidence (ACo) to estimate segmentation confidence on a per-voxel basis without gold-standard segmentations, enabling robust, efficient review of automated segmentation (AS). We have demonstrated this method on brain organ-at-risk (OAR) AS on MRI, using internal and external (third-party) AS models.

Methods: Thirty-two retrospective, MRI-planned glioma cases were randomly selected from a local clinical cohort for ACo training. A generator was trained adversarially to produce internal autosegmentations (IAS), with a discriminator estimating voxel-wise IAS uncertainty given the input MRI. Confidence maps for each proposed segmentation were produced for operator use in AS editing and were compared with "difference to gold-standard" error maps. Nine cases were used for testing ACo performance on IAS and for validation against predictions from two external deep learning segmentation models [an external model with low-quality AS (EM-LQ) and an external model with high-quality AS (EM-HQ)]. The Matthews correlation coefficient (MCC), false-positive rate (FPR), false-negative rate (FNR), and visual assessment were used for evaluation. Edge removal and geometric distance corrections were applied to achieve more useful and clinically relevant confidence maps and performance metrics.

Results: ACo showed generally excellent performance on both internal and external segmentations across all OARs (except the lenses). MCC was higher on IAS and low-quality external segmentations (EM-LQ) than on high-quality ones (EM-HQ). On IAS and EM-LQ, average MCC (excluding lenses) varied from 0.6 to 0.9, while average FPR and FNR were ≤0.13 and ≤0.21, respectively. For EM-HQ, average MCC varied from 0.4 to 0.8, while average FPR and FNR were ≤0.37 and ≤0.22, respectively.

Conclusion: ACo was a reliable predictor of uncertainty and errors on AS generated both internally and externally, demonstrating its potential as an independent, reference-free QA tool that could help operators deliver robust, efficient autosegmentation in the radiotherapy clinic.
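The Methods describe a generator trained adversarially, with a discriminator that estimates voxel-wise uncertainty for a proposed segmentation given the input MRI. As a rough illustration of that idea only (not the authors' architecture), the following minimal PyTorch sketch shows a discriminator-style network mapping an (MRI, segmentation) pair to a per-voxel confidence map; the channel counts, layer depths, and class name are assumptions made for this example.

```python
# Hypothetical sketch: a discriminator-style network producing per-voxel
# confidence in [0, 1] from an MRI volume and a proposed binary segmentation.
# Architecture details are assumptions, not the published ACo model.
import torch
import torch.nn as nn


class VoxelConfidenceDiscriminator(nn.Module):
    """Maps (MRI, proposed segmentation) to a per-voxel confidence map."""

    def __init__(self, mri_channels: int = 1, seg_channels: int = 1, width: int = 16):
        super().__init__()
        in_ch = mri_channels + seg_channels
        self.net = nn.Sequential(
            nn.Conv3d(in_ch, width, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(width, width, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(width, 1, kernel_size=1),
            nn.Sigmoid(),  # per-voxel probability that the proposed label is correct
        )

    def forward(self, mri: torch.Tensor, seg: torch.Tensor) -> torch.Tensor:
        # mri, seg: (batch, channels, depth, height, width)
        return self.net(torch.cat([mri, seg], dim=1))


if __name__ == "__main__":
    disc = VoxelConfidenceDiscriminator()
    mri = torch.randn(1, 1, 32, 64, 64)                     # toy MRI volume
    seg = torch.randint(0, 2, (1, 1, 32, 64, 64)).float()   # toy binary OAR mask
    confidence_map = disc(mri, seg)                          # same spatial shape as input
    print(confidence_map.shape)
```

In an adversarial setup such as the one described, this kind of discriminator would be trained against the segmentation generator; at inference time its output could be presented to the operator as the confidence map used for AS editing.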
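Evaluation uses the Matthews correlation coefficient (MCC), false-positive rate (FPR), and false-negative rate (FNR) between binary voxel maps, for example flagged low-confidence voxels versus "difference to gold-standard" error voxels. A minimal sketch of these standard definitions is shown below, assuming boolean voxel masks as inputs; how the confidence map is thresholded into a binary map is an assumption for illustration, not taken from the paper.

```python
# Standard definitions of MCC, FPR, and FNR for two binary voxel maps.
# `predicted` could be a thresholded low-confidence mask (illustrative assumption),
# `reference` the voxels that actually differ from the gold standard.
import numpy as np


def voxelwise_mcc_fpr_fnr(predicted: np.ndarray, reference: np.ndarray):
    """Return (MCC, FPR, FNR) for two boolean arrays of identical shape."""
    p = predicted.astype(bool)
    r = reference.astype(bool)

    tp = int(np.sum(p & r))    # flagged and truly erroneous
    tn = int(np.sum(~p & ~r))  # not flagged and correct
    fp = int(np.sum(p & ~r))   # flagged but actually correct
    fn = int(np.sum(~p & r))   # missed errors

    denom = float((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn)) ** 0.5
    mcc = (tp * tn - fp * fn) / denom if denom > 0 else 0.0
    fpr = fp / (fp + tn) if (fp + tn) > 0 else 0.0
    fnr = fn / (fn + tp) if (fn + tp) > 0 else 0.0
    return mcc, fpr, fnr
```

MCC ranges from -1 to +1, with +1 indicating perfect agreement between the flagged voxels and the true error voxels, which is why it is a stricter summary than FPR or FNR alone when the error class is small relative to the volume.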