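Comprehensive Clinical Usability-oriented Contour Quality Evaluation for Deep learning Auto-segmentation: Combining Multiple Quantitative Metrics through Machine Learning.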

IF 3.4 | Region 3, Medicine | Q2 ONCOLOGY | Practical Radiation Oncology | Pub Date: 2024-09-02 | DOI: 10.1016/j.prro.2024.07.007
Ying Zhang, Asma Amjad, Jie Ding, Christina Sarosiek, Mohammad Zarenia, Renae Conlin, William A Hall, Beth Erickson, Eric Paulson
Citations: 0

Comprehensive Clinical Usability-oriented Contour Quality Evaluation for Deep learning Auto-segmentation: Combining Multiple Quantitative Metrics through Machine Learning.

Purpose: Commonly used metrics for evaluating the quality of auto-segmented contours have limitations and do not always reflect the contours' clinical usefulness. This work develops a novel contour quality classification (CQC) method that combines multiple quantitative metrics to evaluate, with clinical usability in mind, the quality of contours produced by deep learning-based auto-segmentation (DLAS).

Methods: The CQC categorizes the contour on each slice as acceptable, minor edit, or major edit according to the expected editing effort/time, using supervised ensemble tree classification models built on seven quantitative metrics. Organ-specific models were trained for five abdominal organs (pancreas, duodenum, stomach, small bowel, and large bowel) using 50 MRI datasets. Twenty additional MRI and nine CT datasets were used for testing. Inter-observer variation (IOV) was assessed among six observers, and consensus labels for evaluation were established by majority vote. The CQC was also compared with a threshold-based baseline approach.
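The consensus labeling and the threshold-based baseline described in the Methods can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not name the seven metrics, the thresholds, or the tie-breaking rule, so the use of a Dice score, the cut-offs 0.9 and 0.7, and the function names `consensus_label` and `threshold_baseline` are all hypothetical.

```python
from collections import Counter


def consensus_label(votes):
    """Return the majority-vote label, or None when the top count is tied.

    How ties were resolved is not stated in the abstract, so a tie is
    reported as None here rather than guessed.
    """
    counts = Counter(votes).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None
    return counts[0][0]


def threshold_baseline(dice, accept_at=0.9, minor_at=0.7):
    """Hypothetical single-metric baseline.

    The metric (Dice) and both cut-offs are illustrative assumptions,
    not values taken from the paper.
    """
    if dice >= accept_at:
        return "acceptable"
    if dice >= minor_at:
        return "minor edit"
    return "major edit"


# Six observers label one contour slice (made-up votes for illustration).
votes = ["acceptable", "minor edit", "acceptable",
         "acceptable", "minor edit", "acceptable"]
print(consensus_label(votes))
print(threshold_baseline(0.82))
```

A single cut-off on one metric is the simplest possible rule; the CQC instead feeds several metrics into a trained ensemble tree classifier, which this sketch does not attempt to reproduce.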

Results: Across the five organs, the average AUC was 0.982±0.01 and 0.979±0.01, the mean accuracy was 95.8±1.7% and 94.3±2.1%, and the mean risk rate was 0.8±0.4% and 0.7±0.5% for the MRI and CT testing datasets, respectively. The CQC results closely matched the IOV results (mean accuracy of 94.2±0.8% and 94.8±1.7%) and were significantly higher than those obtained using the threshold-based method (mean accuracy of 80.0±4.7%, 83.8±5.2%, and 77.3±6.6% using one, two, and three metrics, respectively).
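Figures of the form "mean accuracy ± spread across organs" can be computed as in the following minimal sketch. It assumes that accuracy is the fraction of slices whose predicted class equals the consensus label and that the ± value is a sample standard deviation over per-organ accuracies; the abstract specifies neither, and the helper names `accuracy` and `summarize` are hypothetical.

```python
from statistics import mean, stdev


def accuracy(predicted, reference):
    """Fraction of slices whose predicted class matches the reference label."""
    if len(predicted) != len(reference):
        raise ValueError("label lists must have the same length")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


def summarize(per_organ):
    """Return (mean, sample standard deviation) over per-organ values.

    Using the sample standard deviation is an assumption; the abstract
    does not say how the reported spread was computed.
    """
    values = list(per_organ.values())
    return mean(values), stdev(values)


# Made-up labels for four slices, purely to show the call pattern.
predicted = ["acceptable", "minor edit", "acceptable", "major edit"]
reference = ["acceptable", "minor edit", "major edit", "major edit"]
print(accuracy(predicted, reference))
```

Calling `summarize` on a dictionary that maps each organ name to its accuracy yields the pair that would be reported as "mean ± SD".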

Conclusion: The CQC models demonstrated high performance in classifying the quality of contour slices. This method can address the limitations of existing metrics and offers an intuitive and comprehensive solution for clinically oriented evaluation and comparison of DLAS systems.

Source journal
Practical Radiation Oncology (Medicine: Radiology, Nuclear Medicine and Imaging)
CiteScore: 5.20
Self-citation rate: 6.10%
Articles published: 177
Review time: 34 days
About the journal: The overarching mission of Practical Radiation Oncology (PRO) is to improve the quality of radiation oncology practice. PRO's purpose is to document the state of current practice, providing background for those in training and continuing education for practitioners, through discussion and illustration of new techniques, evaluation of current practices, and publication of case reports. PRO strives to provide its readers content that emphasizes knowledge "with a purpose." The content of PRO includes: original articles focusing on patient safety, quality measurement, or quality improvement initiatives; original articles focusing on imaging, contouring, target delineation, simulation, treatment planning, immobilization, organ motion, and other practical issues; ASTRO guidelines, position papers, and consensus statements; and essays that highlight enriching personal experiences in caring for cancer patients and their families.