The effect of data diversity on the performance of deep learning models for predicting early gastric cancer under endoscopy

Conghui Shi, Jia Li, Lianlian Wu
Journal: Journal of Digital Health
Volume: 14 1
Publication date: 2022-02-21
Publication type: Journal Article
DOI: 10.55976/jdh.1202214319-24
URL: https://doi.org/10.55976/jdh.1202214319-24
Citation count: 0

Abstract

Aims: This study explored the effect of training set diversity on the performance of deep learning models for predicting early gastric cancer (EGC) under endoscopy.
Methods: Images of EGC and non-cancerous lesions under magnifying narrow-band imaging (ME-NBI) and magnifying blue laser imaging (ME-BLI) were retrospectively collected. Training set 1 comprised 150 non-cancerous and 309 EGC ME-NBI images, training set 2 comprised 1505 non-cancerous and 309 EGC ME-BLI images, and training set 3 combined training sets 1 and 2. Test set 1 comprised 376 non-cancerous and 1052 EGC ME-NBI images, test set 2 comprised 529 non-cancerous and 71 EGC ME-BLI images, and test set 3 combined test sets 1 and 2. Three convolutional neural network (CNN) models, CNN 1, CNN 2 and CNN 3, were trained independently on training sets 1, 2 and 3, respectively, and each model was evaluated on every test set. A further 138 ME-NBI videos and 17 ME-BLI videos were collected to evaluate and compare the models in real time.
Results: Overall, CNN 3 performed best. On test set 1, the accuracy (Acc), sensitivity (Sn), specificity (Sp) and area under the curve (AUC) of CNN 3 were 87.89% (1255/1428), 90.96% (342/376), 86.79% (913/1052) and 94.60%, respectively. On test set 2, the Acc, Sn, Sp and AUC of CNN 3 were 95% (570/600), 97.92% (518/529), 73.24% (52/71) and 90.93%, respectively. On test set 3, the Acc, Sn, Sp and AUC of CNN 3 were 89.99% (1825/2028), 95.03% (860/905), 85.93% (965/1123) and 94.89%, respectively. CNN 3 also performed best on the video test set, with Acc, Sn and Sp of 91.03% (142/156), 90.58% (125/138) and 94.44% (17/18), respectively.
Conclusions: The deep learning model trained on the most diverse data achieved the best diagnostic performance.
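
The percentages in the Results can be recomputed from the fractions printed next to them. The following is a minimal Python sketch, not the authors' evaluation code. It assumes the standard definitions (Sn = correctly classified positives / positives, Sp = correctly classified negatives / negatives, Acc = all correct / all cases) and takes the numerators and denominators exactly as reported for CNN 3 rather than from the class counts given in Methods. The names `REPORTED` and `metrics` are hypothetical.

```python
# Counts copied from the reported fractions for CNN 3:
# (Sn numerator, Sn denominator, Sp numerator, Sp denominator)
REPORTED = {
    "test set 1": (342, 376, 913, 1052),
    "test set 2": (518, 529, 52, 71),
    "test set 3": (860, 905, 965, 1123),
    "video test set": (125, 138, 17, 18),
}


def metrics(sn_num, sn_den, sp_num, sp_den):
    """Return (Acc, Sn, Sp) as percentages rounded to two decimals."""
    acc = 100 * (sn_num + sp_num) / (sn_den + sp_den)
    sn = 100 * sn_num / sn_den
    sp = 100 * sp_num / sp_den
    return tuple(round(value, 2) for value in (acc, sn, sp))


if __name__ == "__main__":
    for name, counts in REPORTED.items():
        acc, sn, sp = metrics(*counts)
        print(f"{name}: Acc={acc}%  Sn={sn}%  Sp={sp}%")

    # Test set 3 is described as the combination of test sets 1 and 2,
    # so its counts should equal the element-wise sum of the other two.
    pooled = tuple(
        a + b for a, b in zip(REPORTED["test set 1"], REPORTED["test set 2"])
    )
    print("pooled counts match test set 3:", pooled == REPORTED["test set 3"])
```

Under these assumptions, the recomputed values agree with the reported Acc, Sn and Sp for every test set, and the test set 3 counts equal the sums of the test set 1 and test set 2 counts. AUC is not recomputed because it depends on per-image scores, which the abstract does not report.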