From scratch or pretrained? An in-depth analysis of deep learning approaches with limited data

International Journal of System Assurance Engineering and Management (IF 1.6; JCR Q2, Engineering, Multidisciplinary) | Pub Date: 2024-04-29 | DOI: 10.1007/s13198-024-02345-4
Saqib Ul Sabha, Assif Assad, Nusrat Mohi Ud Din, Muzafar Rasool Bhat
Citations: 0

Abstract


The widespread adoption of Convolutional Neural Networks (CNNs) in image recognition has marked a significant breakthrough. However, these networks need large amounts of data to learn well, and when data is scarce they become prone to overfitting: they perform well on training data but poorly on new data. Various strategies have emerged to address this issue, including selecting an appropriate network architecture. This study examines how to mitigate data scarcity through a comparative analysis of two methods: using compact CNN architectures and applying transfer learning with pre-trained models. Our investigation covers three datasets, each from a different domain. Our findings show that the better method depends on the dataset. A complex pre-trained model such as ResNet50 yields better results on the flower and maize disease identification datasets, which points to the advantage of leveraging prior knowledge for specific data types. Conversely, a simpler CNN architecture trained from scratch is the superior strategy on the pneumonia dataset, highlighting the need to adapt the approach to the specific dataset and domain.
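
One way to make the trade-off between the two strategies concrete is to count how many parameters each one has to fit from the limited data. The sketch below is illustrative only: the compact architecture (three 3x3 convolutions with 16, 32 and 64 filters followed by global average pooling), the frozen-backbone setup and the class counts are assumptions made for this example, not the configurations reported in the study. The only external fact used is that ResNet50's pooled feature vector has 2048 dimensions.

```python
def conv2d_params(c_in, c_out, kernel=3):
    """Weights plus biases of one 2D convolution layer."""
    return (kernel * kernel * c_in + 1) * c_out


def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return (n_in + 1) * n_out


def compact_cnn_trainable(num_classes, channels=(3, 16, 32, 64)):
    """Trainable parameters of a hypothetical compact CNN trained from scratch.

    Assumes 3x3 convolutions, global average pooling, and one linear classifier.
    """
    convs = sum(conv2d_params(a, b) for a, b in zip(channels, channels[1:]))
    return convs + dense_params(channels[-1], num_classes)


RESNET50_FEATURES = 2048  # width of ResNet50's pooled feature vector


def frozen_resnet50_trainable(num_classes):
    """Trainable parameters when the pre-trained backbone is frozen
    and only a new linear classification head is trained (an assumed setup)."""
    return dense_params(RESNET50_FEATURES, num_classes)


if __name__ == "__main__":
    # Hypothetical class counts, chosen only for illustration.
    for num_classes in (2, 5):
        print(
            num_classes,
            compact_cnn_trainable(num_classes),
            frozen_resnet50_trainable(num_classes),
        )
```

With a frozen backbone, the pre-trained model fits only the new head, while the compact network must learn every filter from the small dataset. Which option generalizes better then depends on how well the pre-trained features match the target domain, which is consistent with the dataset-dependent results the abstract reports.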

Source journal
CiteScore: 4.30
Self-citation rate: 10.00%
Articles published: 252
Journal description: This Journal is established with a view to cater to increased awareness for high quality research in the seamless integration of heterogeneous technologies to formulate bankable solutions to the emergent complex engineering problems. Assurance engineering could be thought of as relating to the provision of higher confidence in the reliable and secure implementation of a system’s critical characteristic features through the espousal of a holistic approach by using a wide variety of cross disciplinary tools and techniques. Successful realization of sustainable and dependable products, systems and services involves an extensive adoption of Reliability, Quality, Safety and Risk related procedures for achieving high assurance levels of performance; also pivotal are the management issues related to risk and uncertainty that govern the practical constraints encountered in their deployment. It is our intention to provide a platform for the modeling and analysis of large engineering systems, among the other aforementioned allied goals of systems assurance engineering, leading to the enforcement of performance enhancement measures. Achieving a fine balance between theory and practice is the primary focus. The Journal only publishes high quality papers that have passed the rigorous peer review procedure of an archival scientific Journal. The aim is an increasing number of submissions, wide circulation and a high impact factor.