From scratch or pretrained? An in-depth analysis of deep learning approaches with limited data

International Journal of System Assurance Engineering and Management (IF 1.6; JCR Q2, Engineering, Multidisciplinary) · Pub Date: 2024-04-29 · DOI: 10.1007/s13198-024-02345-4

Saqib Ul Sabha, Assif Assad, Nusrat Mohi Ud Din, Muzafar Rasool Bhat


Citations: 0

Abstract




The widespread adoption of Convolutional Neural Networks (CNNs) has marked a significant breakthrough in image recognition. However, these networks need large amounts of training data to learn well, and such data are often hard to obtain. With too little data, models become prone to overfitting: they perform well on the training data but poorly on new data. Various strategies have emerged to address this issue, including the careful selection of an appropriate network architecture. This study examines how to mitigate data scarcity through a comparative analysis of two methods: using compact CNN architectures trained from scratch, and applying transfer learning with pre-trained models. The investigation covers three datasets, each from a different domain, and the results differ by dataset. A complex pre-trained model such as ResNet50 yields better results on the flower and maize disease identification datasets, which points to the advantage of leveraging prior knowledge for specific data types. Conversely, a simpler CNN architecture trained from scratch is the superior strategy on the pneumonia dataset, which highlights the need to adapt the approach to the specific dataset and domain.
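To make the contrast between a compact CNN and a ResNet50-based transfer-learning setup more concrete, the following sketch counts trainable parameters under stated assumptions. The compact network's layer sizes (three 3x3 convolutions with 16, 32 and 64 channels, followed by global average pooling and one linear layer) are hypothetical and are not taken from the paper. The transfer-learning count assumes the ResNet50 backbone is frozen and only a new linear classifier on its 2048-dimensional pooled features is trained; the abstract does not say whether the authors froze or fine-tuned the backbone.

```python
# Illustrative parameter counting; all layer sizes below are assumptions,
# not the architectures evaluated in the paper.

def conv2d_params(in_channels, out_channels, kernel_size):
    """Weights plus biases of a square-kernel 2D convolution."""
    return (kernel_size * kernel_size * in_channels + 1) * out_channels


def dense_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return (in_features + 1) * out_features


def compact_cnn_params(num_classes, in_channels=3, channels=(16, 32, 64)):
    """Hypothetical compact CNN: 3x3 conv blocks, global average pooling,
    then a single linear classifier. Pooling layers add no parameters."""
    total = 0
    prev = in_channels
    for width in channels:
        total += conv2d_params(prev, width, 3)
        prev = width
    total += dense_params(prev, num_classes)
    return total


# Width of the pooled feature vector that feeds ResNet50's final layer.
RESNET50_FEATURES = 2048


def frozen_resnet50_head_params(num_classes):
    """Trainable parameters when the ResNet50 backbone is frozen and only
    a new linear head is trained (an assumed transfer-learning setup)."""
    return dense_params(RESNET50_FEATURES, num_classes)


if __name__ == "__main__":
    for classes in (2, 5):  # hypothetical class counts
        print(classes, compact_cnn_params(classes),
              frozen_resnet50_head_params(classes))
```

For comparison, a standard ResNet50 has roughly 25.6 million parameters in total, so fine-tuning the whole network would expose far more trainable weights than either count above. Parameter count is only one factor; the abstract's conclusion that the better strategy depends on the dataset and domain cannot be derived from these numbers alone.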


Source journal

International Journal of System Assurance Engineering and Management (Engineering, Multidisciplinary)

CiteScore: 4.30

Self-citation rate: 10.00%

Articles published: 252

About the journal: This Journal is established with a view to catering to increased awareness of high-quality research in the seamless integration of heterogeneous technologies to formulate bankable solutions to emergent complex engineering problems. Assurance engineering could be thought of as relating to the provision of higher confidence in the reliable and secure implementation of a system's critical characteristic features through the espousal of a holistic approach, using a wide variety of cross-disciplinary tools and techniques. Successful realization of sustainable and dependable products, systems and services involves an extensive adoption of Reliability, Quality, Safety and Risk related procedures for achieving high assurance levels of performance; also pivotal are the management issues related to risk and uncertainty that govern the practical constraints encountered in their deployment. It is our intention to provide a platform for the modeling and analysis of large engineering systems, among the other aforementioned allied goals of systems assurance engineering, leading to the enforcement of performance enhancement measures. Achieving a fine balance between theory and practice is the primary focus. The Journal only publishes high-quality papers that have passed the rigorous peer review procedure of an archival scientific Journal. The aim is an increasing number of submissions, wide circulation and a high impact factor.