Saqib Ul Sabha, Assif Assad, Nusrat Mohi Ud Din, Muzafar Rasool Bhat
International Journal of System Assurance Engineering and Management. Journal Article, published 2024-04-29. DOI: https://doi.org/10.1007/s13198-024-02345-4. JCR: Q2 (Engineering, Multidisciplinary). Impact factor: 1.6.
From scratch or pretrained? An in-depth analysis of deep learning approaches with limited data
The widespread adoption of Convolutional Neural Networks (CNNs) has marked a significant breakthrough in image recognition. However, these networks need large amounts of data to learn well, and such data is often hard to obtain. With limited data, models are prone to overfitting: they perform well on training data but poorly on new data. Various strategies have emerged to address this issue, including careful selection of an appropriate network architecture. This study examines how to mitigate data scarcity through a comparative analysis of two methods: using compact CNN architectures and applying transfer learning with pre-trained models. The investigation covers three datasets, each from a different domain. The findings show that the better method depends on the dataset. A complex pre-trained model such as ResNet50 yields better results on the flower and maize disease identification datasets, which underscores the advantage of leveraging prior knowledge for specific data types. Conversely, a simpler CNN architecture trained from scratch is the superior strategy on the Pneumonia dataset, which highlights the need to adapt the approach to the specific dataset and domain.
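One way to make the contrast between the two strategies concrete is to count how many parameters each has to fit from a small dataset. The Python sketch below is illustrative only: the compact CNN's layer widths, the choice to freeze the pre-trained backbone and train only a new classifier layer, the two-class setting, and all function names are assumptions, not the architectures or settings used in the study. The one external fact it relies on is that ResNet50 produces a 2048-dimensional pooled feature vector ahead of its classifier.

```python
def conv2d_params(in_ch, out_ch, kernel):
    """Weights plus biases of a 2D convolution with a square kernel."""
    return (kernel * kernel * in_ch + 1) * out_ch


def dense_params(in_features, out_features):
    """Weights plus biases of a fully connected layer."""
    return (in_features + 1) * out_features


def compact_cnn_params(num_classes, in_ch=3, widths=(16, 32, 64), kernel=3):
    """Trainable parameters of a hypothetical compact CNN trained from scratch.

    Assumed layout: one convolution per entry in `widths`, then global
    average pooling, then a single dense classifier.
    """
    total = 0
    prev = in_ch
    for width in widths:
        total += conv2d_params(prev, width, kernel)
        prev = width
    return total + dense_params(prev, num_classes)


# Width of ResNet50's pooled feature vector (input to its classifier layer).
RESNET50_FEATURES = 2048


def frozen_resnet50_head_params(num_classes):
    """Trainable parameters when the backbone is frozen (an assumption here)
    and only a new dense classifier is trained on top of it."""
    return dense_params(RESNET50_FEATURES, num_classes)


if __name__ == "__main__":
    classes = 2  # illustrative class count, not taken from the study
    print("compact CNN, from scratch:", compact_cnn_params(classes))
    print("ResNet50, frozen backbone:", frozen_resnet50_head_params(classes))
```

Under these assumptions the transfer-learning setup trains fewer parameters, yet the abstract reports that the from-scratch network still wins on the Pneumonia dataset. Parameter count alone therefore does not settle the choice; how well the pre-trained features match the target domain also matters.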
About the journal:
This Journal was established to serve the growing interest in high-quality research on the seamless integration of heterogeneous technologies to formulate bankable solutions to emerging complex engineering problems.
Assurance engineering could be thought of as relating to the provision of higher confidence in the reliable and secure implementation of a system’s critical characteristic features through the espousal of a holistic approach by using a wide variety of cross-disciplinary tools and techniques. Successful realization of sustainable and dependable products, systems and services involves an extensive adoption of Reliability, Quality, Safety and Risk related procedures for achieving high assurance levels of performance; also pivotal are the management issues related to risk and uncertainty that govern the practical constraints encountered in their deployment. It is our intention to provide a platform for the modeling and analysis of large engineering systems, among the other aforementioned allied goals of systems assurance engineering, leading to the enforcement of performance enhancement measures. Achieving a fine balance between theory and practice is the primary focus. The Journal only publishes high-quality papers that have passed the rigorous peer review procedure of an archival scientific Journal. The aim is an increasing number of submissions, wide circulation and a high impact factor.