A Systematic Review on Data Scarcity Problem in Deep Learning: Solution and Applications

ACM Computing Surveys (CSUR) Pub Date : 2022-01-06 DOI:10.1145/3502287

Ms. Aayushi Bansal, Dr. Rewa Sharma, Dr. Mamta Kathuria


Citations: 33


A Systematic Review on Data Scarcity Problem in Deep Learning: Solution and Applications

Recent advances in deep learning architectures have increased their utility in real-life applications. Deep learning models require large amounts of training data, in part to avoid overfitting. In many application domains, such as marketing, computer vision, and medical science, only a limited set of data is available for training neural networks, because collecting new data is either infeasible or resource-intensive. One data-space solution to the problem of limited data is data augmentation. This study focuses on data augmentation techniques that can be used to further improve the accuracy of a neural network. Augmenting the available data saves the cost and time required to collect new data for training deep neural networks; it also regularizes the model and improves its ability to generalize. The survey also covers the need for large datasets in fields such as computer vision, natural language processing, security, and healthcare. The goal of this paper is to provide a comprehensive survey of recent advances in data augmentation techniques and their applications in various domains.
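As a rough illustration of what data augmentation means in practice, the sketch below enlarges a tiny labeled image set by mirroring images and adding small random noise. It is not taken from the survey: the function names (`horizontal_flip`, `add_noise`, `augment`), the noise scale of 0.05, and the toy dataset are all assumptions made for this example, and it presumes that both transforms leave the label unchanged.

```python
import random


def horizontal_flip(image):
    """Mirror a 2-D image (a list of pixel rows) left to right."""
    return [row[::-1] for row in image]


def add_noise(image, scale, rng):
    """Add uniform noise in [-scale, scale] to every pixel."""
    return [[pixel + rng.uniform(-scale, scale) for pixel in row] for row in image]


def augment(dataset, copies, seed=0):
    """Return the original (image, label) pairs plus `copies` variants of each.

    Labels are carried over unchanged, which assumes that the
    transforms used here are label-preserving (an assumption).
    """
    rng = random.Random(seed)
    augmented = list(dataset)
    for image, label in dataset:
        for _ in range(copies):
            # Flip roughly half of the variants, then perturb the pixels.
            variant = horizontal_flip(image) if rng.random() < 0.5 else image
            variant = add_noise(variant, scale=0.05, rng=rng)
            augmented.append((variant, label))
    return augmented


# Hypothetical toy data: two 2x2 grayscale images with labels "a" and "b".
tiny_dataset = [([[0.0, 1.0], [0.5, 0.2]], "a"), ([[1.0, 1.0], [0.0, 0.0]], "b")]
bigger = augment(tiny_dataset, copies=3)
print(len(tiny_dataset), "->", len(bigger))  # prints: 2 -> 8
```

Each original example yields three perturbed variants here, so a model trained on `bigger` sees four times as many samples without any new data being collected. Whether a given transform is safe depends on the domain; a horizontal flip, for instance, would change the meaning of an image of text.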


Source journal

ACM Computing Surveys (CSUR)

Self-citation rate

0.00%

Articles published