A study of variance and its utility in machine learning
Authors: K. G. Sharma, Yashpal Singh
Journal: International Journal of Sensors, Wireless Communications and Control
Publication date: 2022-06-17
DOI: https://doi.org/10.2174/2210327912666220617153359
Citations: 1
Abstract
Inexpensive storage devices and data sensors have made collecting and storing data simpler than ever. Sources of this data include biotechnology, pharmacy, business, online marketing websites, Twitter, Facebook, and blogs. Understanding the data is crucial today because every business activity, private or public, from hospitals to mega marts, benefits from it. However, the explosive volume of data makes it almost impossible to decipher manually. In 2022, we are creating 2.5 quintillion bytes of data per day; one quintillion bytes is one billion gigabytes. Approximately 90% of all data was created in the last two years. An automatic technique for analyzing the data is therefore a necessity, and data mining is performed with the help of machine learning tools to analyze and understand it. Data mining and machine learning depend heavily on statistical tools and techniques, which is why machine learning is sometimes called "statistical learning". Many machine learning techniques exist in the literature, and improvement is a continuous process because no model is perfect. This paper examines the influence of variance, a statistical concept, on various machine learning approaches and tries to understand how this concept can be used to improve performance.
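The abstract names variance without defining it. As a minimal sketch, assuming the standard textbook definition (mean squared deviation from the mean) and not the paper's own formulation, it can be computed as follows. The function names and the `readings` values are hypothetical, chosen only for illustration.

```python
from statistics import pvariance, variance


def population_variance(values):
    """Mean squared deviation from the mean (divides by n)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n


def sample_variance(values):
    """Unbiased estimate from a sample (divides by n - 1)."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


# Hypothetical sensor readings, not data from the paper.
readings = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print(population_variance(readings))  # mean is 5.0, squared deviations sum to 32
print(sample_variance(readings))

# Cross-check against the standard library implementations.
print(pvariance(readings), variance(readings))
```

Dividing by n - 1 rather than n matters mostly for small samples; for large datasets the two estimates are nearly identical.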
About the journal:
International Journal of Sensors, Wireless Communications and Control publishes timely research articles, full-length and mini reviews, and communications on these three strongly related areas, with emphasis on networked control systems whose sensors are interconnected via wireless communication networks. The emergence of high-speed wireless network technologies allows a cluster of devices to be linked together economically to form a distributed system. Wireless communication is playing an increasingly important role in such distributed systems. Transmitting sensor measurements and control commands over wireless links allows rapid deployment, flexible installation, and fully mobile operation, and avoids cable wear and tear in industrial automation, healthcare, and environmental assessment. Wireless networked systems have raised, and continue to raise, fundamental challenges in science, engineering, and industrial applications; new modelling techniques, problem formulations, and solutions are therefore required.