Authors: Imad Sassi, Sara Ouaftouh, S. Anter
Published in: 2019 1st International Conference on Smart Systems and Data Science (ICSSD), 2019-10-01
DOI: 10.1109/ICSSD47982.2019.9002857 (https://doi.org/10.1109/ICSSD47982.2019.9002857)
Adaptation of Classical Machine Learning Algorithms to Big Data Context: Problems and Challenges: Case Study: Hidden Markov Models Under Spark
Big Data Analytics presents a major opportunity for scientists and businesses, and it has changed how very large volumes of data are managed and analyzed. Machine Learning algorithms are often used to extract value from big data. In the past, these algorithms have demonstrated good processing speed, efficiency and accuracy. Today, however, the complex characteristics of big data raise new problems and challenges in designing and developing Machine Learning algorithms for Big Data Analytics. It is therefore essential to revisit the classical algorithms and adapt them to this new context. One method of adaptation is to couple Machine Learning algorithms with newer technologies (e.g., distributed computing with GPUs, Hadoop, or Spark) to reduce the computational cost of data analysis. This paper highlights the main challenges of adapting Machine Learning algorithms to the Big Data context and describes a novel method for making these algorithms efficient and fast in Big Data processing, taking Hidden Markov Models under the Spark framework as a case study. A comparison of the complexity of the classical algorithms with that of their Spark-based adaptations shows a substantial improvement.
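The abstract does not spell out the adapted algorithm, so the following is only an illustrative sketch of the general idea and not the authors' method. It assumes a corpus of independent observation sequences, scores each with the standard scaled forward algorithm, and combines the per-sequence results with a map step and a reduce step. Plain Python `map` and `functools.reduce` stand in for the corresponding distributed operations; on Spark, the same pattern would typically be expressed with an RDD's `map` and `reduce`. The toy parameters `START`, `TRANS`, `EMIT` and the function names are hypothetical.

```python
import math
from functools import reduce

# Hypothetical two-state, two-symbol HMM used only for illustration.
START = [0.6, 0.4]                # initial state probabilities
TRANS = [[0.7, 0.3], [0.4, 0.6]]  # TRANS[i][j] = P(state j | state i)
EMIT = [[0.5, 0.5], [0.1, 0.9]]   # EMIT[i][k] = P(symbol k | state i)


def forward_log_likelihood(obs, start, trans, emit):
    """Scaled forward algorithm: log P(obs | model) for one sequence."""
    n_states = len(start)
    # Initialisation: alpha_1(i) = start_i * emit_i(o_1)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n_states)]
    log_lik = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Induction: alpha_t(j) = (sum_i alpha_{t-1}(i) * a_ij) * emit_j(o_t)
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n_states))
                * emit[j][obs[t]]
                for j in range(n_states)
            ]
        # Rescale to avoid underflow; the log of the scale factors
        # accumulates into the sequence log-likelihood.
        scale = sum(alpha)
        log_lik += math.log(scale)
        alpha = [a / scale for a in alpha]
    return log_lik


def corpus_log_likelihood(sequences, start, trans, emit):
    """Total log-likelihood of independent sequences, map/reduce style."""
    # Map step: each sequence is scored independently of the others,
    # so this work could be spread across workers.
    per_sequence = map(
        lambda seq: forward_log_likelihood(seq, start, trans, emit),
        sequences,
    )
    # Reduce step: log-likelihoods of independent sequences add up.
    return reduce(lambda a, b: a + b, per_sequence, 0.0)


if __name__ == "__main__":
    corpus = [[0, 1, 1], [1, 1, 0, 0], [0]]
    print(corpus_log_likelihood(corpus, START, TRANS, EMIT))
```

Because the sequences are scored independently, the cost of the map step divides across workers, while the forward recursion inside each sequence stays sequential in time. Whether the paper parallelizes across sequences, across states, or in some other way is not stated in the abstract.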