J. Flores, J. Garcia-Nava, Monserrat A. Castro-Coria, Victor M. Tellez, B. E. Huerta, Josue Espinosa-Romero, F. Calderón
Parallel mining of frequent patterns for school records analytics at the Universidad Michoacana
2017 IEEE International Autumn Meeting on Power, Electronics and Computing (ROPEC), published 2017-11-01
DOI: 10.1109/ROPEC.2017.8261636 (https://doi.org/10.1109/ROPEC.2017.8261636)
Abstract
This paper presents research results on school record analytics, developed for the Universidad Michoacana (UMSNH) and based on a parallel implementation of data mining techniques. The core elements of this work were finding frequent patterns in the academic records of all UMSNH students from 2005 to 2016, and searching for relevant subsets of those frequent patterns using the distributed computing platform Spark. The FP-Growth algorithm used to find frequent patterns is presented, together with serial, concurrent, and parallel implementations of the mining process based on it. Experimental results are discussed along two directions: (a) the superior performance achieved by the parallel implementation compared to the serial and concurrent versions of the application, and (b) the advantages that mining at the frequent-pattern level provides for information retrieval on this specific problem, compared to mining at the level of association rules or correlation statistics.
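To make the FP-Growth step concrete, the following is a minimal single-machine sketch in Python. It is not the authors' Spark implementation: the class and function names (`Node`, `build_tree`, `fp_growth`) and the record attributes (`failed_calculus`, `first_year`, and so on) are hypothetical and chosen only for illustration. The sketch builds an FP-tree from the transactions and then recursively mines the conditional pattern base of each frequent item.

```python
from collections import defaultdict


class Node:
    """One FP-tree node: an item, a count, and links to parent and children."""

    def __init__(self, item, parent):
        self.item = item
        self.parent = parent
        self.count = 0
        self.children = {}


def build_tree(transactions, min_support):
    """Build an FP-tree from (items, count) pairs.

    Returns the header table (item -> nodes holding it) and item supports.
    """
    freq = defaultdict(int)
    for items, count in transactions:
        for item in set(items):
            freq[item] += count
    freq = {i: c for i, c in freq.items() if c >= min_support}

    root = Node(None, None)
    header = defaultdict(list)
    for items, count in transactions:
        # Keep frequent items only, ordered by descending support
        # (ties broken by item name so the order is deterministic).
        ordered = sorted((i for i in set(items) if i in freq),
                         key=lambda i: (-freq[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, freq


def fp_growth(transactions, min_support, suffix=frozenset()):
    """Yield (itemset, support) for every frequent itemset."""
    header, freq = build_tree(transactions, min_support)
    for item, nodes in header.items():
        pattern = suffix | {item}
        yield pattern, freq[item]
        # Conditional pattern base: the prefix path above each node of `item`.
        base = []
        for node in nodes:
            path = []
            parent = node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                base.append((path, node.count))
        yield from fp_growth(base, min_support, pattern)


# Hypothetical student records; each record is a set of attributes.
records = [
    ["failed_calculus", "first_year", "dropped_out"],
    ["failed_calculus", "first_year"],
    ["first_year", "passed_calculus"],
    ["failed_calculus", "dropped_out"],
]
patterns = dict(fp_growth([(r, 1) for r in records], min_support=2))
for itemset, support in sorted(patterns.items(), key=lambda p: -p[1]):
    print(sorted(itemset), support)
```

With a minimum support of 2, the sketch reports three frequent single items and two frequent pairs. A distributed version would partition the conditional pattern bases across workers, which is the kind of parallelism a platform such as Spark is suited to; the sketch above does not attempt that.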