MINING TAXATION DATA WITH PARALLEL BMARS
S. Bakin, M. Hegland, Graham J. Williams
Parallel Algorithms and Applications, published 2000-06-01 (journal article)
DOI: https://doi.org/10.1080/01495730008947349
Citations: 6
Abstract
A new parallel version of Friedman's Multivariate Adaptive Regression Splines (MARS) algorithm is discussed. Partitioning the data over the processors of a parallel computational system yields good parallel efficiency. Instead of the truncated power basis functions of the original MARS, the new method (BMARS) utilises B-splines, which improves numerical stability and reduces the computational cost of the procedure. In addition, the coefficients of the basis functions of a BMARS model provide quickly accessible information about the local behaviour of the function. The algorithm has a time complexity proportional to the number of data records. The method provides a new means of detecting areas in the feature space that are characterised by "interesting" patterns of response values. It is applied to the search for classes of incorrect tax returns using multiple predictor variables or features. The parallel algorithm makes it feasible to investigate very large databases, such as the taxation database.
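
The two ideas in the abstract, a locally supported B-spline basis and data partitioned over processors, can be illustrated with a small sketch. It is not the authors' implementation. It assumes linear B-splines (hat functions), a hypothetical three-function basis on made-up knots, made-up data, and a least-squares fit whose normal equations are accumulated per partition and then summed. The sequential `map` stands in for the parallel processors.

```python
from functools import reduce


def hat(x, left, centre, right):
    """Linear B-spline (hat function) on the knots left < centre < right."""
    if left <= x <= centre:
        return (x - left) / (centre - left)
    if centre < x <= right:
        return (right - x) / (right - centre)
    return 0.0  # zero outside [left, right]: compact support


def truncated_power(x, knot):
    """Degree-1 truncated power function max(0, x - knot); grows without bound."""
    return max(0.0, x - knot)


def basis_row(x):
    # Hypothetical basis (an assumption): a constant plus two hat functions.
    return [1.0, hat(x, 0.0, 1.0, 2.0), hat(x, 1.0, 2.0, 3.0)]


def local_normal_equations(records):
    """Accumulate B^T B and B^T y over one partition in a single pass."""
    m = len(basis_row(0.0))
    gram = [[0.0] * m for _ in range(m)]
    rhs = [0.0] * m
    for x, y in records:
        b = basis_row(x)
        for i in range(m):
            rhs[i] += b[i] * y
            for j in range(m):
                gram[i][j] += b[i] * b[j]
    return gram, rhs


def combine(a, b):
    """Add the partial sums of two partitions (the reduction step)."""
    gram = [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    rhs = [p + q for p, q in zip(a[1], b[1])]
    return gram, rhs


def partition(records, n_parts):
    """Deal the records out to n_parts partitions in round-robin order."""
    return [records[i::n_parts] for i in range(n_parts)]


# Made-up (x, y) records, for illustration only.
records = [(0.25 * k, 0.5 * k) for k in range(13)]
parts = partition(records, 4)

# Each partition is processed independently, then the results are summed.
total = reduce(combine, map(local_normal_equations, parts))
serial = local_normal_equations(records)
```

Under these assumptions each record is visited once, so the work grows linearly with the number of records, and the combined sums in `total` match the single-pass result in `serial`. Only the small matrices need to be exchanged between partitions, not the records themselves.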