{"title":"云系统的并行大规模属性约简","authors":"Junbo Zhang, Tianrui Li, Yi Pan","doi":"10.1109/PDCAT.2013.36","DOIUrl":null,"url":null,"abstract":"Attribute reduction for big data is viewed as an important preprocessing step in the areas of pattern recognition, machine learning and data mining. In this paper, a novel parallel method based on MapReduce for large-scale attribute reduction is proposed. By using this method, several representative heuristic attribute reduction algorithms in rough set theory have been parallelized. Further, each of the improved parallel algorithms can select the same attribute reduct as its sequential version, therefore, owns the same classification accuracy. An extensive experimental evaluation shows that these parallel algorithms are effective for big data.","PeriodicalId":187974,"journal":{"name":"2013 International Conference on Parallel and Distributed Computing, Applications and Technologies","volume":"10 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2013-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":"{\"title\":\"PLAR: Parallel Large-Scale Attribute Reduction on Cloud Systems\",\"authors\":\"Junbo Zhang, Tianrui Li, Yi Pan\",\"doi\":\"10.1109/PDCAT.2013.36\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Attribute reduction for big data is viewed as an important preprocessing step in the areas of pattern recognition, machine learning and data mining. In this paper, a novel parallel method based on MapReduce for large-scale attribute reduction is proposed. By using this method, several representative heuristic attribute reduction algorithms in rough set theory have been parallelized. Further, each of the improved parallel algorithms can select the same attribute reduct as its sequential version, therefore, owns the same classification accuracy. An extensive experimental evaluation shows that these parallel algorithms are effective for big data.\",\"PeriodicalId\":187974,\"journal\":{\"name\":\"2013 International Conference on Parallel and Distributed Computing, Applications and Technologies\",\"volume\":\"10 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2013-12-16\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"15\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2013 International Conference on Parallel and Distributed Computing, Applications and Technologies\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/PDCAT.2013.36\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2013 International Conference on Parallel and Distributed Computing, Applications and Technologies","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PDCAT.2013.36","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
PLAR: Parallel Large-Scale Attribute Reduction on Cloud Systems
Attribute reduction for big data is viewed as an important preprocessing step in the areas of pattern recognition, machine learning and data mining. In this paper, a novel parallel method based on MapReduce for large-scale attribute reduction is proposed. By using this method, several representative heuristic attribute reduction algorithms in rough set theory have been parallelized. Further, each of the improved parallel algorithms can select the same attribute reduct as its sequential version, therefore, owns the same classification accuracy. An extensive experimental evaluation shows that these parallel algorithms are effective for big data.