Cloud-based classification of text documents using the Gridgain platform
Authors: M. Samovsky, T. Kacur
Venue: 2012 7th IEEE International Symposium on Applied Computational Intelligence and Informatics (SACI)
Publication date: 2012-05-24
DOI: https://doi.org/10.1109/SACI.2012.6250009
The motivation for the research presented in this paper is to use the storage and computational capabilities of cloud computing for text mining tasks. Cloud computing is now a favored approach in data analysis and related fields because it provides data storage and computational capabilities as services. The main aim of our research is to design and develop an experimental cloud platform for text mining tasks. In this paper we describe the design and implementation of a distributed tree-based algorithm for text categorization. We used our own implementation of a decision tree classification algorithm and the Gridgain framework for its cloud implementation. The cloud also provides storage services for handling large data collections and increases computational effectiveness, since the algorithm is implemented in a distributed fashion. We describe the experiments we performed on a private cloud using two datasets and analyze the results.
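The abstract does not say how the decision tree is distributed, so the following is only a rough sketch of one common approach, not the authors' implementation and not the Gridgain API. It assumes documents are represented as sets of terms, that a tree node is split on the presence or absence of a single term, and that candidate terms are scored in parallel and the best one is then selected. Python's thread pool stands in for cloud worker nodes, and all function names are hypothetical.

```python
import math
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def information_gain(docs, labels, term):
    """Gain from splitting the documents on presence/absence of `term`."""
    with_term = [y for d, y in zip(docs, labels) if term in d]
    without_term = [y for d, y in zip(docs, labels) if term not in d]
    total = len(labels)
    remainder = (len(with_term) / total) * entropy(with_term) \
        + (len(without_term) / total) * entropy(without_term)
    return entropy(labels) - remainder


def best_in_chunk(docs, labels, terms):
    """'Map' step: one worker scores its share of the candidate terms."""
    return max((information_gain(docs, labels, t), t) for t in terms)


def best_split_term(docs, labels, workers=2):
    """Pick the term with the highest information gain.

    Assumes at least one document with at least one term.
    Threads are a stand-in for cloud nodes (an assumption of this sketch).
    """
    terms = sorted(set().union(*docs))
    # Round-robin partition of the candidate terms, dropping empty chunks.
    chunks = [c for c in (terms[i::workers] for i in range(workers)) if c]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partial = list(pool.map(
            lambda c: best_in_chunk(docs, labels, c), chunks))
    # 'Reduce' step: keep the best (gain, term) pair across all workers.
    return max(partial)[1]
```

For example, calling `best_split_term` on a handful of labeled term sets returns the term whose presence best separates the classes. Building a full tree would repeat this step recursively on each resulting subset of documents.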