Sheelesh Kumar Sharma, Navel Sharma, Prem Prakash Potter
Fusion Approach for Document Classification using Random Forest and SVM
2020 9th International Conference System Modeling and Advancement in Research Trends (SMART), published 2020-12-04
DOI: 10.1109/SMART50582.2020.9337131 (https://doi.org/10.1109/SMART50582.2020.9337131)
Citations: 1
Abstract
Document classification is an important task because of its many potential applications. With the ever-increasing number of digital documents, it has become imperative to design efficient and accurate methods for document classification. When the categories of documents are already known, the problem can be solved using a supervised learning approach: a system learns the traits of each document category from labeled data and can later be used as a predictor for unseen data. Here, we discuss a fusion supervised learning approach for document classification. The power of Random Forests and Support Vector Machines is harnessed to build a hybrid approach for the task of document categorization. The proposed approach assigns documents to their respective categories and works well on various benchmark datasets such as the 20 Newsgroups, CMU, and Classic data sets.
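The abstract does not state how the Random Forest and SVM outputs are combined. As an illustration only, the sketch below assumes one common fusion scheme, weighted soft voting, in which the per-class probabilities from the two classifiers are averaged and the highest-scoring class is chosen. The function names, the `rf_weight` parameter, and the example probabilities are hypothetical and are not taken from the paper; in practice the inputs might come from something like scikit-learn's `RandomForestClassifier.predict_proba` and `SVC(probability=True).predict_proba`.

```python
from typing import Dict

# Class label -> probability assigned by one classifier to one document.
Probs = Dict[str, float]


def fuse_soft_vote(rf_probs: Probs, svm_probs: Probs, rf_weight: float = 0.5) -> Probs:
    """Weighted average of two per-class probability distributions.

    Assumption: soft voting stands in for the paper's unspecified fusion rule.
    """
    if not 0.0 <= rf_weight <= 1.0:
        raise ValueError("rf_weight must be in [0, 1]")
    # A label missing from one classifier's output counts as probability 0.
    labels = sorted(set(rf_probs) | set(svm_probs))
    return {
        label: rf_weight * rf_probs.get(label, 0.0)
        + (1.0 - rf_weight) * svm_probs.get(label, 0.0)
        for label in labels
    }


def predict_label(fused: Probs) -> str:
    """Return the label with the highest fused probability."""
    return max(fused, key=fused.get)


# Hypothetical classifier outputs for a single document (not results from the paper).
rf_out = {"sci.space": 0.55, "rec.autos": 0.30, "comp.graphics": 0.15}
svm_out = {"sci.space": 0.20, "rec.autos": 0.70, "comp.graphics": 0.10}

fused = fuse_soft_vote(rf_out, svm_out)
print(predict_label(fused))  # rec.autos (fused score 0.50 vs 0.375 for sci.space)
```

With equal weights the SVM's confident vote decides the label; raising `rf_weight` to 0.8 in this toy example flips the decision to `sci.space`, which shows how the weight trades off the two models.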