Research Paper Classification using Supervised Machine Learning Techniques
S. Chowdhury, M. Schoen
2020 Intermountain Engineering, Technology and Computing (IETC), published 2020-10-02
https://doi.org/10.1109/IETC47856.2020.9249211
Citations: 29
Abstract
This work applies several Machine Learning (ML) techniques to the classification of peer-reviewed published content and evaluates their performance. The ultimate objective is to extract meaningful information from published abstracts. To that end, the ML techniques are used to classify publications into three fields: Science, Business, and Social Science. The techniques applied are Support Vector Machines, Naïve Bayes, K-Nearest Neighbor, and Decision Tree. In addition to describing these algorithms, the paper provides the methodology and algorithms for text recognition using them. A comparative study based on four different performance measures suggests that, with the exception of the Decision Tree algorithm, the proposed ML techniques combined with the detailed pre-processing algorithms work well for classifying publications into categories based on the text of the abstract.
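As an illustration of one of the four techniques named in the abstract, the following is a minimal sketch of a multinomial Naïve Bayes text classifier with add-one smoothing, written in plain Python. It is not the authors' implementation: the stop-word list, the tokenizer, the class name NaiveBayesTextClassifier, and the six toy training sentences are all assumptions invented for this example, since the abstract does not specify the pre-processing steps or the data set.

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical stop-word list; the paper's actual pre-processing is not
# described in the abstract.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to", "is", "are", "with"}


def tokenize(text):
    """Lowercase the text, keep alphabetic tokens, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_doc_counts = Counter(labels)   # documents per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # Tokens never seen in training carry no information and are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocabulary]
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, float("-inf")
        for label in sorted(self.class_doc_counts):
            # Log prior: share of training documents belonging to this class.
            score = math.log(self.class_doc_counts[label] / self.total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                # Smoothed log likelihood of the token given the class.
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration only (not from the paper).
TRAIN = [
    ("Quantum particle collisions measured in the laboratory experiment", "Science"),
    ("Protein folding and enzyme reactions in living cells", "Science"),
    ("Market revenue growth and corporate investment strategy", "Business"),
    ("Supply chain management and customer profit margins", "Business"),
    ("Community attitudes toward education and social inequality", "Social Science"),
    ("Survey of voter behavior and cultural identity", "Social Science"),
]

model = NaiveBayesTextClassifier().fit(
    [text for text, _ in TRAIN], [label for _, label in TRAIN]
)
print(model.predict("corporate profit and market strategy"))
```

With real abstracts, the same fit and predict steps would run on a much larger labeled corpus, and the predictions on a held-out set would be scored with whichever performance measures the study uses.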