CLASSIFICATION OF ARTICLES USING MACHINE LEARNING: CASE STUDY OF TRA VINH UNIVERSITY JOURNAL OF SCIENCE, VIETNAM
Authors: Nghe Thai Nguyen, Nhut Minh Hua, An Bao Nguyen
Journal: Tra Vinh University Journal of Science (ISSN: 2815-6072; E-ISSN: 2815-6099)
Published: 2023-07-20
DOI: 10.35382/tvujs.13.6.2023.2108
Abstract
The rapid development of technology has led to a growing number of research manuscripts submitted to journals and conferences. Submitting an article can be challenging for authors, however, because submission systems cover a wide range of subjects; that of the Association for Computing Machinery, for example, lists 2,000 subjects. The difficulty lies in accurately assigning the manuscript to the appropriate subject area before submission. To address this issue, this article proposes an automatic solution that extracts information from scientific papers and categorizes them into relevant topics. The proposed approach combines pre-processing, extraction, vectorization, and classification, using three machine learning methods: support vector machines, Naïve Bayes, and decision trees. Experiments on a dataset of articles published in the Tra Vinh University Journal of Science show promising results. Support vector machines, in particular, achieved an accuracy of over 75%, demonstrating their potential as a basis for an automatic classification system for scientific papers.
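As a rough illustration of the pre-processing, vectorization, and classification steps named in the abstract, the sketch below implements a small multinomial Naïve Bayes text classifier in plain Python. It is not the authors' implementation: the tokenizer, the bag-of-words counts, the add-one smoothing, the toy training sentences, and the two topic labels are all assumptions chosen for brevity, and the abstract does not specify which features or vectorization scheme the study used. Support vector machines and decision trees, the other two methods mentioned, are not shown.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Pre-processing (assumed): lowercase and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClassifier:
    """Multinomial Naive Bayes over bag-of-words counts, add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)      # documents per topic
        self.word_counts = defaultdict(Counter)  # token counts per topic
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        self.total_docs = len(labels)
        return self

    def predict(self, text):
        # Tokens never seen in training carry no evidence and are skipped.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / self.total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for t in tokens:
                score += math.log((self.word_counts[label][t] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data; the real study used articles from the journal.
train_texts = [
    "rice yield and soil nutrients in the mekong delta",
    "shrimp farming water quality and disease",
    "neural network model for image classification",
    "machine learning algorithm for text data",
]
train_labels = ["agriculture", "agriculture", "computing", "computing"]

clf = NaiveBayesClassifier().fit(train_texts, train_labels)
prediction = clf.predict("A learning algorithm for classification of text")
print(prediction)
```

With this toy data the query shares several tokens with the "computing" examples and none with the "agriculture" ones, so the classifier assigns it to "computing". A realistic system would train on titles, abstracts, or keywords from many labeled articles and would measure accuracy on held-out data, as the experiments summarized above do.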