The Algorithm of Document Classification of Research and Education Institution Using Machine Learning Methods
M. Krasnyanskiy, A. Obukhov, E. M. Solomatina
2019 International Science and Technology Conference "EastСonf", published 2019-03-01
DOI: 10.1109/EASTСONF.2019.8725319
Citation count: 3
Abstract
Many classification technologies based on machine learning and artificial intelligence now exist. However, a sufficient theoretical basis for integrating existing classification methods to analyze the documents of scientific and educational institutions has not yet been developed. This article presents a document classification algorithm that takes into account the specifics of documents in the subject area of scientific and educational institutions. It also presents a system of characteristics by which documents can be grouped to solve the problem of combined classification. The article further considers a text preprocessing approach that allows well-known machine learning methods to be used to improve the accuracy and speed of document classification. The results can be applied to document classification in the electronic document management systems of scientific and educational institutions.