Neural Embedding & Hybrid ML Models for Text Classification
Mariem Bounabi, K. E. Moutaouakil, K. Satori
2020 1st International Conference on Innovative Research in Applied Science, Engineering and Technology (IRASET), published 2020-04-01
DOI: 10.1109/IRASET48871.2020.9092230 (https://doi.org/10.1109/IRASET48871.2020.9092230)
Citations: 2
Abstract
Knowledge representation remains a challenge for machine learning (ML) models. The Paragraph Vector is a current method for embedding text, and many parameters govern the usefulness of the resulting representation. In this context, we study the effect on text classification of Paragraph Vector-Distributed Memory (PV-DM), a variant of doc2vec. For comparison, we also evaluate classification systems based on other doc2vec variants, together with a collection of commonly used classifiers. We then incorporate hybrid ML methods to improve classification quality. Experiments on a benchmark dataset show strong results: the system based on PV-DM with the average method, using majority voting as the classifier, reaches 99% accuracy.
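
The two ingredients named in the abstract, the PV-DM "average" combination and majority voting over several classifiers, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `average_vectors` and `majority_vote`, the toy labels, and the tie-breaking rule are assumptions made for illustration, and the abstract does not say which classifiers were combined.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def average_vectors(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equal-length vectors.

    In PV-DM with averaging, the paragraph vector and the context word
    vectors are combined this way before predicting the next word.
    """
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same length")
    n = len(vectors)
    return [sum(v[k] for v in vectors) / n for k in range(dim)]


def majority_vote(predictions: Sequence[Sequence[Hashable]]) -> List[Hashable]:
    """Hard-voting ensemble.

    predictions[i][j] is the label that classifier i assigns to document j.
    Returns, for each document, the label chosen by the most classifiers.
    Assumption: ties go to the label seen first in classifier order.
    """
    if not predictions:
        raise ValueError("need predictions from at least one classifier")
    n_docs = len(predictions[0])
    if any(len(p) != n_docs for p in predictions):
        raise ValueError("every classifier must label every document")
    voted = []
    for doc_votes in zip(*predictions):
        # most_common(1) returns the (label, count) pair with the top count.
        voted.append(Counter(doc_votes).most_common(1)[0][0])
    return voted


if __name__ == "__main__":
    # Hypothetical outputs of three classifiers on three documents.
    preds = [
        ["sport", "tech", "tech"],
        ["sport", "sport", "tech"],
        ["tech", "sport", "tech"],
    ]
    print(majority_vote(preds))
    print(average_vectors([[1.0, 2.0], [3.0, 4.0]]))
```

In practice the document vectors would come from a trained doc2vec model. In gensim, for example, PV-DM with averaging is, as far as the library documentation indicates, selected with `dm=1` and `dm_mean=1` on `Doc2Vec`. The voting step is independent of how the vectors are produced.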