Neural Embedding & Hybrid ML Models for Text Classification
Mariem Bounabi, K. E. Moutaouakil, K. Satori
2020 1st International Conference on Innovative Research in Applied Science, Engineering and Technology (IRASET), 2020-04-01
DOI: 10.1109/IRASET48871.2020.9092230
Knowledge representation remains a challenge for machine learning (ML) models. The Paragraph Vector is one of the current methods for text embedding, and many parameters govern the quality of the resulting representation. In this context, we study the effect of Paragraph Vector-Distributed Memory (PV-DM), a variant of doc2vec, on text classification. For comparison, we also evaluate classification systems based on other doc2vec variants, together with a collection of classifiers in current use. We then incorporate hybrid ML methods to improve classification quality. Experiments on a benchmark dataset show that the system based on PV-DM with the averaging method, using majority voting as the classifier, reaches 99% accuracy.
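The abstract names two building blocks without detailing them: the PV-DM averaging method and majority voting. The sketch below illustrates both in plain Python, under stated assumptions. It assumes that "average" means the element-wise mean of the paragraph vector and the context word vectors, and that majority voting means hard voting over the labels predicted by several classifiers. The function names `average_vectors` and `majority_vote`, the toy vectors, and the labels are hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def average_vectors(vectors: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of equal-length vectors.

    Assumed reading of the PV-DM "average" option: the paragraph vector and
    the context word vectors are averaged instead of concatenated.
    """
    if not vectors:
        raise ValueError("need at least one vector")
    dim = len(vectors[0])
    if any(len(v) != dim for v in vectors):
        raise ValueError("all vectors must have the same length")
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def majority_vote(predictions: Sequence[Hashable]) -> Hashable:
    """Return the label predicted most often (hard voting).

    Ties go to the label that appears first in `predictions`.
    """
    if not predictions:
        raise ValueError("need at least one prediction")
    return Counter(predictions).most_common(1)[0][0]


# Toy data: one paragraph vector plus two context word vectors.
paragraph_vector = [1.0, 2.0]
context_vectors = [[3.0, 4.0], [5.0, 6.0]]
combined = average_vectors([paragraph_vector, *context_vectors])

# Toy labels from three hypothetical base classifiers for one document.
label = majority_vote(["sport", "politics", "sport"])
```

In a hybrid system of this kind, each base classifier would receive the same document embedding, and the voted label would be the final prediction. The paper's actual classifiers, dataset, and hyperparameters are not given in the abstract, so none are assumed here.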