Neural Embedding & Hybrid ML Models for Text Classification

Mariem Bounabi, K. E. Moutaouakil, K. Satori
Published in: 2020 1st International Conference on Innovative Research in Applied Science, Engineering and Technology (IRASET)
Publication date: 2020-04-01
DOI: 10.1109/IRASET48871.2020.9092230
URL: https://doi.org/10.1109/IRASET48871.2020.9092230
Citations: 2

Abstract

Knowledge representation remains an open problem for machine learning (ML) models. The Paragraph Vector is one of the current methods for embedding text, and many parameters govern the usefulness of the resulting representation. In this context, we study the effect on text classification of Paragraph Vector-Distributed Memory (PV-DM), a variant of doc2vec. For comparison, we evaluate other classification systems based on doc2vec variants, together with a collection of classifiers in current use. We then incorporate hybrid ML methods to improve classification quality. Experiments on a benchmark dataset show that the system based on PV-DM with the average method, with majority voting as the classifier, reaches 99% accuracy.
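The abstract names two mechanisms without spelling them out: the "average method" of PV-DM and majority voting over several classifiers. The following is a minimal sketch of both ideas in plain Python, not the authors' implementation. It assumes that "average" means the paragraph vector and the context word vectors are averaged (rather than concatenated) to form the input used to predict the target word, and that majority voting is hard voting over predicted labels. The function names `average_input` and `majority_vote` and the toy labels are hypothetical.

```python
from collections import Counter


def average_input(paragraph_vec, context_word_vecs):
    """Combine a paragraph vector with its context word vectors by averaging.

    Assumption: this is the PV-DM 'average' variant, as opposed to
    concatenating the vectors. All vectors must share one dimension.
    """
    vectors = [paragraph_vec] + list(context_word_vecs)
    dim = len(paragraph_vec)
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def majority_vote(predictions):
    """Hard voting over classifiers.

    `predictions` holds one list of labels per classifier, aligned by
    document. For each document the most frequent label wins; on a tie,
    the label that appears first (earliest classifier) is kept.
    """
    voted = []
    for labels in zip(*predictions):
        voted.append(Counter(labels).most_common(1)[0][0])
    return voted


# Toy usage with three hypothetical classifiers and three documents.
preds_a = ["sport", "tech", "tech"]
preds_b = ["sport", "sport", "tech"]
preds_c = ["tech", "sport", "tech"]
ensemble = majority_vote([preds_a, preds_b, preds_c])

# One paragraph vector averaged with two context word vectors.
combined = average_input([1.0, 2.0], [[3.0, 4.0], [5.0, 6.0]])
```

In practice the averaged vector would feed a softmax layer trained to predict the next word, and the learned paragraph vectors would then serve as document features for the individual classifiers whose outputs are voted on. The abstract does not state which classifiers or hyperparameters were used.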
Latest articles from this venue
Conception of a Training System for Emergency Situation Managers
Optimization by the Response Surface Methodology of Color
Optimal Control of Wind Energy Generation System
Synthesis and Characterisation of Anhydrous Proton Conducting Membranes Based on Sulfonated Poly(vinyl alcohol) and Silicotungstic Acid with or without Silica for Fuel Cell Applications
Towards a behavioral network intrusion detection system based on the SVM model