A Comparative Assessment of Various Embeddings for Keyword Extraction

Ghaith Ashqar, Alev Mutlu
DOI: 10.1109/HORA58378.2023.10156762
Published in: 2023 5th International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA)
Publication date: 2023-06-08
URL: https://doi.org/10.1109/HORA58378.2023.10156762

Abstract

Automatic keyword extraction from a text document is the problem of identifying the in-text words or phrases that best describe the content of the document. Recently, word embeddings have found application in keyword extraction, as they improve performance by incorporating semantic information. In this study, we focus on various embeddings and compare their performance in keyword extraction. To this end, we first modified a keyword extraction system called KeyBERT to work with different embeddings. We then ran the modified application using ten models on seven benchmark datasets. The experimental findings show that all-mpnet-base-v2 achieved statistically better results than the other models in precision, recall, and F1 score. Moreover, all-mpnet-base-v2 achieved the highest scores for MAP and MRR and also retrieved the largest number of relevant keywords on average.
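
The abstract names the evaluation measures (precision, recall, F1, MAP, and MRR) but does not give their exact definitions. As a rough illustration only, the sketch below computes them for ranked keyword lists using standard textbook definitions. The function names, the case-insensitive exact-match rule, and the choice to normalize average precision by the number of gold keywords are all assumptions made for this example, not details taken from the paper.

```python
from typing import List, Sequence, Set, Tuple


def _normalize(items) -> List[str]:
    # Assumption: keywords match if they are equal after lowercasing and trimming.
    return [item.strip().lower() for item in items]


def precision_recall_f1(predicted: Sequence[str], gold: Set[str]) -> Tuple[float, float, float]:
    """Exact-match precision, recall, and F1 for one document."""
    pred = set(_normalize(predicted))
    ref = set(_normalize(gold))
    hits = len(pred & ref)
    precision = hits / len(pred) if pred else 0.0
    recall = hits / len(ref) if ref else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def average_precision(predicted: Sequence[str], gold: Set[str]) -> float:
    """Average precision of a ranked list.

    Assumption: the sum is divided by the number of gold keywords;
    other conventions divide by the number of relevant items retrieved.
    """
    ref = set(_normalize(gold))
    seen = set()
    hits = 0
    total = 0.0
    for rank, keyword in enumerate(_normalize(predicted), start=1):
        if keyword in ref and keyword not in seen:
            seen.add(keyword)
            hits += 1
            total += hits / rank  # precision at this rank
    return total / len(ref) if ref else 0.0


def reciprocal_rank(predicted: Sequence[str], gold: Set[str]) -> float:
    """1 / rank of the first relevant keyword, or 0.0 if none is relevant."""
    ref = set(_normalize(gold))
    for rank, keyword in enumerate(_normalize(predicted), start=1):
        if keyword in ref:
            return 1.0 / rank
    return 0.0


def mean_over_documents(metric, predictions, golds) -> float:
    """Average a per-document metric over a dataset (gives MAP or MRR)."""
    scores = [metric(p, g) for p, g in zip(predictions, golds)]
    return sum(scores) / len(scores) if scores else 0.0


# Toy data, invented for illustration only.
predictions = [
    ["keyword extraction", "embeddings", "bert", "datasets"],
    ["a", "b"],
]
golds = [
    {"keyword extraction", "bert", "evaluation"},
    {"b"},
]

map_score = mean_over_documents(average_precision, predictions, golds)
mrr_score = mean_over_documents(reciprocal_rank, predictions, golds)
```

In a comparison like the one described, `predictions` would hold the ranked keywords returned by each embedding model for each document, and the per-model averages would then be compared across datasets.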