A Keyphrase Extraction Method Based on Multi-feature Evaluation and Mask Mechanism

Liwen Ma, Weifeng Liu
Published in: 2022 11th International Conference on Control, Automation and Information Sciences (ICCAIS)
Publication date: 2022-11-21
DOI: 10.1109/ICCAIS56082.2022.9990092 (https://doi.org/10.1109/ICCAIS56082.2022.9990092)

Abstract

Keyphrase extraction aims to identify the phrases in a document that capture its core content. However, existing unsupervised keyphrase extraction models are limited to a single feature, which leads to biased results. To address this problem, the proposed method scores candidate keyphrases using three features: semantic importance, topic diversity, and position. First, each candidate keyphrase is masked in the document, and the Manhattan distance between the masked document and the original document is taken as the semantic importance feature. Second, the topic-word distribution of the candidate keyphrases is computed as the topic diversity feature, and the position features are calculated. Finally, the phrase importance score is obtained by integrating the three sub-models. In experiments on three academic datasets, the method is compared with six state-of-the-art baseline models and outperforms them. The results show that evaluating phrase importance from multiple features significantly improves keyphrase extraction performance.
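
The mask-and-compare step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say which document encoder, position formula, or combination rule is used, so the bag-of-words `embed` function, the reciprocal-rank `position_score`, and the product in `phrase_score` are all assumptions chosen only to keep the example self-contained. The topic diversity feature is passed in as a plain number (`topic_score`, a hypothetical parameter), since computing a topic-word distribution would need a topic model that is out of scope here.

```python
from collections import Counter

MASK = "[MASK]"  # placeholder token; the actual mask symbol is an assumption


def tokenize(text):
    # Lowercase whitespace tokenization, enough for a toy example.
    return text.lower().split()


def embed(tokens, vocab):
    # Assumption: a term-count vector stands in for a real document encoder.
    counts = Counter(tokens)
    return [counts[word] for word in vocab]


def manhattan(u, v):
    # Manhattan (L1) distance between two equal-length vectors.
    return sum(abs(a - b) for a, b in zip(u, v))


def mask_phrase(tokens, phrase):
    # Replace every occurrence of the candidate phrase with mask tokens.
    n, out, i = len(phrase), [], 0
    while i < len(tokens):
        if tokens[i:i + n] == phrase:
            out.extend([MASK] * n)
            i += n
        else:
            out.append(tokens[i])
            i += 1
    return out


def semantic_importance(tokens, phrase, vocab):
    # Distance between the original document and the masked document.
    masked = mask_phrase(tokens, phrase)
    return manhattan(embed(tokens, vocab), embed(masked, vocab))


def position_score(tokens, phrase):
    # Assumption: reciprocal of the 1-based index of the first occurrence.
    n = len(phrase)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == phrase:
            return 1.0 / (i + 1)
    return 0.0


def phrase_score(tokens, phrase, vocab, topic_score=1.0):
    # Assumption: the three features are combined by a simple product.
    return (semantic_importance(tokens, phrase, vocab)
            * topic_score
            * position_score(tokens, phrase))


doc = tokenize("Keyphrase extraction finds core phrases . "
               "Keyphrase extraction is unsupervised")
vocab = sorted(set(doc))
for candidate in (["keyphrase", "extraction"], ["core", "phrases"]):
    print(" ".join(candidate), phrase_score(doc, candidate, vocab))
```

In this toy setup, a phrase that occurs often and early changes the document vector more when masked and so receives a higher score. With a contextual encoder in place of `embed`, the distance would reflect semantic change rather than raw counts.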