词义归纳与聚类和互信息最大化

Hadi Abdine , Moussa Kamal Eddine , Davide Buscaldi , Michalis Vazirgiannis
{"title":"词义归纳与聚类和互信息最大化","authors":"Hadi Abdine ,&nbsp;Moussa Kamal Eddine ,&nbsp;Davide Buscaldi ,&nbsp;Michalis Vazirgiannis","doi":"10.1016/j.aiopen.2023.12.001","DOIUrl":null,"url":null,"abstract":"<div><p>Word sense induction (WSI) is a challenging problem in natural language processing that involves the unsupervised automatic detection of a word’s senses (i.e., meanings). Recent work achieves significant results on the WSI task by pre-training a language model that can exclusively disambiguate word senses. In contrast, others employ off-the-shelf pre-trained language models with additional strategies to induce senses. This paper proposes a novel unsupervised method based on hierarchical clustering and invariant information clustering (IIC). The IIC loss is used to train a small model to optimize the mutual information between two vector representations of a target word occurring in a pair of synthetic paraphrases. This model is later used in inference mode to extract a higher-quality vector representation to be used in the hierarchical clustering. We evaluate our method on two WSI tasks and in two distinct clustering configurations (fixed and dynamic number of clusters). We empirically show that our approach is at least on par with the state-of-the-art baselines, outperforming them in several configurations. The code and data to reproduce this work are available to the public<span><sup>1</sup></span>.</p></div>","PeriodicalId":100068,"journal":{"name":"AI Open","volume":"4 ","pages":"Pages 193-201"},"PeriodicalIF":0.0000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.sciencedirect.com/science/article/pii/S2666651023000232/pdfft?md5=a0553e94f2fab365fb751bcc0ddf8e6c&pid=1-s2.0-S2666651023000232-main.pdf","citationCount":"0","resultStr":"{\"title\":\"Word sense induction with agglomerative clustering and mutual information maximization\",\"authors\":\"Hadi Abdine ,&nbsp;Moussa Kamal Eddine ,&nbsp;Davide Buscaldi ,&nbsp;Michalis Vazirgiannis\",\"doi\":\"10.1016/j.aiopen.2023.12.001\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<div><p>Word sense induction (WSI) is a challenging problem in natural language processing that involves the unsupervised automatic detection of a word’s senses (i.e., meanings). Recent work achieves significant results on the WSI task by pre-training a language model that can exclusively disambiguate word senses. In contrast, others employ off-the-shelf pre-trained language models with additional strategies to induce senses. This paper proposes a novel unsupervised method based on hierarchical clustering and invariant information clustering (IIC). The IIC loss is used to train a small model to optimize the mutual information between two vector representations of a target word occurring in a pair of synthetic paraphrases. This model is later used in inference mode to extract a higher-quality vector representation to be used in the hierarchical clustering. We evaluate our method on two WSI tasks and in two distinct clustering configurations (fixed and dynamic number of clusters). We empirically show that our approach is at least on par with the state-of-the-art baselines, outperforming them in several configurations. The code and data to reproduce this work are available to the public<span><sup>1</sup></span>.</p></div>\",\"PeriodicalId\":100068,\"journal\":{\"name\":\"AI Open\",\"volume\":\"4 \",\"pages\":\"Pages 193-201\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2023-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.sciencedirect.com/science/article/pii/S2666651023000232/pdfft?md5=a0553e94f2fab365fb751bcc0ddf8e6c&pid=1-s2.0-S2666651023000232-main.pdf\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"AI Open\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://www.sciencedirect.com/science/article/pii/S2666651023000232\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"AI Open","FirstCategoryId":"1085","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S2666651023000232","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0

摘要

词义归纳(WSI)是自然语言处理中一个具有挑战性的问题,它涉及在无监督的情况下自动检测一个词的词义(即含义)。最近的研究通过预训练一个语言模型,该模型可以专门用于词义消歧,从而在词义归纳任务中取得了显著的成果。与此相反,其他研究则采用现成的预训练语言模型,并增加了诱导词义的策略。本文提出了一种基于分层聚类和不变信息聚类(IIC)的新型无监督方法。IIC 损失用于训练一个小型模型,以优化一对合成意译中出现的目标词的两个向量表示之间的互信息。该模型随后将用于推理模式,以提取更高质量的向量表示,用于分层聚类。我们在两个 WSI 任务和两种不同的聚类配置(固定聚类数和动态聚类数)中对我们的方法进行了评估。我们的经验表明,我们的方法至少与最先进的基线方法不相上下,在几种配置中的表现都优于它们。重现这项工作的代码和数据可公开获取1。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Word sense induction with agglomerative clustering and mutual information maximization

Word sense induction (WSI) is a challenging problem in natural language processing that involves the unsupervised automatic detection of a word’s senses (i.e., meanings). Recent work achieves significant results on the WSI task by pre-training a language model that can exclusively disambiguate word senses. In contrast, others employ off-the-shelf pre-trained language models with additional strategies to induce senses. This paper proposes a novel unsupervised method based on hierarchical clustering and invariant information clustering (IIC). The IIC loss is used to train a small model to optimize the mutual information between two vector representations of a target word occurring in a pair of synthetic paraphrases. This model is later used in inference mode to extract a higher-quality vector representation to be used in the hierarchical clustering. We evaluate our method on two WSI tasks and in two distinct clustering configurations (fixed and dynamic number of clusters). We empirically show that our approach is at least on par with the state-of-the-art baselines, outperforming them in several configurations. The code and data to reproduce this work are available to the public1.

求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
CiteScore
45.00
自引率
0.00%
发文量
0
期刊最新文献
GPT understands, too Adaptive negative representations for graph contrastive learning PM2.5 forecasting under distribution shift: A graph learning approach Enhancing neural network classification using fractional-order activation functions CPT: Colorful Prompt Tuning for pre-trained vision-language models
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1