MEDs for PETs: Multilingual Euphemism Disambiguation for Potentially Euphemistic Terms

Findings. Pub Date: 2024-01-25. DOI: 10.48550/arXiv.2401.14526
Patrick Lee, Alain Chirino Trujillo, Diana Cuevas Plancarte, O. E. Ojo, Xinyi Liu, Iyanuoluwa Shode, Yuan Zhao, Jing Peng, Anna Feldman

Abstract

Euphemisms are found across the world’s languages, making them a universal linguistic phenomenon. As such, euphemistic data may have useful properties for computational tasks across languages. In this study, we explore this premise by training a multilingual transformer model (XLM-RoBERTa) to disambiguate potentially euphemistic terms (PETs) in multilingual and cross-lingual settings. In line with current trends, we demonstrate that zero-shot learning across languages takes place. We also show cases where multilingual models perform better on the task compared to monolingual models by a statistically significant margin, indicating that multilingual data presents additional opportunities for models to learn about cross-lingual, computational properties of euphemisms. In a follow-up analysis, we focus on universal euphemistic “categories” such as death and bodily functions among others. We test to see whether cross-lingual data of the same domain is more important than within-language data of other domains to further understand the nature of the cross-lingual transfer.
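
The training conditions contrasted in the abstract can be pictured as different ways of selecting training data for one test target. The following is only a minimal sketch, not the authors' pipeline: the `Example` fields, the `build_train_sets` helper, the language codes, the category names, and the toy sentences are all hypothetical placeholders, and no model is trained here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    lang: str      # placeholder language code, e.g. "en"
    category: str  # placeholder euphemism category, e.g. "death"
    text: str      # sentence containing a potentially euphemistic term (PET)
    label: int     # 1 = euphemistic use, 0 = literal use (toy labels)


def build_train_sets(pool, target_lang, target_category):
    """Select candidate training sets for one (language, category) test target."""
    return {
        # Only target-language data: the monolingual baseline.
        "monolingual": [e for e in pool if e.lang == target_lang],
        # Data from every language pooled together.
        "multilingual": list(pool),
        # No target-language data at all: the zero-shot cross-lingual setting.
        "zero_shot": [e for e in pool if e.lang != target_lang],
        # Follow-up comparison, option A: other languages, same category.
        "cross_lingual_same_category": [
            e for e in pool
            if e.lang != target_lang and e.category == target_category
        ],
        # Follow-up comparison, option B: same language, other categories.
        "within_language_other_categories": [
            e for e in pool
            if e.lang == target_lang and e.category != target_category
        ],
    }


# Invented toy pool, for illustration only.
pool = [
    Example("en", "death", "Her grandfather passed away in March.", 1),
    Example("en", "bodily functions", "He went to the bathroom.", 1),
    Example("es", "death", "Su abuelo pasó a mejor vida.", 1),
    Example("es", "bodily functions", "Fue al baño.", 1),
]

sets = build_train_sets(pool, target_lang="en", target_category="death")
for name, examples in sets.items():
    print(name, len(examples))
```

Under this framing, each selected set would be used to fine-tune a separate classifier, and all classifiers would be evaluated on the same held-out target-language, target-category test data, so that differences in score reflect the choice of training data.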