“Somewhere along your pedigree, a bitch got over the wall!” A proposal of implicitly offensive language typology

Lodz Papers in Pragmatics (Q2, Arts and Humanities) · Pub Date: 2023-12-01 · DOI: 10.1515/lpp-2023-0019

Kristina Š. Despot, A. Anić, Tony Veale

Lodz Papers in Pragmatics, pp. 385-414. Journal Article. https://doi.org/10.1515/lpp-2023-0019

Abstract

The automatic detection of implicitly offensive language is a challenge for NLP, as such language is subtle, contextual, and plausibly deniable, but it is becoming increasingly important with the wider use of large language models to generate human-quality texts. This study argues that current difficulties in detecting implicit offence are exacerbated by multiple factors: (a) inadequate definitions of implicit and explicit offence; (b) an insufficient typology of implicit offence; and (c) a dearth of detailed analysis of implicitly offensive linguistic data. Based on a qualitative analysis of an implicitly offensive dataset, the study proposes a new typology of implicitly offensive language, along with a detailed, example-led account of that typology, an operational definition of implicitly offensive language, and a thorough analysis of the role of figurative language and humour in each type. Our analyses identify three main issues with previous datasets and typologies used in NLP approaches: (a) conflating content and form in annotation; (b) treating figurativeness, particularly metaphor, as the main device of implicitness, while ignoring its equally important role in explicit offence; and (c) an over-focus on form-specific datasets (e.g. focusing only on offensive comparisons), which fails to reflect the full complexity of offensive language use.

Source journal

Lodz Papers in Pragmatics Arts and Humanities-Language and Linguistics

CiteScore

1.10

Self-citation rate

0.00%

Number of articles published