Proceedings of COLING. International Conference on Computational Linguistics: Latest Articles

Unsupervised Lexical Substitution with Decontextualised Embeddings

Proceedings of COLING. International Conference on Computational Linguistics

Pub Date : 2022-09-17 DOI: 10.48550/arXiv.2209.08236

Takashi Wada, Timothy Baldwin, Yuji Matsumoto, Jey Han Lau

We propose a new unsupervised method for lexical substitution using pre-trained language models. Compared to previous approaches that use the generative capability of language models to predict substitutes, our method retrieves substitutes based on the similarity of contextualised and decontextualised word embeddings, i.e. the average contextual representation of a word in multiple contexts. We conduct experiments in English and Italian, and show that our method substantially outperforms strong baselines and establishes a new state-of-the-art without any explicit supervision or fine-tuning. We further show that our method performs particularly well at predicting low-frequency substitutes, and also generates a diverse list of substitute candidates, reducing morphophonetic or morphosyntactic biases induced by article-noun agreement.
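The retrieval idea in this abstract can be illustrated with a minimal sketch. It assumes that a "decontextualised" embedding is the plain mean of a word's contextual vectors and that candidates are ranked by cosine similarity to the target word's contextual vector. The function names and the toy 3-dimensional vectors are hypothetical and stand in for real language-model outputs; the paper's actual scoring may differ.

```python
import math

def mean_vector(vectors):
    """Average several contextual vectors into one decontextualised vector."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def rank_substitutes(target_vector, candidate_contexts):
    """Rank candidate words by similarity to the target's contextual vector.

    candidate_contexts maps each candidate word to a list of its
    contextual vectors collected from several sentences (toy data here).
    """
    scores = {
        word: cosine(target_vector, mean_vector(vectors))
        for word, vectors in candidate_contexts.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Toy, hand-made vectors; a real system would take them from a language model.
target = [1.0, 0.2, 0.0]
candidates = {
    "big": [[0.9, 0.1, 0.0], [1.0, 0.3, 0.1]],
    "red": [[0.0, 1.0, 0.2], [0.1, 0.9, 0.0]],
}
ranking = rank_substitutes(target, candidates)
print(ranking)  # ['big', 'red']
```

Because the candidates are retrieved by similarity rather than generated, the candidate list in this sketch is fixed in advance; how the vocabulary of candidates is chosen is outside what the abstract states.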

Citations: 4

ConFiguRe: Exploring Discourse-level Chinese Figures of Speech

Proceedings of COLING. International Conference on Computational Linguistics

Pub Date : 2022-09-16 DOI: 10.48550/arXiv.2209.07678

Dawei Zhu, Qiusi Zhan, Zhejian Zhou, Yifan Song, Jiebin Zhang, Sujian Li

Figures of speech, such as metaphor and irony, are ubiquitous in literary works and colloquial conversations. This poses a great challenge for natural language understanding, since figures of speech usually deviate from their ostensible meanings to express deeper semantic implications. Previous research emphasizes the literary aspect of figures and seldom provides a comprehensive exploration from the viewpoint of computational linguistics. In this paper, we first propose the concept of a figurative unit, which is the carrier of a figure. Then we select 12 types of figures commonly used in Chinese and build a Chinese corpus for Contextualized Figure Recognition (ConFiguRe). Unlike previous token-level or sentence-level counterparts, ConFiguRe aims at extracting a figurative unit from discourse-level context and classifying the figurative unit into the right figure type. On ConFiguRe, three tasks, i.e., figure extraction, figure type classification and figure recognition, are designed, and state-of-the-art techniques are used to implement the benchmarks. We conduct thorough experiments and show that all three tasks are challenging for existing models, thus requiring further research. Our dataset and code are publicly available at https://github.com/pku-tangent/ConFiguRe.

Pages: 3374-3385

Citations: 0

Possible Stories: Evaluating Situated Commonsense Reasoning under Multiple Possible Scenarios

Proceedings of COLING. International Conference on Computational Linguistics

Pub Date : 2022-09-16 DOI: 10.48550/arXiv.2209.07760

Mana Ashida, Saku Sugawara

The possible consequences for the same context may vary depending on the situation we refer to. However, current studies in natural language processing do not focus on situated commonsense reasoning under multiple possible scenarios. This study frames this task by asking multiple questions with the same set of possible endings as candidate answers, given a short story text. Our resulting dataset, Possible Stories, consists of more than 4.5K questions over 1.3K story texts in English. We discover that even current strong pretrained language models struggle to answer the questions consistently, highlighting that the highest accuracy in an unsupervised setting (60.2%) is far behind human accuracy (92.5%). Through a comparison with existing datasets, we observe that the questions in our dataset contain minimal annotation artifacts in the answer options. In addition, our dataset includes examples that require counterfactual reasoning, as well as those requiring readers’ reactions and fictional information, suggesting that our dataset can serve as a challenging testbed for future studies on situated commonsense reasoning.

Pages: 3606-3630

Citations: 3

A Multi-turn Machine Reading Comprehension Framework with Rethink Mechanism for Emotion-Cause Pair Extraction

Proceedings of COLING. International Conference on Computational Linguistics

Pub Date : 2022-09-16 DOI: 10.48550/arXiv.2209.07972

Chang Zhou, Dandan Song, Jing Xu, Zhijing Wu

Emotion-cause pair extraction (ECPE) is an emerging task in emotion cause analysis, which extracts potential emotion-cause pairs from an emotional document. Most recent studies use end-to-end methods to tackle the ECPE task. However, these methods either suffer from a label sparsity problem or fail to model complicated relations between emotions and causes. Furthermore, none of them considers the explicit semantic information of clauses. To this end, we transform the ECPE task into a document-level machine reading comprehension (MRC) task and propose a Multi-turn MRC framework with Rethink mechanism (MM-R). Our framework can model complicated relations between emotions and causes while avoiding generating the pairing matrix (the leading cause of the label sparsity problem). In addition, the multi-turn structure can fuse the explicit semantic information flow between emotions and causes. Extensive experiments on the benchmark emotion cause corpus demonstrate the effectiveness of our proposed framework, which outperforms existing state-of-the-art methods.

Citations: 3

Less Is Better: Recovering Intended-Feature Subspace to Robustify NLU Models

Proceedings of COLING. International Conference on Computational Linguistics

Pub Date : 2022-09-16 DOI: 10.48550/arXiv.2209.07879

Ting Wu, Tao Gui

Datasets with significant proportions of bias present threats for training a trustworthy model on NLU tasks. Despite yielding great progress, current debiasing methods rely excessively on knowledge of bias attributes. The definition of these attributes, however, is elusive and varies across datasets. In addition, leveraging these attributes at the input level for bias mitigation may leave a gap between intrinsic properties and the underlying decision rule. To narrow this gap and remove the need for supervision on bias, we suggest extending bias mitigation into feature space. Therefore, a novel model, Recovering Intended-Feature Subspace with Knowledge-Free (RISK), is developed. Assuming that shortcut features caused by various biases are unintended for prediction, RISK views them as redundant features. When delving into a lower manifold to remove redundancies, RISK reveals that an extremely low-dimensional subspace with intended features can robustly represent the highly biased dataset. Empirical results demonstrate that our model consistently improves generalization to out-of-distribution sets and achieves new state-of-the-art performance.

Pages: 1666-1676

Citations: 2

Revisiting the Practical Effectiveness of Constituency Parse Extraction from Pre-trained Language Models

Proceedings of COLING. International Conference on Computational Linguistics

Pub Date : 2022-09-15 DOI: 10.48550/arXiv.2211.00479

Taeuk Kim

Constituency Parse Extraction from Pre-trained Language Models (CPE-PLM) is a recent paradigm that attempts to induce constituency parse trees relying only on the internal knowledge of pre-trained language models. While it is attractive in that, like in-context learning, it does not require task-specific fine-tuning, the practical effectiveness of such an approach remains unclear, except that it can function as a probe for investigating language models’ inner workings. In this work, we mathematically reformulate CPE-PLM and propose two advanced ensemble methods tailored for it, demonstrating that the new parsing paradigm can be competitive with common unsupervised parsers when a set of heterogeneous PLMs is combined using our techniques. Furthermore, we explore some scenarios where the trees generated by CPE-PLM are practically useful. Specifically, we show that CPE-PLM is more effective than typical supervised parsers in few-shot settings.

Citations: 1

Knowledge Is Flat: A Seq2Seq Generative Framework for Various Knowledge Graph Completion

Proceedings of COLING. International Conference on Computational Linguistics

Pub Date : 2022-09-15 DOI: 10.48550/arXiv.2209.07299

Chen Chen, Yufei Wang, Bing Li, Kwok-Yan Lam

Knowledge Graph Completion (KGC) has been recently extended to multiple knowledge graph (KG) structures, initiating new research directions, e.g. static KGC, temporal KGC and few-shot KGC. Previous works often design KGC models closely coupled with specific graph structures, which inevitably results in two drawbacks: 1) structure-specific KGC models are mutually incompatible; 2) existing KGC methods are not adaptable to emerging KGs. In this paper, we propose KG-S2S, a Seq2Seq generative framework that could tackle different verbalizable graph structures by unifying the representation of KG facts into “flat” text, regardless of their original form. To remedy the KG structure information loss from the “flat” text, we further improve the input representations of entities and relations, and the inference algorithm in KG-S2S. Experiments on five benchmarks show that KG-S2S outperforms many competitive baselines, setting new state-of-the-art performance. Finally, we analyze KG-S2S’s ability on the different relations and the Non-entity Generations.
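As a rough illustration of what unifying KG facts into "flat" text could look like, the sketch below serialises a static triple and a temporal query into a single string that a Seq2Seq model could consume. The separator, the `<mask>` placeholder, the field order, and the function name `flatten_fact` are all assumptions made for this example; the abstract does not specify the format KG-S2S actually uses.

```python
def flatten_fact(head, relation, tail=None, timestamp=None,
                 sep=" | ", mask="<mask>"):
    """Serialise a KG fact (or query) into one flat text string.

    A missing tail is replaced by a mask token, turning the fact into a
    completion query. An optional timestamp is prepended so that static
    and temporal facts share the same textual format.
    """
    fields = []
    if timestamp is not None:
        fields.append(str(timestamp))
    fields.extend([head, relation, tail if tail is not None else mask])
    return sep.join(fields)

# A complete static fact.
static_fact = flatten_fact("Paris", "capital of", "France")
# A temporal completion query whose tail entity is to be generated.
temporal_query = flatten_fact("Barack Obama", "president of", timestamp="2010")

print(static_fact)     # Paris | capital of | France
print(temporal_query)  # 2010 | Barack Obama | president of | <mask>
```

With one textual format for every graph structure, the same encoder-decoder can in principle be trained on all of them, which is the compatibility argument the abstract makes.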

Citations: 14

CommunityLM: Probing Partisan Worldviews from Language Models

Proceedings of COLING. International Conference on Computational Linguistics

Pub Date : 2022-09-15 DOI: 10.48550/arXiv.2209.07065

Hang Jiang, Doug Beeferman, Brandon Roy, Dwaipayan Roy

As political attitudes have diverged ideologically in the United States, political speech has diverged linguistically. The ever-widening polarization between the US political parties is accelerated by an erosion of mutual understanding between them. We aim to make these communities more comprehensible to each other with a framework, CommunityLM, that uses community language models to probe community-specific responses to the same survey questions. In our framework, we identify committed partisan members for each community on Twitter and fine-tune LMs on the tweets authored by them. We then assess the worldviews of the two groups using prompt-based probing of their corresponding LMs, with prompts that elicit opinions about public figures and groups surveyed by the American National Election Studies (ANES) 2020 Exploratory Testing Survey. We compare the responses generated by the LMs to the ANES survey results and find a level of alignment that greatly exceeds several baseline methods. Our work aims to show that we can use community LMs to query the worldview of any group of people given a sufficiently large sample of their social media discussions or media diet.

Pages: 6818-6826

Citations: 10

Measuring Geographic Performance Disparities of Offensive Language Classifiers

Proceedings of COLING. International Conference on Computational Linguistics

Pub Date : 2022-09-15 DOI: 10.48550/arXiv.2209.07353

Brandon Lwowski, P. Rad, Anthony Rios

Text classifiers are applied at scale in the form of one-size-fits-all solutions. Nevertheless, many studies show that classifiers are biased regarding different languages and dialects. When measuring and discovering these biases, some gaps present themselves and should be addressed. First, “Do language, dialect, and topical content vary across geographical regions?” and second, “If there are differences across the regions, do they impact model performance?”. We introduce a novel dataset called GeoOLID with more than 14 thousand examples across 15 geographically and demographically diverse cities to address these questions. We perform a comprehensive analysis of geography-related content and its impact on performance disparities of offensive language detection models. Overall, we find that current models do not generalize across locations. Likewise, we show that while offensive language models produce false positives on African American English, model performance is not correlated with each city’s minority population proportions. Warning: This paper contains offensive language.

Pages: 6600-6616

Citations: 3

Hierarchical Attention Network for Explainable Depression Detection on Twitter Aided by Metaphor Concept Mappings

Proceedings of COLING. International Conference on Computational Linguistics

Pub Date : 2022-09-15 DOI: 10.48550/arXiv.2209.07494

Sooji Han, Rui Mao, E. Cambria

Automatic depression detection on Twitter can help individuals privately and conveniently understand their mental health status in the early stages before seeing mental health professionals. Most existing black-box-like deep learning methods for depression detection focus largely on improving classification performance. However, explaining model decisions is imperative in health research because decision-making can often be high-stakes, even a matter of life and death. Reliable automatic diagnosis of mental health problems, including depression, should be supported by credible explanations justifying models’ predictions. In this work, we propose a novel explainable model for depression detection on Twitter. It comprises a novel encoder combining hierarchical attention mechanisms and feed-forward neural networks. To support psycholinguistic studies, our model leverages metaphorical concept mappings as input. Thus, it not only detects depressed individuals, but also identifies features of such users’ tweets and associated metaphor concept mappings.

Citations: 29
