Proceedings of COLING. International Conference on Computational Linguistics: Latest Articles

MonoByte: A Pool of Monolingual Byte-level Language Models

Pub Date : 2022-09-22 DOI: 10.48550/arXiv.2209.11035

Hugo Abonizio, Leandro Rodrigues de Souza, R. Lotufo, Rodrigo Nogueira

The zero-shot cross-lingual ability of models pretrained on multilingual and even monolingual corpora has spurred many hypotheses to explain this intriguing empirical result. However, due to the costs of pretraining, most research uses public models whose pretraining methodology, such as the choice of tokenization, corpus size, and computational budget, might differ drastically. When researchers pretrain their own models, they often do so under a constrained budget, and the resulting models might underperform significantly compared to SOTA models. These experimental differences have led to various inconsistent conclusions about the nature of the cross-lingual ability of these models. To help further research on the topic, we release 10 monolingual byte-level models rigorously pretrained under the same configuration with a large compute budget (equivalent to 420 days on a V100) and corpora that are 4 times larger than the original BERT’s. Because they are tokenizer-free, the problem of unseen token embeddings is eliminated, thus allowing researchers to try a wider range of cross-lingual experiments in languages with different scripts. Additionally, we release two models pretrained on non-natural language texts that can be used in sanity-check experiments. Experiments on QA and NLI tasks show that our monolingual models achieve performance competitive with the multilingual one and can hence serve to strengthen our understanding of cross-lingual transferability in language models.
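
The tokenizer-free point in the abstract above can be illustrated with a minimal Python sketch. It assumes only that "byte-level" means operating on UTF-8 bytes; the helper names `byte_ids` and `from_byte_ids` and the sample strings are hypothetical and are not taken from the MonoByte release, which may add special-token offsets not shown here.

```python
def byte_ids(text: str) -> list[int]:
    """Map text to its UTF-8 byte values; every id falls in 0..255."""
    return list(text.encode("utf-8"))


def from_byte_ids(ids: list[int]) -> str:
    """Invert byte_ids by decoding the byte values back to text."""
    return bytes(ids).decode("utf-8")


# Hypothetical samples in four scripts; none needs a script-specific vocabulary.
samples = {"en": "language", "de": "K\u00e4se", "zh": "\u8bed\u8a00", "ar": "\u0644\u063a\u0629"}
encoded = {lang: byte_ids(text) for lang, text in samples.items()}

for lang, ids in encoded.items():
    # Each id is a byte value, so a 256-entry embedding table covers all inputs.
    print(lang, len(ids), max(ids) < 256)
```

Because every input reduces to the same 256 possible values, a model pretrained on one script never meets an id it has no embedding for when evaluated on another script. The cost is longer sequences for non-Latin text.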

Citations: 0

Subject Verb Agreement Error Patterns in Meaningless Sentences: Humans vs. BERT

Pub Date : 2022-09-21 DOI: 10.48550/arXiv.2209.10538

Karim Lasri, Olga Seminck, Alessandro Lenci, T. Poibeau

Both humans and neural language models are able to perform subject-verb number agreement (SVA). In principle, semantics shouldn’t interfere with this task, which only requires syntactic knowledge. In this work we test whether meaning interferes with this type of agreement in English in syntactic structures of various complexities. To do so, we generate both semantically well-formed and nonsensical items. We compare the performance of BERT-base to that of humans, obtained with a psycholinguistic online crowdsourcing experiment. We find that BERT and humans are both sensitive to our semantic manipulation: they fail more often when presented with nonsensical items, especially when their syntactic structure features an attractor (a noun phrase between the subject and the verb that does not have the same number as the subject). We also find that the effect of meaningfulness on SVA errors is stronger for BERT than for humans, showing higher lexical sensitivity of the former on this task.

Citations: 2

Bias at a Second Glance: A Deep Dive into Bias for German Educational Peer-Review Data Modeling

Pub Date : 2022-09-21 DOI: 10.48550/arXiv.2209.10335

Thiemo Wambsganss, Vinitra Swamy, Roman Rietsche, Tanja Käser

Natural Language Processing (NLP) has become increasingly utilized to provide adaptivity in educational applications. However, recent research has highlighted a variety of biases in pre-trained language models. While existing studies investigate bias in different domains, they are limited in addressing fine-grained analysis on educational corpora and text that is not English. In this work, we analyze bias across text and through multiple architectures on a corpus of 9,165 German peer reviews collected from university students over five years. Notably, our corpus includes labels such as helpfulness, quality, and critical aspect ratings from the peer-review recipient as well as demographic attributes. We conduct a Word Embedding Association Test (WEAT) analysis on (1) our collected corpus in connection with the clustered labels, (2) the most common pre-trained German language models (T5, BERT, and GPT-2) and GloVe embeddings, and (3) the language models after fine-tuning on our collected dataset. In contrast to our initial expectations, we found that our collected corpus does not reveal many biases in the co-occurrence analysis or in the GloVe embeddings. However, the pre-trained German language models find substantial conceptual, racial, and gender bias and have significant changes in bias across conceptual and racial axes during fine-tuning on the peer-review data. With our research, we aim to contribute to the fourth UN sustainability goal (quality education) with a novel dataset, an understanding of biases in natural language education data, and the potential harms of not counteracting biases in language models for educational tasks.
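
For readers unfamiliar with WEAT, the sketch below shows the commonly used effect-size computation on toy two-dimensional vectors. It is an illustration only: the vectors are made up, the population standard deviation (`pstdev`) is an assumption (some implementations use the sample standard deviation), and nothing here reproduces the paper's word lists or models.

```python
import math
from statistics import mean, pstdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def association(w, A, B):
    """s(w, A, B): mean similarity of w to attribute set A minus that to B."""
    return mean(cosine(w, a) for a in A) - mean(cosine(w, b) for b in B)


def weat_effect_size(X, Y, A, B):
    """Difference of mean associations of target sets X and Y, scaled by the
    standard deviation of the associations over all targets."""
    s_x = [association(x, A, B) for x in X]
    s_y = [association(y, A, B) for y in Y]
    return (mean(s_x) - mean(s_y)) / pstdev(s_x + s_y)


# Toy embeddings (hypothetical): X leans toward A, Y leans toward B.
X = [(1.0, 0.0), (0.9, 0.1)]
Y = [(0.0, 1.0), (0.1, 0.9)]
A = [(1.0, 0.0)]
B = [(0.0, 1.0)]

effect = weat_effect_size(X, Y, A, B)
print(round(effect, 3))
```

A positive value indicates that the first target set is more strongly associated with the first attribute set than the second target set is; swapping the target sets flips the sign.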

Citations: 2

Dynamic Relevance Graph Network for Knowledge-Aware Question Answering

Pub Date : 2022-09-20 DOI: 10.48550/arXiv.2209.09947

Chen Zheng, Parisa Kordjamshidi

This work investigates the challenge of learning and reasoning for Commonsense Question Answering given an external source of knowledge in the form of a knowledge graph (KG). We propose a novel graph neural network architecture, called Dynamic Relevance Graph Network (DRGN). DRGN operates on a given KG subgraph based on the question and answer entities and uses the relevance scores between the nodes to establish new edges dynamically for learning node representations in the graph network. This explicit usage of relevance as graph edges has the following advantages: (a) the model can exploit the existing relationships, re-scale the node weights, and influence the way the neighborhood nodes’ representations are aggregated in the KG subgraph; (b) it potentially recovers the missing edges in the KG that are needed for reasoning. Moreover, as a byproduct, our model improves the handling of negative questions because it considers the relevance between the question node and the graph entities. Our proposed approach shows competitive performance on two QA benchmarks, CommonsenseQA and OpenbookQA, compared to the state-of-the-art published results.
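
One simple way to picture "relevance scores establishing new edges" is sketched below. This is not the DRGN formulation: the cosine scoring, the fixed threshold, the function name `add_relevance_edges`, and the toy node vectors are all assumptions chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def add_relevance_edges(nodes, edges, threshold=0.8):
    """Return the given edges plus one new undirected edge for every
    unlinked node pair whose cosine relevance reaches the threshold."""
    names = sorted(nodes)
    result = set(edges)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if (a, b) in result or (b, a) in result:
                continue  # keep existing KG edges as they are
            if cosine(nodes[a], nodes[b]) >= threshold:
                result.add((a, b))
    return result


# Hypothetical node representations for a tiny KG subgraph.
nodes = {
    "question": (1.0, 0.2),
    "bird": (0.9, 0.3),
    "stone": (-0.2, 1.0),
}
edges = {("bird", "stone")}  # the only edge present in the original subgraph

augmented = add_relevance_edges(nodes, edges)
print(sorted(augmented))
```

In this toy case the question node gains an edge to the entity it is most similar to, which mirrors the abstract's claim that relevance can recover links missing from the KG; a learned model would recompute such scores as node representations change.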

Citations: 2

LINGUIST: Language Model Instruction Tuning to Generate Annotated Utterances for Intent Classification and Slot Tagging

Pub Date : 2022-09-20 DOI: 10.48550/arXiv.2209.09900

Andrew Rosenbaum, Saleh Soltan, Wael Hamza, Yannick Versley, M. Boese

We present LINGUIST, a method for generating annotated data for Intent Classification and Slot Tagging (IC+ST), via fine-tuning AlexaTM 5B, a 5-billion-parameter multilingual sequence-to-sequence (seq2seq) model, on a flexible instruction prompt. In a 10-shot novel intent setting for the SNIPS dataset, LINGUIST surpasses state-of-the-art approaches (Back-Translation and Example Extrapolation) by a wide margin, showing absolute improvement for the target intents of +1.9 points on IC Recall and +2.5 points on ST F1 Score. In the zero-shot cross-lingual setting of the mATIS++ dataset, LINGUIST outperforms a strong baseline of Machine Translation with Slot Alignment by +4.14 points absolute on ST F1 Score across 6 languages, while matching performance on IC. Finally, we verify our results on an internal large-scale multilingual dataset for conversational agent IC+ST and show significant improvements over a baseline that uses Back-Translation, Paraphrasing and Slot Catalog Resampling. To our knowledge, we are the first to demonstrate instruction fine-tuning of a large-scale seq2seq model to control the outputs of multilingual intent- and slot-labeled data generation.

Citations: 16

Target-Guided Open-Domain Conversation Planning

Pub Date : 2022-09-20 DOI: 10.48550/arXiv.2209.09746

Yosuke Kishinami, Reina Akama, Shiki Sato, Ryoko Tokuhisa, Jun Suzuki, Kentaro Inui

Prior studies addressing target-oriented conversational tasks lack a crucial notion that has been intensively studied in the context of goal-oriented artificial intelligence agents, namely, planning. In this study, we propose the Target-Guided Open-Domain Conversation Planning (TGCP) task to evaluate whether neural conversational agents have goal-oriented conversation planning abilities. Using the TGCP task, we investigate the conversation planning abilities of existing retrieval models and recent strong generative models. The experimental results reveal the challenges facing current technology.

Citations: 6

CofeNet: Context and Former-Label Enhanced Net for Complicated Quotation Extraction

Pub Date : 2022-09-20 DOI: 10.48550/arXiv.2209.09432

Yequan Wang, Xiang Li, Aixin Sun, Xuying Meng, Huaming Liao, J. Guo

Quotation extraction aims to extract quotations from written text. There are three components in a quotation: source refers to the holder of the quotation, cue is the trigger word(s), and content is the main body. Existing solutions for quotation extraction mainly utilize rule-based approaches and sequence labeling models. While rule-based approaches often lead to low recall, sequence labeling models cannot handle quotations with complicated structures well. In this paper, we propose the Context and Former-Label Enhanced Net (CofeNet) for quotation extraction. CofeNet is able to extract complicated quotations with components of variable lengths and complicated structures. On two public datasets and one proprietary dataset, we show that CofeNet achieves state-of-the-art performance on complicated quotation extraction.

Citations: 1

ALEXSIS-PT: A New Resource for Portuguese Lexical Simplification

Pub Date : 2022-09-19 DOI: 10.48550/arXiv.2209.09034

Kai North, Marcos Zampieri, Tharindu Ranasinghe

Lexical simplification (LS) is the task of automatically replacing complex words with easier ones, making texts more accessible to various target populations (e.g. individuals with low literacy, individuals with learning disabilities, second language learners). To train and test models, LS systems usually require corpora that feature complex words in context along with their potential substitutions. To continue improving the performance of LS systems, we introduce ALEXSIS-PT, a novel multi-candidate dataset for Brazilian Portuguese LS containing 9,605 candidate substitutions for 387 complex words. ALEXSIS-PT has been compiled following the ALEXSIS-ES protocol for Spanish, opening exciting new avenues for cross-lingual models. ALEXSIS-PT is the first LS multi-candidate dataset that contains Brazilian newspaper articles. We evaluated three models for substitute generation on this dataset, namely mBERT, XLM-R, and BERTimbau. The latter achieved the highest performance across all evaluation metrics.

Citations: 6

Overcoming Language Priors in Visual Question Answering via Distinguishing Superficially Similar Instances

Pub Date : 2022-09-18 DOI: 10.48550/arXiv.2209.08529

Yike Wu, Yu Zhao, Shiwan Zhao, Ying Zhang, Xiaojie Yuan, Guoqing Zhao, Ning Jiang

Despite the great progress of Visual Question Answering (VQA), current VQA models heavily rely on the superficial correlation between the question type and its corresponding frequent answers (i.e., language priors) to make predictions, without really understanding the input. In this work, we define training instances with the same question type but different answers as superficially similar instances, and attribute the language priors to the confusion of the VQA model on such instances. To solve this problem, we propose a novel training framework that explicitly encourages the VQA model to distinguish between the superficially similar instances. Specifically, for each training instance, we first construct a set that contains its superficially similar counterparts. Then we exploit the proposed distinguishing module to increase the distance between the instance and its counterparts in the answer space. In this way, the VQA model is forced to focus further on the parts of the input beyond the question type, which helps to overcome the language priors. Experimental results show that our method achieves state-of-the-art performance on VQA-CP v2. Code is available at Distinguishing-VQA.

Citations: 4

Domain Classification-based Source-specific Term Penalization for Domain Adaptation in Hate-speech Detection

Pub Date : 2022-09-18 DOI: 10.48550/arXiv.2209.08681

Tulika Bose, Nikolaos Aletras, I. Illina, D. Fohr

State-of-the-art approaches for hate-speech detection usually exhibit poor performance in out-of-domain settings. This typically occurs because classifiers overemphasize source-specific information, which negatively impacts their domain invariance. Prior work has attempted to penalize terms related to hate-speech from manually curated lists using feature attribution methods, which quantify the importance assigned to input terms by the classifier when making a prediction. We instead propose a domain adaptation approach that automatically extracts and penalizes source-specific terms using a domain classifier, which learns to differentiate between domains, and feature-attribution scores for hate-speech classes, yielding consistent improvements in cross-domain evaluation.

Citations: 0
