
North American Chapter of the Association for Computational Linguistics: Latest Publications

On Synthetic Data for Back Translation
Pub Date : 2023-10-20 DOI: 10.18653/v1/2022.naacl-main.32
Jiahao Xu, Yubin Ruan, Wei Bi, Guoping Huang, Shuming Shi, Lihui Chen, Lemao Liu
Back translation (BT) is one of the most significant techniques in NMT research. Existing attempts at BT share a common characteristic: they employ either beam search or random sampling to generate synthetic data with a backward model, but little work studies the role of synthetic data in BT performance. This motivates us to ask a fundamental question: what kind of synthetic data contributes to BT performance? Through both theoretical and empirical studies, we identify two key factors of synthetic data that control back-translation NMT performance: quality and importance. Furthermore, based on our findings, we propose a simple yet effective method to generate synthetic data that better trades off both factors so as to yield better performance for BT. We run extensive experiments on the WMT14 DE-EN, EN-DE, and RU-EN benchmark tasks. By employing our proposed method to generate synthetic data, our BT model significantly outperforms the standard BT baselines (i.e., beam- and sampling-based methods for data generation), which demonstrates the effectiveness of our proposed method.
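To make the contrast between the two standard generation strategies concrete, here is a minimal sketch of backward-model data generation with beam search versus random sampling, using the Hugging Face transformers API; the checkpoint name, beam size, and temperature are illustrative assumptions rather than the paper's setup:

```python
# Minimal sketch of backward-model synthetic data generation for back translation.
# Assumes a reverse-direction (EN->DE here) translation model; the checkpoint,
# beam size, and temperature are illustrative, not the paper's configuration.
from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-de"  # backward model: target (EN) -> source (DE)
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

def back_translate(monolingual_targets, mode="beam"):
    """Generate synthetic source sentences from target-side monolingual text."""
    batch = tokenizer(monolingual_targets, return_tensors="pt", padding=True)
    if mode == "beam":
        # Beam search: high-quality but low-diversity synthetic sources.
        out = model.generate(**batch, num_beams=5, max_new_tokens=128)
    else:
        # Random sampling: noisier but more diverse synthetic sources.
        out = model.generate(**batch, do_sample=True, top_k=0, temperature=1.0,
                             max_new_tokens=128)
    return tokenizer.batch_decode(out, skip_special_tokens=True)

synthetic_sources = back_translate(["The weather is nice today."], mode="sample")
# (synthetic_source, monolingual_target) pairs are then added to the parallel
# training data for the forward (DE->EN) model.
```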
Citations: 5
Mining Clues from Incomplete Utterance: A Query-enhanced Network for Incomplete Utterance Rewriting
Pub Date : 2023-07-03 DOI: 10.18653/v1/2022.naacl-main.356
Shuzheng Si, Shuang Zeng, Baobao Chang
Incomplete utterance rewriting has recently attracted wide attention. However, previous works either do not consider the semantic structural information between the incomplete utterance and the rewritten utterance or model the semantic structure only implicitly and insufficiently. To address this problem, we propose a QUEry-Enhanced Network (QUEEN). First, our proposed query template explicitly introduces guided semantic structural knowledge between the incomplete utterance and the rewritten utterance, making the model perceive where to refer back to or recover omitted tokens. Then, we adopt a fast and effective edit operation scoring network to model the relation between two tokens. Benefiting from this extra information and the well-designed network, QUEEN achieves state-of-the-art performance on several public datasets.
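As a rough illustration of what an edit operation scoring network over token pairs can look like, the sketch below scores every (context token, utterance token) pair with a biaffine layer; the three operation labels and the biaffine form are assumptions for illustration, not QUEEN's actual architecture:

```python
# Minimal sketch of a token-pair edit-operation scorer, in the spirit of QUEEN's
# "edit operation scoring network". The operation labels and the biaffine form
# are illustrative assumptions, not the paper's exact design.
import torch
import torch.nn as nn

class EditOperationScorer(nn.Module):
    def __init__(self, hidden=256, num_ops=3):  # ops: none / insert-before / replace
        super().__init__()
        self.proj_ctx = nn.Linear(hidden, hidden)
        self.proj_utt = nn.Linear(hidden, hidden)
        # Biaffine layer scoring every (context token, utterance token) pair.
        self.bilinear = nn.Bilinear(hidden, hidden, num_ops)

    def forward(self, ctx_repr, utt_repr):
        # ctx_repr: (B, Lc, H) context token states; utt_repr: (B, Lu, H).
        B, Lc, H = ctx_repr.shape
        Lu = utt_repr.size(1)
        c = torch.relu(self.proj_ctx(ctx_repr)).unsqueeze(2).expand(B, Lc, Lu, H)
        u = torch.relu(self.proj_utt(utt_repr)).unsqueeze(1).expand(B, Lc, Lu, H)
        # (B, Lc, Lu, num_ops): per-pair scores for each edit operation.
        return self.bilinear(c.reshape(-1, H), u.reshape(-1, H)).view(B, Lc, Lu, -1)

scorer = EditOperationScorer()
scores = scorer(torch.randn(2, 10, 256), torch.randn(2, 7, 256))
print(scores.shape)  # torch.Size([2, 10, 7, 3])
```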
Citations: 4
Using Paraphrases to Study Properties of Contextual Embeddings
Pub Date : 2022-07-12 DOI: 10.48550/arXiv.2207.05553
Laura Burdick, Jonathan K. Kummerfeld, Rada Mihalcea
We use paraphrases as a unique source of data to analyze contextualized embeddings, with a particular focus on BERT. Because paraphrases naturally encode consistent word and phrase semantics, they provide a unique lens for investigating properties of embeddings. Using the Paraphrase Database’s alignments, we study words within paraphrases as well as phrase representations. We find that contextual embeddings effectively handle polysemous words, but give synonyms surprisingly different representations in many cases. We confirm previous findings that BERT is sensitive to word order, but find slightly different patterns than prior work in terms of the level of contextualization across BERT’s layers.
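A minimal sketch of this style of analysis, assuming BERT via the Hugging Face transformers API (the layer choice and single-subtoken lookup are simplifications): it compares the contextual embedding of the same word across two paraphrases with cosine similarity:

```python
# Minimal sketch of the analysis described above: compare contextual embeddings
# of a shared word in two paraphrases. The layer choice and single-token lookup
# are simplifying assumptions; the paper's exact procedure may differ.
import torch
from transformers import AutoTokenizer, AutoModel

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_hidden_states=True)

def word_embedding(sentence, word, layer=-1):
    """Return the embedding of `word` in `sentence` at a given hidden layer."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).hidden_states[layer][0]  # (seq_len, hidden)
    word_id = tokenizer.convert_tokens_to_ids(word)
    position = (enc.input_ids[0] == word_id).nonzero()[0].item()
    return hidden[position]

a = word_embedding("the movie was a big hit", "hit")
b = word_embedding("the film was a big hit", "hit")
sim = torch.nn.functional.cosine_similarity(a, b, dim=0)
print(f"cosine similarity across paraphrases: {sim.item():.3f}")
```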
Citations: 2
GMN: Generative Multi-modal Network for Practical Document Information Extraction
Pub Date : 2022-07-11 DOI: 10.48550/arXiv.2207.04713
H. Cao, Jiefeng Ma, Antai Guo, Yiqing Hu, Hao Liu, Deqiang Jiang, Yinsong Liu, Bo Ren
Document Information Extraction (DIE) has attracted increasing attention due to its various advanced applications in the real world. Although recent literature has achieved competitive results, these approaches usually fail when dealing with complex documents with noisy OCR results or mutative layouts. This paper proposes the Generative Multi-modal Network (GMN), a robust multi-modal generation method without predefined label categories, to address these problems in real-world scenarios. With a carefully designed spatial encoder and modal-aware mask module, GMN can deal with complex documents that are hard to serialize into sequential order. Moreover, GMN tolerates errors in OCR results and requires no character-level annotation, which is vital because fine-grained annotation of numerous documents is laborious and may even require annotators with specialized domain knowledge. Extensive experiments show that GMN achieves new state-of-the-art performance on several public DIE datasets and surpasses other methods by a large margin, especially in realistic scenes.
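For intuition about what a spatial encoder over OCR output can look like, here is a minimal layout-aware embedding sketch in which each token's bounding box is bucketized and embedded alongside its text embedding; the bucket count and additive fusion are illustrative assumptions, not GMN's actual design:

```python
# Minimal sketch of a layout-aware token encoder of the kind a "spatial encoder"
# suggests: each OCR token carries a bounding box that is embedded and added to
# its text embedding. Bucket sizes and additive fusion are assumptions here.
import torch
import torch.nn as nn

class SpatialTokenEncoder(nn.Module):
    def __init__(self, vocab_size=30522, hidden=256, coord_buckets=1000):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab_size, hidden)
        # Separate embeddings for the four normalized box coordinates.
        self.x0 = nn.Embedding(coord_buckets, hidden)
        self.y0 = nn.Embedding(coord_buckets, hidden)
        self.x1 = nn.Embedding(coord_buckets, hidden)
        self.y1 = nn.Embedding(coord_buckets, hidden)

    def forward(self, token_ids, boxes):
        # token_ids: (B, L); boxes: (B, L, 4), coords bucketized to [0, 999].
        spatial = (self.x0(boxes[..., 0]) + self.y0(boxes[..., 1])
                   + self.x1(boxes[..., 2]) + self.y1(boxes[..., 3]))
        return self.tok_emb(token_ids) + spatial

enc = SpatialTokenEncoder()
ids = torch.randint(0, 30522, (2, 16))
boxes = torch.randint(0, 1000, (2, 16, 4))
print(enc(ids, boxes).shape)  # torch.Size([2, 16, 256])
```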
Citations: 6
Domain Confused Contrastive Learning for Unsupervised Domain Adaptation
Pub Date : 2022-07-10 DOI: 10.48550/arXiv.2207.04564
Quanyu Long, Tianze Luo, Wenya Wang, Sinno Jialin Pan
In this work, we study Unsupervised Domain Adaptation (UDA) through a challenging self-supervised approach. One of the difficulties is how to learn task discrimination in the absence of target labels. Unlike previous literature, which directly aligns cross-domain distributions or leverages reverse gradients, we propose Domain Confused Contrastive Learning (DCCL), which can bridge the source and target domains via domain puzzles and retain discriminative representations after adaptation. Technically, DCCL searches for the most domain-challenging direction and carefully crafts domain-confused augmentations as positive pairs; it then contrastively encourages the model to pull representations towards the other domain, thus learning more stable and effective domain invariances. We also investigate whether contrastive learning necessarily helps with UDA when performing other data augmentations. Extensive experiments demonstrate that DCCL significantly outperforms baselines, and further ablation studies and analyses show the effectiveness and availability of DCCL.
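As a sketch of the contrastive side of this recipe, the following InfoNCE-style loss pulls each anchor towards its domain-confused augmentation while treating other in-batch examples as negatives; the temperature and in-batch-negative scheme are standard contrastive-learning assumptions rather than DCCL's exact formulation:

```python
# Minimal InfoNCE-style contrastive objective in which each anchor is pulled
# toward its "domain-confused" augmentation. Temperature and in-batch negatives
# are standard assumptions, not DCCL's exact formulation.
import torch
import torch.nn.functional as F

def info_nce(anchors, positives, temperature=0.1):
    """anchors, positives: (B, H); positives[i] is the confused view of anchors[i]."""
    a = F.normalize(anchors, dim=-1)
    p = F.normalize(positives, dim=-1)
    logits = a @ p.t() / temperature          # (B, B) similarity matrix
    labels = torch.arange(a.size(0), device=a.device)
    # Diagonal entries are positive pairs; off-diagonals act as negatives.
    return F.cross_entropy(logits, labels)

loss = info_nce(torch.randn(8, 128), torch.randn(8, 128))
print(loss.item())
```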
Citations: 5
A Study of Syntactic Multi-Modality in Non-Autoregressive Machine Translation
Pub Date : 2022-07-09 DOI: 10.48550/arXiv.2207.04206
Kexun Zhang, Rui Wang, Xu Tan, Junliang Guo, Yi Ren, Tao Qin, Tie-Yan Liu
It is difficult for non-autoregressive translation (NAT) models to capture the multi-modal distribution of target translations due to their conditional independence assumption, which is known as the “multi-modality problem”, including the lexical multi-modality and the syntactic multi-modality. While the first one has been well studied, the syntactic multi-modality brings severe challenges to the standard cross entropy (XE) loss in NAT and is understudied. In this paper, we conduct a systematic study on the syntactic multi-modality problem. Specifically, we decompose it into short- and long-range syntactic multi-modalities and evaluate several recent NAT algorithms with advanced loss functions on both carefully designed synthesized datasets and real datasets. We find that the Connectionist Temporal Classification (CTC) loss and the Order-Agnostic Cross Entropy (OAXE) loss can better handle short- and long-range syntactic multi-modalities respectively. Furthermore, we take the best of both and design a new loss function to better handle the complicated syntactic multi-modality in real-world datasets. To facilitate practical usage, we provide a guide to using different loss functions for different kinds of syntactic multi-modality.
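The two losses at the center of this study can be sketched as follows: the CTC call is PyTorch's standard API, while the OAXE sketch approximates order-agnostic cross entropy by finding the lowest-cost one-to-one assignment between output positions and target tokens with the Hungarian algorithm (an illustrative rendering, not the original implementation):

```python
# Sketch of the two losses discussed above. F.ctc_loss is standard PyTorch; the
# OAXE sketch is an illustrative rendering of order-agnostic cross entropy via
# a Hungarian (lowest-cost bipartite) matching, not the original implementation.
import torch
import torch.nn.functional as F
from scipy.optimize import linear_sum_assignment

log_probs = torch.randn(20, 2, 100).log_softmax(-1)   # (T, B, V) NAT outputs
targets = torch.randint(1, 100, (2, 8))               # (B, L) references

# CTC: marginalizes over monotonic alignments (blank id = 0).
ctc = F.ctc_loss(log_probs, targets,
                 input_lengths=torch.full((2,), 20, dtype=torch.long),
                 target_lengths=torch.full((2,), 8, dtype=torch.long),
                 blank=0)

def oaxe(log_probs_btv, targets_bl):
    """Order-agnostic XE: best bipartite match of positions to target tokens."""
    total = 0.0
    for lp, tgt in zip(log_probs_btv, targets_bl):        # per sentence
        cost = -lp[:, tgt].detach().cpu().numpy()         # (T, L) negative log-prob
        rows, cols = linear_sum_assignment(cost)          # minimal-cost matching
        rows, cols = torch.as_tensor(rows), torch.as_tensor(cols)
        total = total + (-lp[rows, tgt[cols]]).mean()
    return total / log_probs_btv.size(0)

oa = oaxe(log_probs.transpose(0, 1), targets)             # (B, T, V) input
print(ctc.item(), oa.item())
```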
Citations: 3
CoSIm: Commonsense Reasoning for Counterfactual Scene Imagination
Pub Date : 2022-07-08 DOI: 10.48550/arXiv.2207.03961
Hyounghun Kim, Abhaysinh Zala, Mohit Bansal
As humans, we can modify our assumptions about a scene by imagining alternative objects or concepts in our minds. For example, we can easily anticipate the implications of the sun being overcast by rain clouds (e.g., the street will get wet) and accordingly prepare for that. In this paper, we introduce a new dataset called Commonsense Reasoning for Counterfactual Scene Imagination (CoSIm) which is designed to evaluate the ability of AI systems to reason about scene change imagination. To be specific, in this multimodal task/dataset, models are given an image and an initial question-response pair about the image. Next, a counterfactual imagined scene change (in textual form) is applied, and the model has to predict the new response to the initial question based on this scene change. We collect 3.5K high-quality and challenging data instances, with each instance consisting of an image, a commonsense question with a response, a description of a counterfactual change, a new response to the question, and three distractor responses. Our dataset contains various complex scene change types (such as object addition/removal/state change, event description, environment change, etc.) that require models to imagine many different scenarios and reason about the changed scenes. We present a baseline model based on a vision-language Transformer (i.e., LXMERT) and ablation studies. Through human evaluation, we demonstrate a large human-model performance gap, suggesting room for promising future work on this challenging, counterfactual multimodal task.
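For concreteness, one instance of the dataset, as described above, could be represented like this; the field names and example values are hypothetical and the released schema may differ:

```python
# Hypothetical sketch of one CoSIm instance, following the composition described
# above (image, question-response pair, counterfactual change, new response,
# three distractors). Field names and values are illustrative, not the dataset's.
cosim_instance = {
    "image": "street_scene.jpg",
    "question": "Is it a good day for a walk?",
    "initial_response": "Yes, the street is sunny and dry.",
    "counterfactual_change": "Rain clouds move in and it starts to pour.",
    "new_response": "No, the street will get wet and slippery.",
    "distractor_responses": [
        "Yes, nothing about the street changes.",
        "No, the street is too crowded.",
        "Yes, it will only get sunnier.",
    ],
}
```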
Citations: 3
OmniTab: Pretraining with Natural and Synthetic Data for Few-shot Table-based Question Answering
Pub Date : 2022-07-08 DOI: 10.48550/arXiv.2207.03637
Zhengbao Jiang, Yi Mao, Pengcheng He, Graham Neubig, Weizhu Chen
The information in tables can be an important complement to text, making table-based question answering (QA) systems of great value. The intrinsic complexity of handling tables often adds an extra burden to both model design and data annotation. In this paper, we aim to develop a simple table-based QA model with minimal annotation effort. Motivated by the fact that table-based QA requires both alignment between questions and tables and the ability to perform complicated reasoning over multiple table elements, we propose an omnivorous pretraining approach that consumes both natural and synthetic data to endow models with these respective abilities. Specifically, given freely available tables, we leverage retrieval to pair them with relevant natural sentences for mask-based pretraining, and synthesize NL questions by converting SQL sampled from tables for pretraining with a QA loss. We perform extensive experiments in both few-shot and full settings, and the results clearly demonstrate the superiority of our model OmniTab, with the best multitasking approach achieving an absolute gain of 16.2% and 2.7% in 128-shot and full settings respectively, also establishing a new state-of-the-art on WikiTableQuestions. Detailed ablations and analyses reveal different characteristics of natural and synthetic data, shedding light on future directions in omnivorous pretraining.
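A minimal sketch of the natural-data side of this recipe pairs a table with a related sentence and linearizes both into one sequence for mask-based pretraining; the flattening scheme and special markers are illustrative assumptions, not the paper's exact format:

```python
# Minimal sketch of pairing a table with a related natural sentence and
# linearizing both into one sequence for mask-based pretraining. The flattening
# scheme and [TABLE]/[ROW] markers are illustrative assumptions.
def linearize(table, sentence):
    header = " | ".join(table["header"])
    rows = " [ROW] ".join(" | ".join(map(str, r)) for r in table["rows"])
    return f"{sentence} [TABLE] {header} [ROW] {rows}"

table = {
    "header": ["City", "Population"],
    "rows": [["Seattle", 737015], ["Boston", 675647]],
}
sentence = "Seattle has a population of 737015."
print(linearize(table, sentence))
# A masked-LM objective would then mask tokens such as "737015" in the sentence
# so the model must recover them from the aligned table cells.
```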
Citations: 24
Compositional Generalization in Grounded Language Learning via Induced Model Sparsity
Pub Date : 2022-07-06 DOI: 10.48550/arXiv.2207.02518
Sam Spilsbury, A. Ilin
We provide a study of how induced model sparsity can help achieve compositional generalization and better sample efficiency in grounded language learning problems. We consider simple language-conditioned navigation problems in a grid world environment with disentangled observations. We show that standard neural architectures do not always yield compositional generalization. To address this, we design an agent that contains a goal identification module that encourages sparse correlations between words in the instruction and attributes of objects, composing them together to find the goal. The output of the goal identification module is the input to a value iteration network planner. Our agent maintains a high level of performance on goals containing novel combinations of properties even when learning from a handful of demonstrations. We examine the internal representations of our agent and find the correct correspondences between words in its dictionary and attributes in the environment.
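To illustrate how sparse word-attribute correlations can be encouraged, the sketch below scores instruction words against object attributes and exposes an L1 penalty for the training loss; the attention form and penalty weight are illustrative assumptions, not the paper's exact module:

```python
# Minimal sketch of a goal-identification module that scores instruction words
# against object attributes, with an L1 penalty encouraging sparse correlations.
# The attention form and penalty weight are illustrative assumptions.
import torch
import torch.nn as nn

class GoalIdentifier(nn.Module):
    def __init__(self, word_dim=64, attr_dim=64, hidden=64):
        super().__init__()
        self.wq = nn.Linear(word_dim, hidden, bias=False)
        self.wk = nn.Linear(attr_dim, hidden, bias=False)

    def forward(self, words, attributes):
        # words: (L, Dw) instruction embeddings; attributes: (N, Da) object attrs.
        corr = torch.sigmoid(self.wq(words) @ self.wk(attributes).t())  # (L, N)
        goal_scores = corr.max(dim=0).values      # how strongly each object matches
        sparsity_penalty = corr.abs().mean()      # L1 term added to the loss
        return goal_scores, sparsity_penalty

gi = GoalIdentifier()
scores, penalty = gi(torch.randn(5, 64), torch.randn(9, 64))
loss = -scores.max() + 0.01 * penalty  # illustrative objective combination
print(scores.shape, penalty.item())
# The goal scores would then feed a value iteration network planner.
```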
Citations: 4
Putting the Con in Context: Identifying Deceptive Actors in the Game of Mafia
Pub Date : 2022-07-05 DOI: 10.48550/arXiv.2207.02253
Samee Ibraheem, G. Zhou, John DeNero
While neural networks demonstrate a remarkable ability to model linguistic content, capturing contextual information related to a speaker’s conversational role is an open area of research. In this work, we analyze the effect of speaker role on language use through the game of Mafia, in which participants are assigned either an honest or a deceptive role. In addition to building a framework to collect a dataset of Mafia game records, we demonstrate that there are differences in the language produced by players with different roles. We confirm that classification models are able to rank deceptive players as more suspicious than honest ones based only on their use of language. Furthermore, we show that training models on two auxiliary tasks outperforms a standard BERT-based text classification approach. We also present methods for using our trained models to identify features that distinguish between player roles, which could be used to assist players during the Mafia game.
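A minimal sketch of the baseline-style classifier described above, assuming a BERT checkpoint from Hugging Face: it scores how deceptive a player's concatenated utterances sound and ranks players by that score (the pooling and checkpoint are assumptions, and the model shown here is untrained):

```python
# Minimal sketch of a BERT-based deception classifier over player utterances.
# The checkpoint, concatenation, and pooling are standard assumptions, not the
# paper's exact setup; the classification head here is untrained.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)  # labels: 0 = honest, 1 = deceptive

def suspicion_score(utterances):
    """Score a player's concatenated utterances; higher = more suspicious."""
    text = " ".join(utterances)
    enc = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits
    return torch.softmax(logits, dim=-1)[0, 1].item()

players = {"alice": ["I was with bob all night."], "carol": ["Trust me, vote dave."]}
ranking = sorted(players, key=lambda p: suspicion_score(players[p]), reverse=True)
print(ranking)  # players ordered from most to least suspicious (untrained here)
```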
Citations: 4