AutoCAD: Automatically Generating Counterfactuals for Mitigating Shortcut Learning
Pub Date: 2022-11-29 | DOI: 10.48550/arXiv.2211.16202 | Proceedings of EMNLP, pages 2302-2317
Jiaxin Wen, Yeshuang Zhu, Jinchao Zhang, Jie Zhou, Minlie Huang
Recent studies have shown the impressive efficacy of counterfactually augmented data (CAD) for reducing NLU models' reliance on spurious features and improving their generalizability. However, current methods still heavily rely on human effort or task-specific designs to generate counterfactuals, which impedes CAD's applicability to a broad range of NLU tasks. In this paper, we present AutoCAD, a fully automatic and task-agnostic CAD generation framework. AutoCAD first leverages a classifier to identify rationales, in an unsupervised manner, as the spans to be intervened on, which disentangles spurious and causal features. Then, AutoCAD performs controllable generation enhanced by unlikelihood training to produce diverse counterfactuals. Extensive evaluations on multiple out-of-domain and challenge benchmarks demonstrate that AutoCAD consistently and significantly boosts the out-of-distribution performance of powerful pre-trained models across different NLU tasks, performing comparably to or even better than previous state-of-the-art human-in-the-loop or task-specific CAD methods. The code is publicly available at https://github.com/thu-coai/AutoCAD.
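To make the unlikelihood-training idea concrete, here is a minimal sketch of an unlikelihood penalty a generator could add alongside standard cross-entropy; it is not the authors' implementation, and the tensor shapes, variable names, and the mixing weight alpha are assumptions:

```python
import torch
import torch.nn.functional as F

def unlikelihood_loss(logits: torch.Tensor, negative_tokens: torch.Tensor) -> torch.Tensor:
    """Penalize probability mass on tokens the counterfactual must avoid.

    logits: (batch, vocab) next-token scores from the generator.
    negative_tokens: (batch,) long tensor of token ids tied to the original label.
    """
    probs = F.softmax(logits, dim=-1)
    p_neg = probs.gather(1, negative_tokens.unsqueeze(1)).squeeze(1)
    # -log(1 - p) grows as the model leaks probability onto forbidden tokens.
    return -torch.log((1.0 - p_neg).clamp_min(1e-6)).mean()

# Hypothetical combined objective: standard likelihood plus the penalty.
# loss = ce_loss + alpha * unlikelihood_loss(logits, negative_tokens)
```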
{"title":"AutoCAD: Automatically Generating Counterfactuals for Mitigating Shortcut Learning","authors":"Jiaxin Wen, Yeshuang Zhu, Jinchao Zhang, Jie Zhou, Minlie Huang","doi":"10.48550/arXiv.2211.16202","DOIUrl":"https://doi.org/10.48550/arXiv.2211.16202","url":null,"abstract":"Recent studies have shown the impressive efficacy of counterfactually augmented data (CAD) for reducing NLU models' reliance on spurious features and improving their generalizability. However, current methods still heavily rely on human efforts or task-specific designs to generate counterfactuals, thereby impeding CAD's applicability to a broad range of NLU tasks. In this paper, we present AutoCAD, a fully automatic and task-agnostic CAD generation framework. AutoCAD first leverages a classifier to unsupervisedly identify rationales as spans to be intervened, which disentangles spurious and causal features. Then, AutoCAD performs controllable generation enhanced by unlikelihood training to produce diverse counterfactuals. Extensive evaluations on multiple out-of-domain and challenge benchmarks demonstrate that AutoCAD consistently and significantly boosts the out-of-distribution performance of powerful pre-trained models across different NLU tasks, which is comparable or even better than previous state-of-the-art human-in-the-loop or task-specific CAD methods. The code is publicly available at https://github.com/thu-coai/AutoCAD.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"2 1","pages":"2302-2317"},"PeriodicalIF":0.0,"publicationDate":"2022-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84229886","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Textual Enhanced Contrastive Learning for Solving Math Word Problems
Pub Date: 2022-11-29 | DOI: 10.48550/arXiv.2211.16022 | Proceedings of EMNLP, pages 4297-4307
Yibin Shen, Qianying Liu, Zhuoyuan Mao, Fei Cheng, S. Kurohashi
Solving math word problems is a task that requires analyzing the relations between quantities and an accurate understanding of contextual natural language information. Recent studies show that current models rely on shallow heuristics to predict solutions and can be easily misled by small textual perturbations. To address this problem, we propose a Textual Enhanced Contrastive Learning framework, which forces models to distinguish semantically similar examples that hold different mathematical logic. We adopt a self-supervised strategy to enrich examples with subtle textual variance through textual reordering or problem re-construction. We then retrieve the hardest-to-differentiate samples from both the equation and textual perspectives and guide the model to learn their representations. Experimental results show that our method achieves state-of-the-art results on both widely used benchmark datasets and carefully designed challenge datasets in English and Chinese. Our code and data are available at https://github.com/yiyunya/Textual_CL_MWP.
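A minimal sketch of the kind of contrastive objective described above, assuming one positive and k retrieved hard negatives per anchor, with cosine similarity and a temperature; an illustration, not the authors' exact formulation:

```python
import torch
import torch.nn.functional as F

def contrastive_loss(anchor, positive, negatives, temperature=0.1):
    """anchor, positive: (batch, dim); negatives: (batch, k, dim),
    e.g. the hardest-to-differentiate samples retrieved per example."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    n = F.normalize(negatives, dim=-1)
    pos_sim = (a * p).sum(-1, keepdim=True) / temperature      # (batch, 1)
    neg_sim = torch.einsum("bd,bkd->bk", a, n) / temperature   # (batch, k)
    logits = torch.cat([pos_sim, neg_sim], dim=1)
    labels = torch.zeros(a.size(0), dtype=torch.long)          # positive sits at index 0
    return F.cross_entropy(logits, labels)
```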
{"title":"Textual Enhanced Contrastive Learning for Solving Math Word Problems","authors":"Yibin Shen, Qianying Liu, Zhuoyuan Mao, Fei Cheng, S. Kurohashi","doi":"10.48550/arXiv.2211.16022","DOIUrl":"https://doi.org/10.48550/arXiv.2211.16022","url":null,"abstract":"Solving math word problems is the task that analyses the relation of quantities and requires an accurate understanding of contextual natural language information. Recent studies show that current models rely on shallow heuristics to predict solutions and could be easily misled by small textual perturbations. To address this problem, we propose a Textual Enhanced Contrastive Learning framework, which enforces the models to distinguish semantically similar examples while holding different mathematical logic. We adopt a self-supervised manner strategy to enrich examples with subtle textual variance by textual reordering or problem re-construction. We then retrieve the hardest to differentiate samples from both equation and textual perspectives and guide the model to learn their representations. Experimental results show that our method achieves state-of-the-art on both widely used benchmark datasets and also exquisitely designed challenge datasets in English and Chinese. footnote{Our code and data is available at url{https://github.com/yiyunya/Textual_CL_MWP}","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"209 1","pages":"4297-4307"},"PeriodicalIF":0.0,"publicationDate":"2022-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80588020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chaining Simultaneous Thoughts for Numerical Reasoning
Pub Date: 2022-11-29 | DOI: 10.48550/arXiv.2211.16482 | Proceedings of EMNLP, pages 2533-2547
Zhihong Shao, Fei Huang, Minlie Huang
Given that rich information is hidden behind ubiquitous numbers in text, numerical reasoning over text should be an essential skill of AI systems. To derive precise equations for solving numerical reasoning problems, previous work focused on modeling the structure of equations and proposed various structured decoders. Though structure modeling proves effective, these structured decoders construct a single equation in a pre-defined autoregressive order, potentially placing an unnecessary restriction on how a model should grasp the reasoning process. Intuitively, humans may have numerous thoughts popping up in no pre-defined order; thoughts are not limited to the problem at hand and can even concern other related problems. By comparing diverse thoughts and chaining relevant pieces, humans are less prone to errors. In this paper, we take this inspiration and propose CANTOR, a numerical reasoner that models reasoning steps using a directed acyclic graph: it produces diverse reasoning steps simultaneously, without pre-defined decoding dependencies, and compares and chains relevant ones to reach a solution. Extensive experiments demonstrate the effectiveness of CANTOR under both fully-supervised and weakly-supervised settings.
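As a toy illustration of the directed-acyclic-graph view, the sketch below treats candidate steps as nodes with no decoding order and chains whichever steps have their inputs available; the Step structure and greedy chaining rule are assumptions, not CANTOR's actual decoder:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    op: str          # e.g. "add", "mul"
    inputs: tuple    # operand names; may refer to outputs of other steps
    output: str      # the quantity this step defines

def chain(steps, known, goal):
    """Connect steps whose inputs are already derivable, in no pre-defined order."""
    derived, ordered = set(known), []
    changed = True
    while changed and goal not in derived:
        changed = False
        for s in steps:
            if s.output not in derived and all(i in derived for i in s.inputs):
                ordered.append(s)
                derived.add(s.output)
                changed = True
    return ordered if goal in derived else None

# Two candidate thoughts produced simultaneously, then chained to the answer.
steps = [Step("add", ("a", "b"), "s1"), Step("mul", ("s1", "c"), "ans")]
print(chain(steps, known={"a", "b", "c"}, goal="ans"))
```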
{"title":"Chaining Simultaneous Thoughts for Numerical Reasoning","authors":"Zhihong Shao, Fei Huang, Minlie Huang","doi":"10.48550/arXiv.2211.16482","DOIUrl":"https://doi.org/10.48550/arXiv.2211.16482","url":null,"abstract":"Given that rich information is hidden behind ubiquitous numbers in text, numerical reasoning over text should be an essential skill of AI systems. To derive precise equations to solve numerical reasoning problems, previous work focused on modeling the structures of equations, and has proposed various structured decoders. Though structure modeling proves to be effective, these structured decoders construct a single equation in a pre-defined autoregressive order, potentially placing an unnecessary restriction on how a model should grasp the reasoning process. Intuitively, humans may have numerous pieces of thoughts popping up in no pre-defined order; thoughts are not limited to the problem at hand, and can even be concerned with other related problems. By comparing diverse thoughts and chaining relevant pieces, humans are less prone to errors. In this paper, we take this inspiration and propose CANTOR, a numerical reasoner that models reasoning steps using a directed acyclic graph where we produce diverse reasoning steps simultaneously without pre-defined decoding dependencies, and compare and chain relevant ones to reach a solution. Extensive experiments demonstrated the effectiveness of CANTOR under both fully-supervised and weakly-supervised settings.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"19 1","pages":"2533-2547"},"PeriodicalIF":0.0,"publicationDate":"2022-11-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81861306","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Controlled Language Generation for Language Learning Items
Pub Date: 2022-11-28 | DOI: 10.48550/arXiv.2211.15731 | Proceedings of EMNLP, pages 294-305
Kevin Stowe, Debanjan Ghosh, Mengxuan Zhao
This work aims to employ natural language generation (NLG) to rapidly generate items for English language learning applications. This requires both language models capable of generating fluent, high-quality English and methods for controlling the generated output to match the requirements of the relevant items. We experiment with deep pretrained models for this task, developing novel methods for controlling items along factors relevant to language learning: diverse sentences for different proficiency levels, and argument structure to test grammar. Human evaluation demonstrates high grammaticality scores for all models (3.4 and above out of 4), and 24% higher length and 9% higher complexity than the baseline for the advanced proficiency model. Our results show that we can achieve strong performance while adding additional control to ensure diverse, tailored content for individual users.
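One common way to realize this kind of control is sketched below, assuming a seq2seq model conditioned on control tokens; the token names, prompt format, and t5-small checkpoint are illustrative, not the paper's setup, and in practice the control tokens would be added to the tokenizer and learned during fine-tuning:

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

# Hypothetical control prefix: target proficiency level and argument structure.
prompt = "<level_advanced> <arg_transitive> generate sentence: the committee"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```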
{"title":"Controlled Language Generation for Language Learning Items","authors":"Kevin Stowe, Debanjan Ghosh, Mengxuan Zhao","doi":"10.48550/arXiv.2211.15731","DOIUrl":"https://doi.org/10.48550/arXiv.2211.15731","url":null,"abstract":"This work aims to employ natural language generation (NLG) to rapidly generate items for English language learning applications: this requires both language models capable of generating fluent, high-quality English, and to control the output of the generation to match the requirements of the relevant items. We experiment with deep pretrained models for this task, developing novel methods for controlling items for factors relevant in language learning: diverse sentences for different proficiency levels and argument structure to test grammar. Human evaluation demonstrates high grammatically scores for all models (3.4 and above out of 4), and higher length (24%) and complexity (9%) over the baseline for the advanced proficiency model. Our results show that we can achieve strong performance while adding additional control to ensure diverse, tailored content for individual users.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"63 1","pages":"294-305"},"PeriodicalIF":0.0,"publicationDate":"2022-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80121969","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mutual Exclusivity Training and Primitive Augmentation to Induce Compositionality
Pub Date: 2022-11-28 | DOI: 10.48550/arXiv.2211.15578 | Proceedings of EMNLP, pages 11778-11793
Yichen Jiang, Xiang Zhou, Mohit Bansal
Recent datasets expose the lack of systematic generalization ability in standard sequence-to-sequence models. In this work, we analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias (one target sequence can be mapped to only one source sequence), and the tendency to memorize whole examples rather than separating structure from content. We propose two techniques to address these issues respectively: Mutual Exclusivity Training, which prevents the model from producing previously seen generations when facing novel examples via an unlikelihood-based loss, and prim2primX data augmentation, which automatically diversifies the arguments of every syntactic function to prevent memorization and provide a compositional inductive bias without exposing test-set data. Combining these two techniques, we show substantial empirical improvements using standard sequence-to-sequence models (LSTMs and Transformers) on two widely-used compositionality datasets: SCAN and COGS. Finally, we provide analysis characterizing the improvements as well as the remaining challenges, and provide detailed ablations of our method.
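A toy sketch of prim2primX-style augmentation on SCAN-like pairs: a primitive's argument is swapped for a fresh placeholder in both source and target so the model cannot memorize whole examples; the placeholder naming scheme is an assumption:

```python
def prim2primx(src: str, tgt: str, primitive: str, action: str, k: int = 2):
    """Yield k augmented variants replacing `primitive`/`action` with new symbols."""
    for i in range(k):
        yield (src.replace(primitive, f"{primitive}_x{i}"),
               tgt.replace(action, f"{action}_X{i}"))

for s, t in prim2primx("jump twice", "JUMP JUMP", "jump", "JUMP"):
    print(s, "->", t)   # jump_x0 twice -> JUMP_X0 JUMP_X0, ...
```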
{"title":"Mutual Exclusivity Training and Primitive Augmentation to Induce Compositionality","authors":"Yichen Jiang, Xiang Zhou, Mohit Bansal","doi":"10.48550/arXiv.2211.15578","DOIUrl":"https://doi.org/10.48550/arXiv.2211.15578","url":null,"abstract":"Recent datasets expose the lack of the systematic generalization ability in standard sequence-to-sequence models. In this work, we analyze this behavior of seq2seq models and identify two contributing factors: a lack of mutual exclusivity bias (one target sequence can only be mapped to one source sequence), and the tendency to memorize whole examples rather than separating structures from contents. We propose two techniques to address these two issues respectively: Mutual Exclusivity Training that prevents the model from producing seen generations when facing novel examples via an unlikelihood-based loss, and prim2primX data augmentation that automatically diversifies the arguments of every syntactic function to prevent memorizing and provide a compositional inductive bias without exposing test-set data. Combining these two techniques, we show substantial empirical improvements using standard sequence-to-sequence models (LSTMs and Transformers) on two widely-used compositionality datasets: SCAN and COGS. Finally, we provide analysis characterizing the improvements as well as the remaining challenges, and provide detailed ablations of our method.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"128 8 1","pages":"11778-11793"},"PeriodicalIF":0.0,"publicationDate":"2022-11-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77313549","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding
Pub Date: 2022-11-27 | DOI: 10.48550/arXiv.2211.14958 | Proceedings of EMNLP, pages 3984-3993
Zilong Wang, Jiuxiang Gu, Chris Tensmeyer, Nikolaos Barmpalios, A. Nenkova, Tong Sun, Jingbo Shang, Vlad I. Morariu
Document images are a ubiquitous source of data in which text is organized in a complex hierarchical structure, ranging from fine granularity (e.g., words) through medium granularity (e.g., regions such as paragraphs or figures) to coarse granularity (e.g., the whole page). The spatial hierarchical relationships between content at different levels of granularity are crucial for document image understanding tasks. Existing methods learn features at either the word level or the region level but fail to consider both simultaneously. Word-level models are restricted by the fact that they originate from pure-text language models, which only encode word-level context. In contrast, region-level models attempt to encode regions corresponding to paragraphs or text blocks into a single embedding, but they perform worse with additional word-level features. To deal with these issues, we propose MGDoc, a new multi-modal multi-granular pre-training framework that encodes page-level, region-level, and word-level information at the same time. MGDoc uses a unified text-visual encoder to obtain multi-modal features across different granularities, which makes it possible to project the multi-granular features into the same hyperspace. To model the region-word correlation, we design a cross-granular attention mechanism and specific pre-training tasks that reinforce the model's learning of the hierarchy between regions and words. Experiments demonstrate that our proposed model learns better features that perform well across granularities and lead to improvements in downstream tasks.
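A minimal sketch of cross-granular attention, assuming word-level features attend over region-level features so each word sees the hierarchy level above it; dimensions, head counts, and sequence lengths are assumptions, not MGDoc's actual architecture:

```python
import torch
import torch.nn as nn

dim = 256
cross_attn = nn.MultiheadAttention(embed_dim=dim, num_heads=8, batch_first=True)

words = torch.randn(1, 128, dim)    # word-level features for one page
regions = torch.randn(1, 16, dim)   # region-level (paragraph/figure) features

# Each word queries the regions, injecting coarse-granular context.
words_in_context, _ = cross_attn(query=words, key=regions, value=regions)
print(words_in_context.shape)       # torch.Size([1, 128, 256])
```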
{"title":"MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding","authors":"Zilong Wang, Jiuxiang Gu, Chris Tensmeyer, Nikolaos Barmpalios, A. Nenkova, Tong Sun, Jingbo Shang, Vlad I. Morariu","doi":"10.48550/arXiv.2211.14958","DOIUrl":"https://doi.org/10.48550/arXiv.2211.14958","url":null,"abstract":"Document images are a ubiquitous source of data where the text is organized in a complex hierarchical structure ranging from fine granularity (e.g., words), medium granularity (e.g., regions such as paragraphs or figures), to coarse granularity (e.g., the whole page). The spatial hierarchical relationships between content at different levels of granularity are crucial for document image understanding tasks. Existing methods learn features from either word-level or region-level but fail to consider both simultaneously. Word-level models are restricted by the fact that they originate from pure-text language models, which only encode the word-level context. In contrast, region-level models attempt to encode regions corresponding to paragraphs or text blocks into a single embedding, but they perform worse with additional word-level features. To deal with these issues, we propose MGDoc, a new multi-modal multi-granular pre-training framework that encodes page-level, region-level, and word-level information at the same time. MGDoc uses a unified text-visual encoder to obtain multi-modal features across different granularities, which makes it possible to project the multi-granular features into the same hyperspace. To model the region-word correlation, we design a cross-granular attention mechanism and specific pre-training tasks for our model to reinforce the model of learning the hierarchy between regions and words. Experiments demonstrate that our proposed model can learn better features that perform well across granularities and lead to improvements in downstream tasks.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"12 1","pages":"3984-3993"},"PeriodicalIF":0.0,"publicationDate":"2022-11-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76004242","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Towards Better Document-level Relation Extraction via Iterative Inference
Pub Date: 2022-11-26 | DOI: 10.48550/arXiv.2211.14470 | Proceedings of EMNLP, pages 8306-8317
L. Zhang, Jinsong Su, Yidong Chen, Zhongjian Miao, Zijun Min, Qingguo Hu, X. Shi
Document-level relation extraction (RE) aims to extract the relations between entities in an input document, which usually contains many hard-to-predict entity pairs whose relations can only be obtained through relational inference. Existing methods usually predict the relations of all entity pairs in the input document directly, in a one-pass manner, ignoring the fact that predictions for some entity pairs heavily depend on the predicted results of other pairs. To deal with this issue, we propose a novel document-level RE model with iterative inference. Our model is mainly composed of two modules: 1) a base module that provides preliminary relation predictions for entity pairs; and 2) an inference module that refines these preliminary predictions by iteratively dealing with hard-to-predict entity pairs, depending on other pairs, in an easy-to-hard manner. Unlike previous methods, which only consider the feature information of entity pairs, our inference module is equipped with two Extended Cross Attention units, allowing it to exploit both feature information and previous predictions of entity pairs during relational inference. Furthermore, we adopt a two-stage strategy to train our model: in the first stage, we train only the base module; in the second stage, we train the whole model, introducing contrastive learning to enhance the training of the inference module. Experimental results on three commonly-used datasets show that our model consistently outperforms competitive baselines.
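A schematic sketch of the iterative-refinement idea: each round re-scores entity pairs conditioned on the previous round's predictions. Plain linear layers stand in for the paper's Extended Cross Attention units; shapes, names, and the number of rounds are assumptions:

```python
import torch
import torch.nn as nn

class IterativeRE(nn.Module):
    def __init__(self, dim: int, n_labels: int, rounds: int = 3):
        super().__init__()
        self.base = nn.Linear(dim, n_labels)               # preliminary predictions
        self.refine = nn.Linear(dim + n_labels, n_labels)  # conditions on prior round
        self.rounds = rounds

    def forward(self, pair_feats):                         # (n_pairs, dim)
        logits = self.base(pair_feats)
        for _ in range(self.rounds):
            prior = logits.softmax(-1)                     # previous-round predictions
            logits = self.refine(torch.cat([pair_feats, prior], dim=-1))
        return logits

model = IterativeRE(dim=256, n_labels=10)
print(model(torch.randn(32, 256)).shape)                   # torch.Size([32, 10])
```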
{"title":"Towards Better Document-level Relation Extraction via Iterative Inference","authors":"L. Zhang, Jinsong Su, Yidong Chen, Zhongjian Miao, Zijun Min, Qingguo Hu, X. Shi","doi":"10.48550/arXiv.2211.14470","DOIUrl":"https://doi.org/10.48550/arXiv.2211.14470","url":null,"abstract":"Document-level relation extraction (RE) aims to extract the relations between entities from the input document that usually containing many difficultly-predicted entity pairs whose relations can only be predicted through relational inference. Existing methods usually directly predict the relations of all entity pairs of input document in a one-pass manner, ignoring the fact that predictions of some entity pairs heavily depend on the predicted results of other pairs. To deal with this issue, in this paper, we propose a novel document-level RE model with iterative inference. Our model is mainly composed of two modules: 1) a base module expected to provide preliminary relation predictions on entity pairs; 2) an inference module introduced to refine these preliminary predictions by iteratively dealing with difficultly-predicted entity pairs depending on other pairs in an easy-to-hard manner. Unlike previous methods which only consider feature information of entity pairs, our inference module is equipped with two Extended Cross Attention units, allowing it to exploit both feature information and previous predictions of entity pairs during relational inference. Furthermore, we adopt a two-stage strategy to train our model. At the first stage, we only train our base module. During the second stage, we train the whole model, where contrastive learning is introduced to enhance the training of inference module. Experimental results on three commonly-used datasets show that our model consistently outperforms other competitive baselines.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"21 1","pages":"8306-8317"},"PeriodicalIF":0.0,"publicationDate":"2022-11-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78812023","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
CodeExp: Explanatory Code Document Generation
Pub Date: 2022-11-25 | DOI: 10.48550/arXiv.2211.15395 | Proceedings of EMNLP, pages 2342-2354
Haotian Cui, Chenglong Wang, Junjie Huang, J. Inala, Todd Mytkowicz, Bolong Wang, Jian Gao, Nan Duan
Developing models that can automatically generate detailed code explanations can greatly benefit software maintenance and programming education. However, existing code-to-text generation models often produce only high-level summaries of code that do not capture the implementation-level choices essential for these scenarios. To fill this gap, we propose the code explanation generation task. We first conducted a human study to identify the criteria for high-quality explanatory docstrings for code. Based on that, we collected and refined a large-scale code-docstring corpus and formulated automatic evaluation metrics that best match human assessments. Finally, we present a multi-stage fine-tuning strategy and baseline models for the task. Our experiments show that (1) our refined training dataset lets models achieve better performance on the explanation generation task than a 15x larger unrefined dataset, and (2) fine-tuned models can generate well-structured long docstrings comparable to human-written ones. We envision that our training dataset, human-evaluation protocol, recommended metrics, and fine-tuning strategy can boost future code explanation research. The code and annotated data are available at https://github.com/subercui/CodeExp.
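A schematic sketch of a multi-stage schedule of the kind described: pre-fine-tune on the large unrefined corpus, then continue on the refined subset. Dataset names, epoch counts, and learning rates are illustrative assumptions, and the training loop itself is elided:

```python
from dataclasses import dataclass

@dataclass
class Stage:
    dataset: str
    epochs: int
    lr: float

schedule = [
    Stage("code_docstring_unrefined", epochs=1, lr=5e-5),  # broad coverage first
    Stage("code_docstring_refined", epochs=3, lr=1e-5),    # then align to human criteria
]

for stage in schedule:
    print(f"fine-tune on {stage.dataset}: {stage.epochs} epochs @ lr={stage.lr}")
    # train(model, load(stage.dataset), stage.epochs, stage.lr)  # elided
```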
{"title":"CodeExp: Explanatory Code Document Generation","authors":"Haotian Cui, Chenglong Wang, Junjie Huang, J. Inala, Todd Mytkowicz, Bolong Wang, Jian Gao, Nan Duan","doi":"10.48550/arXiv.2211.15395","DOIUrl":"https://doi.org/10.48550/arXiv.2211.15395","url":null,"abstract":"Developing models that can automatically generate detailed code explanation can greatly benefit software maintenance and programming education. However, existing code-to-text generation models often produce only high-level summaries of code that do not capture implementation-level choices essential for these scenarios. To fill in this gap, we propose the code explanation generation task. We first conducted a human study to identify the criteria for high-quality explanatory docstring for code. Based on that, we collected and refined a large-scale code docstring corpus and formulated automatic evaluation metrics that best match human assessments. Finally, we present a multi-stage fine-tuning strategy and baseline models for the task. Our experiments show that (1) our refined training dataset lets models achieve better performance in the explanation generation tasks compared to larger unrefined data (15x larger), and (2) fine-tuned models can generate well-structured long docstrings comparable to human-written ones. We envision our training dataset, human-evaluation protocol, recommended metrics, and fine-tuning strategy can boost future code explanation research. The code and annotated data are available at https://github.com/subercui/CodeExp.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"62 1","pages":"2342-2354"},"PeriodicalIF":0.0,"publicationDate":"2022-11-25","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73050221","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
TSGP: Two-Stage Generative Prompting for Unsupervised Commonsense Question Answering
Pub Date: 2022-11-24 | DOI: 10.48550/arXiv.2211.13515 | Proceedings of EMNLP, pages 968-980
Yueqing Sun, Yu Zhang, Le Qi, Qi Shi
Unsupervised commonsense question answering requires mining effective commonsense knowledge without relying on labeled task data. Previous methods typically retrieved knowledge from traditional knowledge bases or used pre-trained language models (PrLMs) to generate fixed types of knowledge, which have poor generalization ability. In this paper, we aim to address this limitation by leveraging the implicit knowledge stored in PrLMs, and we propose a two-stage prompt-based unsupervised commonsense question answering framework (TSGP). Specifically, we first use knowledge generation prompts to generate the knowledge required for questions, without restricting it to fixed types. Then, we use answer generation prompts to generate possible candidate answers independent of the specified choices. Experimental results and analysis on three different commonsense reasoning tasks, CommonsenseQA, OpenBookQA, and SocialIQA, demonstrate that TSGP significantly improves the reasoning ability of language models in unsupervised settings. Our code is available at https://github.com/Yueqing-Sun/TSGP.
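A minimal sketch of the two-stage prompting flow, assuming a generic left-to-right PrLM via the Hugging Face pipeline (gpt2 here as a stand-in) and assumed prompt wording; the paper's exact prompts and model may differ:

```python
from transformers import pipeline

lm = pipeline("text-generation", model="gpt2")

def generate(text: str) -> str:
    out = lm(text, max_new_tokens=30, do_sample=True)[0]["generated_text"]
    return out[len(text):].strip()

def tsgp(question: str) -> str:
    # Stage 1: a knowledge generation prompt elicits free-form knowledge.
    knowledge = generate(f"Question: {question}\nKnowledge:")
    # Stage 2: an answer generation prompt, conditioned on that knowledge,
    # produces a candidate answer independent of the answer choices.
    return generate(f"Question: {question}\nKnowledge: {knowledge}\nAnswer:")

print(tsgp("Where would you most likely find a seashell?"))
```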
{"title":"TSGP: Two-Stage Generative Prompting for Unsupervised Commonsense Question Answering","authors":"Yueqing Sun, Yu Zhang, Le Qi, Qi Shi","doi":"10.48550/arXiv.2211.13515","DOIUrl":"https://doi.org/10.48550/arXiv.2211.13515","url":null,"abstract":"Unsupervised commonsense question answering requires mining effective commonsense knowledge without the rely on the labeled task data. Previous methods typically retrieved from traditional knowledge bases or used pre-trained language models (PrLMs) to generate fixed types of knowledge, which have poor generalization ability. In this paper, we aim to address the above limitation by leveraging the implicit knowledge stored in PrLMs and propose a two-stage prompt-based unsupervised commonsense question answering framework (TSGP). Specifically, we first use knowledge generation prompts to generate the knowledge required for questions with unlimited types and possible candidate answers independent of specified choices. Then, we further utilize answer generation prompts to generate possible candidate answers independent of specified choices. Experimental results and analysis on three different commonsense reasoning tasks, CommonsenseQA, OpenBookQA, and SocialIQA, demonstrate that TSGP significantly improves the reasoning ability of language models in unsupervised settings. Our code is available at: https://github.com/Yueqing-Sun/TSGP.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"71 1","pages":"968-980"},"PeriodicalIF":0.0,"publicationDate":"2022-11-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77006874","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Automatic Generation of Socratic Subquestions for Teaching Math Word Problems
Pub Date: 2022-11-23 | DOI: 10.48550/arXiv.2211.12835 | Proceedings of EMNLP, pages 4136-4149
K. Shridhar, Jakub Macina, Mennatallah El-Assady, Tanmay Sinha, Manu Kapur, Mrinmaya Sachan
Socratic questioning is an educational method in which students discover answers to complex problems by being asked a series of thoughtful questions. Generating didactically sound questions is challenging and requires understanding the reasoning process involved in the problem. We hypothesize that such a questioning strategy can not only enhance human performance but also assist math word problem (MWP) solvers. In this work, we explore the ability of large language models (LMs) to generate sequential questions for guiding math word problem-solving. We propose various guided question generation schemes based on input conditioning and reinforcement learning. In both automatic and human quality evaluations, we find that LMs constrained with desirable question properties generate superior questions and improve the overall performance of a math word problem solver. We conduct a preliminary user study to examine the potential value of such question generation models in the education domain. Results suggest that the difficulty level of problems plays an important role in determining whether questioning improves or hinders human performance. We discuss the future of using such questioning strategies in education.
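As a sketch of the reinforcement-learning side, a REINFORCE-style update could weight each sampled question's log-likelihood by a reward that scores the desired question properties; the reward composition and the simple mean baseline are assumptions, not the paper's exact scheme:

```python
import torch

def reinforce_loss(log_probs: torch.Tensor, rewards: torch.Tensor) -> torch.Tensor:
    """log_probs: (batch,) summed token log-probs of sampled subquestions.
    rewards: (batch,) scalar scores, e.g. fluency + answerability."""
    baseline = rewards.mean()                    # simple variance-reduction baseline
    return -((rewards - baseline) * log_probs).mean()

print(reinforce_loss(torch.randn(4), torch.rand(4)))
```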
{"title":"Automatic Generation of Socratic Subquestions for Teaching Math Word Problems","authors":"K. Shridhar, Jakub Macina, Mennatallah El-Assady, Tanmay Sinha, Manu Kapur, Mrinmaya Sachan","doi":"10.48550/arXiv.2211.12835","DOIUrl":"https://doi.org/10.48550/arXiv.2211.12835","url":null,"abstract":"Socratic questioning is an educational method that allows students to discover answers to complex problems by asking them a series of thoughtful questions. Generation of didactically sound questions is challenging, requiring understanding of the reasoning process involved in the problem. We hypothesize that such questioning strategy can not only enhance the human performance, but also assist the math word problem (MWP) solvers.In this work, we explore the ability of large language models (LMs) in generating sequential questions for guiding math word problem-solving. We propose various guided question generation schemes based on input conditioning and reinforcement learning.On both automatic and human quality evaluations, we find that LMs constrained with desirable question properties generate superior questions and improve the overall performance of a math word problem solver. We conduct a preliminary user study to examine the potential value of such question generation models in the education domain. Results suggest that the difficulty level of problems plays an important role in determining whether questioning improves or hinders human performance. We discuss the future of using such questioning strategies in education.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"83 1","pages":"4136-4149"},"PeriodicalIF":0.0,"publicationDate":"2022-11-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89884272","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}