
Proceedings of COLING. International Conference on Computational Linguistics: Latest Publications

CoLo: A Contrastive Learning Based Re-ranking Framework for One-Stage Summarization
Pub Date : 2022-09-29 DOI: 10.48550/arXiv.2209.14569
Chen An, Ming Zhong, Zhiyong Wu, Qinen Zhu, Xuanjing Huang, Xipeng Qiu
Traditional training paradigms for extractive and abstractive summarization systems typically use only token-level or sentence-level training objectives. However, the output summary is evaluated at the summary level, leading to an inconsistency between training and evaluation. In this paper, we propose CoLo, a contrastive-learning-based re-ranking framework for one-stage summarization. By modeling a contrastive objective, we show that the summarization model is able to directly generate summaries according to the summary-level score without additional modules or parameters. Extensive experiments demonstrate that CoLo boosts the extractive and abstractive results of one-stage systems on the CNN/DailyMail benchmark to 44.58 and 46.33 ROUGE-1, respectively, while preserving parameter efficiency and inference efficiency. Compared with state-of-the-art multi-stage systems, we save more than 100 GPU training hours and obtain a 3x-8x speed-up during inference while maintaining comparable results.
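The summary-level contrastive idea can be illustrated with a minimal sketch: candidate summaries are ranked best-first by a summary-level metric (e.g. ROUGE), and a pairwise margin loss pushes the model to score better candidates higher. The margin value and pure-Python form are illustrative assumptions, not the paper's exact loss.

```python
def margin_ranking_loss(scores, margin=0.01):
    """Contrastive summary-level objective (sketch): `scores` are model
    scores for candidate summaries ordered best-first by ROUGE.  Each
    higher-ranked candidate should score at least `margin * rank_gap`
    above each lower-ranked one; violations accumulate as loss."""
    loss = 0.0
    for i in range(len(scores)):
        for j in range(i + 1, len(scores)):
            loss += max(0.0, scores[j] - scores[i] + margin * (j - i))
    return loss
```

A correctly ordered candidate list incurs zero loss; any inversion contributes a positive penalty, so minimizing this loss aligns training with summary-level evaluation.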
Pages: 5783-5793
Citations: 9
A Coarse-to-fine Cascaded Evidence-Distillation Neural Network for Explainable Fake News Detection
Pub Date : 2022-09-29 DOI: 10.48550/arXiv.2209.14642
Zhiwei Yang, Jing Ma, Hechang Chen, Hongzhan Lin, Ziyang Luo, Yi Chang
Existing fake news detection methods aim to classify a piece of news as true or false and provide veracity explanations, achieving remarkable performance. However, they often tailor automated solutions to manually fact-checked reports, suffering from limited news coverage and debunking delays. When a piece of news has not yet been fact-checked or debunked, a certain amount of relevant raw reporting is usually disseminated across various media outlets, carrying the wisdom of crowds to verify the news claim and explain its verdict. In this paper, we propose a novel Coarse-to-fine Cascaded Evidence-Distillation (CofCED) neural network for explainable fake news detection based on such raw reports, alleviating the dependency on fact-checked ones. Specifically, we first utilize a hierarchical encoder for web text representation, and then develop two cascaded selectors that select the most explainable sentences for verdicts on top of the selected top-K reports in a coarse-to-fine manner. Besides, we construct two explainable fake news datasets, which are publicly available. Experimental results demonstrate that our model significantly outperforms state-of-the-art detection baselines and generates high-quality explanations from diverse evaluation perspectives.
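The coarse-to-fine cascade can be sketched as two ranking passes: first rank raw reports against the claim, then rank sentences inside the surviving top-K reports. Here simple word overlap stands in for the paper's learned hierarchical encoder and cascaded selectors.

```python
def coarse_to_fine_select(claim, reports, k_reports=2, k_sents=2):
    """Coarse-to-fine selection sketch: (1) coarse pass keeps the top-K
    reports most relevant to the claim; (2) fine pass picks the most
    claim-relevant sentences within them.  Word overlap is a stand-in
    scorer for the learned selectors."""
    def overlap(a, b):
        return len(set(a.lower().split()) & set(b.lower().split()))
    top = sorted(reports, key=lambda r: overlap(claim, r), reverse=True)[:k_reports]
    sents = [s.strip() for r in top for s in r.split(".") if s.strip()]
    return sorted(sents, key=lambda s: overlap(claim, s), reverse=True)[:k_sents]
```

The cascade keeps the fine-grained pass cheap: sentence scoring only runs over the few reports that survive the coarse pass.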
Pages: 2608-2621
Citations: 5
Generate-and-Retrieve: Use Your Predictions to Improve Retrieval for Semantic Parsing
Pub Date : 2022-09-29 DOI: 10.48550/arXiv.2209.14899
Yury Zemlyanskiy, Michiel de Jong, J. Ainslie, Panupong Pasupat, Peter Shaw, Linlu Qiu, Sumit K. Sanghai, Fei Sha
A common recent approach to semantic parsing augments sequence-to-sequence models by retrieving and appending a set of training samples, called exemplars. The effectiveness of this recipe is limited by the ability to retrieve informative exemplars that help produce the correct parse, which is especially challenging in low-resource settings. Existing retrieval is commonly based on similarity of query and exemplar inputs. We propose GandR, a retrieval procedure that retrieves exemplars whose outputs are also similar. GandR first generates a preliminary prediction with input-based retrieval. Then, it retrieves exemplars with outputs similar to the preliminary prediction, which are used to generate a final prediction. GandR sets the state of the art on multiple low-resource semantic parsing tasks.
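The two-round procedure can be sketched as follows. The exemplar data and the `generate` stub are hypothetical placeholders for a trained seq2seq parser; word overlap stands in for a real retriever.

```python
def word_overlap(a, b):
    """Toy similarity: number of shared whitespace-separated tokens."""
    return len(set(a.split()) & set(b.split()))

def gandr(query, exemplars, generate, k=1):
    """Generate-and-Retrieve sketch.  `exemplars` are (input, output)
    pairs; `generate` is any model stub producing a preliminary parse.
    Round 1 retrieves by input similarity; round 2 re-retrieves by
    similarity of exemplar *outputs* to the preliminary prediction."""
    round1 = sorted(exemplars, key=lambda ex: word_overlap(query, ex[0]),
                    reverse=True)[:k]
    prelim = generate(query, round1)
    return sorted(exemplars, key=lambda ex: word_overlap(prelim, ex[1]),
                  reverse=True)[:k]
```

The second round is the key idea: even when the query wording matches the wrong exemplars, exemplars whose *outputs* resemble the preliminary parse tend to be structurally informative for the final prediction.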
Pages: 4946-4951
Citations: 7
Human-in-the-loop Robotic Grasping Using BERT Scene Representation
Pub Date : 2022-09-28 DOI: 10.48550/arXiv.2209.14026
Yaoxian Song, Penglei Sun, Pengfei Fang, Linyi Yang, Yanghua Xiao, Yue Zhang
Current NLP techniques have been widely applied across different domains. In this paper, we propose a human-in-the-loop framework for robotic grasping in cluttered scenes, investigating a language interface to the grasping process that allows the user to intervene with natural language commands. This framework is built on a state-of-the-art grasping baseline, where we substitute a scene-graph representation with a text representation of the scene using BERT. Experiments in simulation and on a physical robot show that the proposed method outperforms conventional object-agnostic and scene-graph-based methods in the literature. In addition, we find that with human intervention, performance can be significantly improved. Our dataset and code are available on our project website https://sites.google.com/view/hitl-grasping-bert.
Pages: 2992-3006
Citations: 2
Assessing Digital Language Support on a Global Scale
Pub Date : 2022-09-27 DOI: 10.48550/arXiv.2209.13515
Gary F. Simons, Abbey L. Thomas, Chad White
The users of endangered languages struggle to thrive in a digitally-mediated world. We have developed an automated method for assessing how well every language recognized by ISO 639 is faring in terms of digital language support. The assessment is based on scraping the names of supported languages from the websites of 143 digital tools selected to represent a full range of ways that digital technology can support languages. The method uses Mokken scale analysis to produce an explainable model for quantifying digital language support and monitoring it on a global scale.
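Mokken scale analysis builds on pairwise Loevinger H coefficients over binary items. A minimal sketch for binary support indicators (tool supports a language: 1/0) is below; the covariance-ratio formulation is standard, but applying it to language-support data this way is an illustrative assumption, not the paper's full procedure.

```python
def loevinger_h(x, y):
    """Pairwise Loevinger/Mokken H for two binary items: observed
    covariance divided by the maximum covariance attainable given the
    item marginals.  H = 1 on a perfect Guttman pattern (the "harder"
    item is only positive when the "easier" one is too)."""
    n = len(x)
    p_x, p_y = sum(x) / n, sum(y) / n
    p_xy = sum(a and b for a, b in zip(x, y)) / n
    return (p_xy - p_x * p_y) / (min(p_x, p_y) - p_x * p_y)
```

High pairwise H values across tools indicate that digital language support forms a cumulative scale, which is what makes a single explainable support score per language meaningful.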
Pages: 4299-4305
Citations: 6
LOViS: Learning Orientation and Visual Signals for Vision and Language Navigation
Pub Date : 2022-09-26 DOI: 10.48550/arXiv.2209.12723
Yue Zhang, Parisa Kordjamshidi
Understanding spatial and visual information is essential for a navigation agent that follows natural language instructions. Current Transformer-based VLN agents entangle orientation and vision information, which limits the gain from learning each information source. In this paper, we design a neural agent with explicit Orientation and Vision modules. Those modules learn to ground spatial information and landmark mentions in the instructions to the visual environment more effectively. To strengthen the spatial reasoning and visual perception of the agent, we design specific pre-training tasks to feed and better utilize the corresponding modules in our final navigation model. We evaluate our approach on both the Room2room (R2R) and Room4room (R4R) datasets and achieve state-of-the-art results on both benchmarks.
Pages: 5745-5754
Citations: 2
Conversational QA Dataset Generation with Answer Revision
Pub Date : 2022-09-23 DOI: 10.48550/arXiv.2209.11396
Seonjeong Hwang, G. G. Lee
Conversational question-answer generation is a task that automatically generates a large-scale conversational question answering dataset based on input passages. In this paper, we introduce a novel framework that extracts question-worthy phrases from a passage and then generates corresponding questions considering previous conversations. In particular, our framework revises the extracted answers after generating questions so that answers exactly match the paired questions. Experimental results show that our simple answer revision approach leads to a significant improvement in the quality of synthetic data. Moreover, we show that our framework can be effectively utilized for domain adaptation of conversational question answering.
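The revision step can be sketched as a three-stage pipeline with stubbed models: extract draft answer phrases, generate a question per draft, then re-answer the generated question so the stored answer exactly matches it. All three model stubs are hypothetical placeholders, not the paper's actual components.

```python
def generate_conversational_pairs(passage, extract_answers,
                                  gen_question, answer_question):
    """QA-pair generation sketch with answer revision: the draft answer
    seeds question generation, but the *revised* answer (re-extracted
    for the generated question) is what gets stored, keeping each pair
    internally consistent."""
    pairs = []
    for draft in extract_answers(passage):
        question = gen_question(passage, draft)
        revised = answer_question(passage, question)  # answer-revision step
        pairs.append((question, revised))
    return pairs
```

Without the revision call, a generated question may be answerable only by a longer or differently bounded span than the draft, producing mismatched synthetic pairs.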
Pages: 1636-1644
Citations: 2
ET5: A Novel End-to-end Framework for Conversational Machine Reading Comprehension
Pub Date : 2022-09-23 DOI: 10.48550/arXiv.2209.11484
Xiao Zhang, Heyan Huang, Zewen Chi, Xian-Ling Mao
Conversational machine reading comprehension (CMRC) aims to assist computers in understanding a natural language text and then engaging in a multi-turn conversation to answer questions related to the text. Existing methods typically require three steps: (1) decision making based on entailment reasoning; (2) span extraction, if required by the above decision; (3) question rephrasing based on the extracted span. However, for nearly all these methods, the span extraction and question rephrasing steps cannot fully exploit the fine-grained entailment reasoning information from the decision-making step because of their relative independence, which further enlarges the information gap between decision making and question rephrasing. To tackle this problem, we propose a novel end-to-end framework for conversational machine reading comprehension based on a shared-parameter mechanism, called entailment reasoning T5 (ET5). Despite its lightweight design, experimental results show that the proposed ET5 achieves new state-of-the-art results on the ShARC leaderboard with a BLEU-4 score of 55.2. Our model and code are publicly available.
Pages: 570-579
Citations: 2
MetaPrompting: Learning to Learn Better Prompts
Pub Date : 2022-09-23 DOI: 10.48550/arXiv.2209.11486
Yutai Hou, Hongyuan Dong, Xinghao Wang, Bohan Li, Wanxiang Che
Prompting is regarded as one of the crucial advances in few-shot natural language processing. Recent research on prompting has moved from discrete-token-based "hard prompts" to continuous "soft prompts", which employ learnable vectors as pseudo prompt tokens and achieve better performance. Though showing promising prospects, these soft-prompting methods are observed to rely heavily on good initialization to take effect. Unfortunately, obtaining a perfect initialization for soft prompts requires an understanding of the language model's inner workings along with elaborate design, which is no easy task and must be redone from scratch for each new task. To remedy this, we propose a generalized soft-prompting method called MetaPrompting, which adopts the well-recognized model-agnostic meta-learning algorithm to automatically find a better prompt initialization that facilitates fast adaptation to new prompting tasks. Extensive experiments show that MetaPrompting tackles the soft-prompt initialization problem and brings significant improvement on three different datasets (over 6 points of accuracy improvement in the 1-shot setting), achieving new state-of-the-art performance.
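The meta-learned initialization idea can be sketched with a first-order MAML loop over a one-dimensional "soft prompt" parameter and toy quadratic per-task losses. The scalar parameter, loss family, and learning rates are illustrative assumptions; the paper operates on full prompt-embedding vectors with real task losses.

```python
def maml_prompt_init(tasks, init=0.0, inner_lr=0.1, outer_lr=0.05, steps=200):
    """First-order MAML sketch for a 1-D soft-prompt parameter p.
    Each task t defines a loss (p - t)^2; the meta-update moves the
    shared initialization toward a point from which one inner gradient
    step adapts well to any task."""
    p = init
    for _ in range(steps):
        meta_grad = 0.0
        for t in tasks:
            adapted = p - inner_lr * 2 * (p - t)  # one inner-loop step
            meta_grad += 2 * (adapted - t)        # first-order outer gradient
        p -= outer_lr * meta_grad / len(tasks)
    return p
```

For this toy loss family the meta-initialization converges to the task mean, the point equidistant from all task optima, which is exactly the "fast-adaptation" property the paper seeks for prompt vectors.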
Pages: 3251-3262
Citations: 8
Semantically Consistent Data Augmentation for Neural Machine Translation via Conditional Masked Language Model
Pub Date : 2022-09-22 DOI: 10.48550/arXiv.2209.10875
Qiao Cheng, Jin Huang, Yitao Duan
This paper introduces a new data augmentation method for neural machine translation that can enforce stronger semantic consistency both within and across languages. Our method is based on a Conditional Masked Language Model (CMLM), which is bi-directional and can condition on both left and right context, as well as on the label. We demonstrate that the CMLM is a good technique for generating context-dependent word distributions. In particular, we show that the CMLM is capable of enforcing semantic consistency by conditioning on both source and target during substitution. In addition, to enhance diversity, we incorporate the idea of soft word substitution for data augmentation, which replaces a word with a probabilistic distribution over the vocabulary. Experiments on four translation datasets of different scales show that the overall solution results in more realistic data augmentation and better translation quality. Our approach consistently achieves the best performance in comparison with strong recent works and yields improvements of up to 1.90 BLEU points over the baseline.
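Soft word substitution can be illustrated in miniature: rather than hard-swapping a token, the position is represented by the expected embedding under a vocabulary distribution (which the paper obtains from the CMLM; here the distribution is given directly, and one-hot vectors stand in for real embeddings).

```python
def soft_substitute(vocab, dist):
    """Soft word substitution sketch: mix (here one-hot) word embeddings
    weighted by a predicted distribution over the vocabulary, yielding a
    soft replacement vector instead of a single hard-substituted token."""
    index = {w: i for i, w in enumerate(vocab)}
    mixed = [0.0] * len(vocab)
    for word, p in dist.items():
        mixed[index[word]] += p  # one-hot embedding scaled by probability
    return mixed
```

With real embeddings the same weighted sum produces a vector between plausible substitutes, increasing augmentation diversity without committing to a single, possibly semantics-breaking, replacement.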
Pages: 5148-5157
Citations: 1