
Latest publications in Transactions of the Association for Computational Linguistics

Modeling Non-Cooperative Dialogue: Theoretical and Empirical Insights
IF 10.9 | CAS Tier 1, Computer Science | Q2, Computer Science, Artificial Intelligence | Pub Date: 2022-07-15 | DOI: 10.1162/tacl_a_00507
Anthony Sicilia, Tristan D. Maidment, Pat Healy, Malihe Alikhani
Abstract Investigating cooperativity of interlocutors is central in studying pragmatics of dialogue. Models of conversation that only assume cooperative agents fail to explain the dynamics of strategic conversations. Thus, we investigate the ability of agents to identify non-cooperative interlocutors while completing a concurrent visual-dialogue task. Within this novel setting, we study the optimality of communication strategies for achieving this multi-task objective. We use the tools of learning theory to develop a theoretical model for identifying non-cooperative interlocutors and apply this theory to analyze different communication strategies. We also introduce a corpus of non-cooperative conversations about images in the GuessWhat?! dataset proposed by De Vries et al. (2017). We use reinforcement learning to implement multiple communication strategies in this context and find that empirical results validate our theory.
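To make the identification problem concrete, the toy heuristic below flags an interlocutor as non-cooperative when their yes/no answers to repeated questions contradict each other too often. This is my own illustration, not the authors' learning-theoretic model or their reinforcement-learned strategies.

```python
# Hypothetical illustration only: a crude contradiction-rate detector for a
# GuessWhat?!-style dialogue, not the model proposed in the paper.
from typing import List, Tuple

def contradiction_rate(answers: List[Tuple[str, bool]]) -> float:
    """answers: (question, yes/no reply) pairs collected during the dialogue."""
    first_reply = {}
    contradictions, repeats = 0, 0
    for question, reply in answers:
        if question in first_reply:
            repeats += 1
            if first_reply[question] != reply:
                contradictions += 1
        else:
            first_reply[question] = reply
    return contradictions / repeats if repeats else 0.0

def looks_non_cooperative(answers: List[Tuple[str, bool]], threshold: float = 0.3) -> bool:
    # The threshold here is arbitrary; the paper analyzes when detection is feasible.
    return contradiction_rate(answers) > threshold
```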
{"title":"Modeling Non-Cooperative Dialogue: Theoretical and Empirical Insights","authors":"Anthony Sicilia, Tristan D. Maidment, Pat Healy, Malihe Alikhani","doi":"10.1162/tacl_a_00507","DOIUrl":"https://doi.org/10.1162/tacl_a_00507","url":null,"abstract":"Abstract Investigating cooperativity of interlocutors is central in studying pragmatics of dialogue. Models of conversation that only assume cooperative agents fail to explain the dynamics of strategic conversations. Thus, we investigate the ability of agents to identify non-cooperative interlocutors while completing a concurrent visual-dialogue task. Within this novel setting, we study the optimality of communication strategies for achieving this multi-task objective. We use the tools of learning theory to develop a theoretical model for identifying non-cooperative interlocutors and apply this theory to analyze different communication strategies. We also introduce a corpus of non-cooperative conversations about images in the GuessWhat?! dataset proposed by De Vries et al. (2017). We use reinforcement learning to implement multiple communication strategies in this context and find that empirical results validate our theory.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"10 1","pages":"1084-1102"},"PeriodicalIF":10.9,"publicationDate":"2022-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43827179","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 2
Getting BART to Ride the Idiomatic Train: Learning to Represent Idiomatic Expressions
IF 10.9 | CAS Tier 1, Computer Science | Q2, Computer Science, Artificial Intelligence | Pub Date: 2022-07-08 | DOI: 10.1162/tacl_a_00510
Ziheng Zeng, S. Bhat
Abstract Idiomatic expressions (IEs), characterized by their non-compositionality, are an important part of natural language. They have been a classical challenge to NLP, including pre-trained language models that drive today’s state-of-the-art. Prior work has identified deficiencies in their contextualized representation stemming from the underlying compositional paradigm of representation. In this work, we take a first-principles approach to build idiomaticity into BART using an adapter as a lightweight non-compositional language expert trained on idiomatic sentences. The improved capability over baselines (e.g., BART) is seen via intrinsic and extrinsic methods, where idiom embeddings score 0.19 points higher in homogeneity score for embedding clustering, and up to 25% higher sequence accuracy on the idiom processing tasks of IE sense disambiguation and span detection.
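As a rough picture of the adapter idea, here is a minimal bottleneck module of the kind commonly attached to frozen transformer sublayers. The hidden size, bottleneck width, and placement are assumptions for illustration, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class IdiomAdapter(nn.Module):
    """Minimal bottleneck-adapter sketch (assumed sizes, not the paper's setup).
    Only this module would be trained on idiomatic sentences; BART stays frozen."""
    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)
        self.up = nn.Linear(bottleneck, hidden_size)
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # The residual connection preserves the pretrained, compositional representation.
        return hidden_states + self.up(self.act(self.down(hidden_states)))
```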
{"title":"Getting BART to Ride the Idiomatic Train: Learning to Represent Idiomatic Expressions","authors":"Ziheng Zeng, S. Bhat","doi":"10.1162/tacl_a_00510","DOIUrl":"https://doi.org/10.1162/tacl_a_00510","url":null,"abstract":"Abstract Idiomatic expressions (IEs), characterized by their non-compositionality, are an important part of natural language. They have been a classical challenge to NLP, including pre-trained language models that drive today’s state-of-the-art. Prior work has identified deficiencies in their contextualized representation stemming from the underlying compositional paradigm of representation. In this work, we take a first-principles approach to build idiomaticity into BART using an adapter as a lightweight non-compositional language expert trained on idiomatic sentences. The improved capability over baselines (e.g., BART) is seen via intrinsic and extrinsic methods, where idiom embeddings score 0.19 points higher in homogeneity score for embedding clustering, and up to 25% higher sequence accuracy on the idiom processing tasks of IE sense disambiguation and span detection.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"10 1","pages":"1120-1137"},"PeriodicalIF":10.9,"publicationDate":"2022-07-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46430433","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 1
Meta-Learning the Difference: Preparing Large Language Models for Efficient Adaptation
IF 10.9 | CAS Tier 1, Computer Science | Q2, Computer Science, Artificial Intelligence | Pub Date: 2022-07-07 | DOI: 10.1162/tacl_a_00517
Zejiang Hou, Julian Salazar, George Polovets
Abstract Large pretrained language models (PLMs) are often domain- or task-adapted via finetuning or prompting. Finetuning requires modifying all of the parameters and having enough data to avoid overfitting while prompting requires no training and few examples but limits performance. Instead, we prepare PLMs for data- and parameter-efficient adaptation by learning to learn the difference between general and adapted PLMs. This difference is expressed in terms of model weights and sublayer structure through our proposed dynamic low-rank reparameterization and learned architecture controller. Experiments on few-shot dialogue completion, low-resource abstractive summarization, and multi-domain language modeling show improvements in adaptation time and performance over direct finetuning or preparation via domain-adaptive pretraining. Ablations show our task-adaptive reparameterization (TARP) and model search (TAMS) components individually improve on other parameter-efficient transfer like adapters and structure-learning methods like learned sparsification.
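The "difference between general and adapted PLMs" expressed through low-rank reparameterization can be sketched as follows; the rank and shapes are illustrative assumptions, not the paper's TARP settings.

```python
import torch
import torch.nn as nn

class LowRankDifference(nn.Module):
    """Hedged sketch: adapted weight = frozen pretrained weight + low-rank delta."""
    def __init__(self, out_features: int, in_features: int, rank: int = 8):
        super().__init__()
        self.U = nn.Parameter(torch.zeros(out_features, rank))
        self.V = nn.Parameter(torch.randn(rank, in_features) * 0.01)

    def adapted_weight(self, frozen_weight: torch.Tensor) -> torch.Tensor:
        # Only U and V are updated during adaptation; frozen_weight is the general PLM.
        return frozen_weight + self.U @ self.V
```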
{"title":"Meta-Learning the Difference: Preparing Large Language Models for Efficient Adaptation","authors":"Zejiang Hou, Julian Salazar, George Polovets","doi":"10.1162/tacl_a_00517","DOIUrl":"https://doi.org/10.1162/tacl_a_00517","url":null,"abstract":"Abstract Large pretrained language models (PLMs) are often domain- or task-adapted via finetuning or prompting. Finetuning requires modifying all of the parameters and having enough data to avoid overfitting while prompting requires no training and few examples but limits performance. Instead, we prepare PLMs for data- and parameter-efficient adaptation by learning to learn the difference between general and adapted PLMs. This difference is expressed in terms of model weights and sublayer structure through our proposed dynamic low-rank reparameterization and learned architecture controller. Experiments on few-shot dialogue completion, low-resource abstractive summarization, and multi-domain language modeling show improvements in adaptation time and performance over direct finetuning or preparation via domain-adaptive pretraining. Ablations show our task-adaptive reparameterization (TARP) and model search (TAMS) components individually improve on other parameter-efficient transfer like adapters and structure-learning methods like learned sparsification.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"10 1","pages":"1249-1265"},"PeriodicalIF":10.9,"publicationDate":"2022-07-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"41946630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 7
The Parallelism Tradeoff: Limitations of Log-Precision Transformers
IF 10.9 | CAS Tier 1, Computer Science | Q2, Computer Science, Artificial Intelligence | Pub Date: 2022-07-02 | DOI: 10.1162/tacl_a_00562
William Cooper Merrill, Ashish Sabharwal
Despite their omnipresence in modern NLP, characterizing the computational power of transformer neural nets remains an interesting open question. We prove that transformers whose arithmetic precision is logarithmic in the number of input tokens (and whose feedforward nets are computable using space linear in their input) can be simulated by constant-depth logspace-uniform threshold circuits. This provides insight on the power of transformers using known results in complexity theory. For example, if L≠P (i.e., not all poly-time problems can be solved using logarithmic space), then transformers cannot even accurately solve linear equalities or check membership in an arbitrary context-free grammar with empty productions. Our result intuitively emerges from the transformer architecture’s high parallelizability. We thus speculatively introduce the idea of a fundamental parallelism tradeoff: any model architecture as parallelizable as the transformer will obey limitations similar to it. Since parallelism is key to training models at massive scale, this suggests a potential inherent weakness of the scaling paradigm.
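The chain of containments behind this argument can be written compactly; the notation below is mine and elides uniformity details.

```latex
% Hedged restatement (my notation): log-precision transformers sit inside
% logspace-uniform TC^0, hence inside L, so under L != P they cannot decide
% P-complete problems such as membership in an arbitrary CFG with empty productions.
\[
  \textsf{log-precision transformers} \;\subseteq\;
  \textsf{logspace-uniform } \mathsf{TC}^0 \;\subseteq\; \mathsf{L},
  \qquad
  \mathsf{L} \neq \mathsf{P} \;\Longrightarrow\;
  \text{no such transformer decides a } \mathsf{P}\text{-complete problem.}
\]
```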
{"title":"The Parallelism Tradeoff: Limitations of Log-Precision Transformers","authors":"William Cooper Merrill, Ashish Sabharwal","doi":"10.1162/tacl_a_00562","DOIUrl":"https://doi.org/10.1162/tacl_a_00562","url":null,"abstract":"Despite their omnipresence in modern NLP, characterizing the computational power of transformer neural nets remains an interesting open question. We prove that transformers whose arithmetic precision is logarithmic in the number of input tokens (and whose feedforward nets are computable using space linear in their input) can be simulated by constant-depth logspace-uniform threshold circuits. This provides insight on the power of transformers using known results in complexity theory. For example, if L≠P (i.e., not all poly-time problems can be solved using logarithmic space), then transformers cannot even accurately solve linear equalities or check membership in an arbitrary context-free grammar with empty productions. Our result intuitively emerges from the transformer architecture’s high parallelizability. We thus speculatively introduce the idea of a fundamental parallelism tradeoff: any model architecture as parallelizable as the transformer will obey limitations similar to it. Since parallelism is key to training models at massive scale, this suggests a potential inherent weakness of the scaling paradigm.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"531-545"},"PeriodicalIF":10.9,"publicationDate":"2022-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46501624","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 11
InSCIt: Information-Seeking Conversations with Mixed-Initiative Interactions
IF 10.9 | CAS Tier 1, Computer Science | Q2, Computer Science, Artificial Intelligence | Pub Date: 2022-07-02 | DOI: 10.1162/tacl_a_00559
Zeqiu Wu, Ryu Parish, Hao Cheng, Sewon Min, Prithviraj Ammanabrolu, Mari Ostendorf, Hannaneh Hajishirzi
In an information-seeking conversation, a user may ask questions that are under-specified or unanswerable. An ideal agent would interact by initiating different response types according to the available knowledge sources. However, most current studies either fail to or artificially incorporate such agent-side initiative. This work presents InSCIt, a dataset for Information-Seeking Conversations with mixed-initiative Interactions. It contains 4.7K user-agent turns from 805 human-human conversations where the agent searches over Wikipedia and either directly answers, asks for clarification, or provides relevant information to address user queries. The data supports two subtasks, evidence passage identification and response generation, as well as a human evaluation protocol to assess model performance. We report results of two systems based on state-of-the-art models of conversational knowledge identification and open-domain question answering. Both systems significantly underperform humans, suggesting ample room for improvement in future studies.1
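To make the mixed-initiative setup concrete, a single agent turn in such a corpus might be represented roughly as below; this record layout is a hypothetical illustration, not the released InSCIt schema.

```python
# Hypothetical record layout (not the actual InSCIt schema): each agent turn
# carries a response type, the Wikipedia evidence found, and the agent response.
example_turn = {
    "conversation_id": "conv-0001",
    "user_utterance": "Which album came first?",
    "response_type": "clarification",   # e.g., direct_answer | clarification | relevant_info
    "evidence_passages": ["<Wikipedia passage id or text>"],
    "agent_response": "Do you mean the band's studio albums or its live albums?",
}
```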
{"title":"InSCIt: Information-Seeking Conversations with Mixed-Initiative Interactions","authors":"Zeqiu Wu, Ryu Parish, Hao Cheng, Sewon Min, Prithviraj Ammanabrolu, Mari Ostendorf, Hannaneh Hajishirzi","doi":"10.1162/tacl_a_00559","DOIUrl":"https://doi.org/10.1162/tacl_a_00559","url":null,"abstract":"In an information-seeking conversation, a user may ask questions that are under-specified or unanswerable. An ideal agent would interact by initiating different response types according to the available knowledge sources. However, most current studies either fail to or artificially incorporate such agent-side initiative. This work presents InSCIt, a dataset for Information-Seeking Conversations with mixed-initiative Interactions. It contains 4.7K user-agent turns from 805 human-human conversations where the agent searches over Wikipedia and either directly answers, asks for clarification, or provides relevant information to address user queries. The data supports two subtasks, evidence passage identification and response generation, as well as a human evaluation protocol to assess model performance. We report results of two systems based on state-of-the-art models of conversational knowledge identification and open-domain question answering. Both systems significantly underperform humans, suggesting ample room for improvement in future studies.1","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"453-468"},"PeriodicalIF":10.9,"publicationDate":"2022-07-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43591966","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 5
Conditional Generation with a Question-Answering Blueprint
IF 10.9 | CAS Tier 1, Computer Science | Q2, Computer Science, Artificial Intelligence | Pub Date: 2022-07-01 | DOI: 10.1162/tacl_a_00583
Shashi Narayan, Joshua Maynez, Reinald Kim Amplayo, Kuzman Ganchev, Annie Louis, Fantine Huot, Dipanjan Das, Mirella Lapata
Abstract The ability to convey relevant and faithful information is critical for many tasks in conditional generation and yet remains elusive for neural seq-to-seq models whose outputs often reveal hallucinations and fail to correctly cover important details. In this work, we advocate planning as a useful intermediate representation for rendering conditional generation less opaque and more grounded. We propose a new conceptualization of text plans as a sequence of question-answer (QA) pairs and enhance existing datasets (e.g., for summarization) with a QA blueprint operating as a proxy for content selection (i.e., what to say) and planning (i.e., in what order). We obtain blueprints automatically by exploiting state-of-the-art question generation technology and convert input-output pairs into input-blueprint-output tuples. We develop Transformer-based models, each varying in how they incorporate the blueprint in the generated output (e.g., as a global plan or iteratively). Evaluation across metrics and datasets demonstrates that blueprint models are more factual than alternatives which do not resort to planning and allow tighter control of the generation output.
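A minimal way to picture the input-blueprint-output conversion is sketched below; the question-generation step is a placeholder callable, since the paper relies on a dedicated question generation model.

```python
from typing import Callable, List, Tuple

QAPair = Tuple[str, str]

def to_blueprint_tuple(document: str, summary: str,
                       generate_qa: Callable[[str], List[QAPair]]):
    """Hedged sketch: turn an (input, output) pair into (input, blueprint, output).
    The blueprint -- an ordered list of (question, answer) pairs derived from the
    target text -- acts as a proxy for content selection (what to say) and
    planning (in what order)."""
    blueprint = generate_qa(summary)   # placeholder for the QA generation model
    return document, blueprint, summary
```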
{"title":"Conditional Generation with a Question-Answering Blueprint","authors":"Shashi Narayan, Joshua Maynez, Reinald Kim Amplayo, Kuzman Ganchev, Annie Louis, Fantine Huot, Dipanjan Das, Mirella Lapata","doi":"10.1162/tacl_a_00583","DOIUrl":"https://doi.org/10.1162/tacl_a_00583","url":null,"abstract":"Abstract The ability to convey relevant and faithful information is critical for many tasks in conditional generation and yet remains elusive for neural seq-to-seq models whose outputs often reveal hallucinations and fail to correctly cover important details. In this work, we advocate planning as a useful intermediate representation for rendering conditional generation less opaque and more grounded. We propose a new conceptualization of text plans as a sequence of question-answer (QA) pairs and enhance existing datasets (e.g., for summarization) with a QA blueprint operating as a proxy for content selection (i.e., what to say) and planning (i.e., in what order). We obtain blueprints automatically by exploiting state-of-the-art question generation technology and convert input-output pairs into input-blueprint-output tuples. We develop Transformer-based models, each varying in how they incorporate the blueprint in the generated output (e.g., as a global plan or iteratively). Evaluation across metrics and datasets demonstrates that blueprint models are more factual than alternatives which do not resort to planning and allow tighter control of the generation output.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"974-996"},"PeriodicalIF":10.9,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45704432","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 19
On the Robustness of Dialogue History Representation in Conversational Question Answering: A Comprehensive Study and a New Prompt-based Method
IF 10.9 | CAS Tier 1, Computer Science | Q2, Computer Science, Artificial Intelligence | Pub Date: 2022-06-29 | DOI: 10.1162/tacl_a_00549
Zorik Gekhman, Nadav Oved, Orgad Keller, Idan Szpektor, Roi Reichart
Most work on modeling the conversation history in Conversational Question Answering (CQA) reports a single main result on a common CQA benchmark. While existing models show impressive results on CQA leaderboards, it remains unclear whether they are robust to shifts in setting (sometimes to more realistic ones), training data size (e.g., from large to small sets) and domain. In this work, we design and conduct the first large-scale robustness study of history modeling approaches for CQA. We find that high benchmark scores do not necessarily translate to strong robustness, and that various methods can perform extremely differently under different settings. Equipped with the insights from our study, we design a novel prompt-based history modeling approach and demonstrate its strong robustness across various settings. Our approach is inspired by existing methods that highlight historic answers in the passage. However, instead of highlighting by modifying the passage token embeddings, we add textual prompts directly in the passage text. Our approach is simple, easy to plug into practically any model, and highly effective, thus we recommend it as a starting point for future model developers. We also hope that our study and insights will raise awareness to the importance of robustness-focused evaluation, in addition to obtaining high leaderboard scores, leading to better CQA systems.1
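The central trick, adding textual prompts to the passage instead of modifying token embeddings, can be sketched in a few lines. The marker wording below is my own placeholder, not the exact prompt text used in the paper.

```python
from typing import List

def add_history_prompts(passage: str, history_answers: List[str]) -> str:
    """Hedged sketch: wrap each previous-turn answer span with a textual marker so
    the reader model sees the dialogue history directly in the passage text."""
    for turn, answer in enumerate(history_answers, start=1):
        marker = f"<turn {turn} answer>"
        passage = passage.replace(answer, f"{marker} {answer} {marker}", 1)
    return passage

# Example: add_history_prompts("Paris is the capital of France.", ["Paris"])
# -> "<turn 1 answer> Paris <turn 1 answer> is the capital of France."
```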
{"title":"On the Robustness of Dialogue History Representation in Conversational Question Answering: A Comprehensive Study and a New Prompt-based Method","authors":"Zorik Gekhman, Nadav Oved, Orgad Keller, Idan Szpektor, Roi Reichart","doi":"10.1162/tacl_a_00549","DOIUrl":"https://doi.org/10.1162/tacl_a_00549","url":null,"abstract":"Most work on modeling the conversation history in Conversational Question Answering (CQA) reports a single main result on a common CQA benchmark. While existing models show impressive results on CQA leaderboards, it remains unclear whether they are robust to shifts in setting (sometimes to more realistic ones), training data size (e.g., from large to small sets) and domain. In this work, we design and conduct the first large-scale robustness study of history modeling approaches for CQA. We find that high benchmark scores do not necessarily translate to strong robustness, and that various methods can perform extremely differently under different settings. Equipped with the insights from our study, we design a novel prompt-based history modeling approach and demonstrate its strong robustness across various settings. Our approach is inspired by existing methods that highlight historic answers in the passage. However, instead of highlighting by modifying the passage token embeddings, we add textual prompts directly in the passage text. Our approach is simple, easy to plug into practically any model, and highly effective, thus we recommend it as a starting point for future model developers. We also hope that our study and insights will raise awareness to the importance of robustness-focused evaluation, in addition to obtaining high leaderboard scores, leading to better CQA systems.1","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"351-366"},"PeriodicalIF":10.9,"publicationDate":"2022-06-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45171232","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 3
Dependency Parsing with Backtracking using Deep Reinforcement Learning
IF 10.9 | CAS Tier 1, Computer Science | Q2, Computer Science, Artificial Intelligence | Pub Date: 2022-06-28 | DOI: 10.1162/tacl_a_00496
Franck Dary, M. Petit, Alexis Nasr
Abstract Greedy algorithms for NLP such as transition-based parsing are prone to error propagation. One way to overcome this problem is to allow the algorithm to backtrack and explore an alternative solution in cases where new evidence contradicts the solution explored so far. In order to implement such a behavior, we use reinforcement learning and let the algorithm backtrack in cases where such an action gets a better reward than continuing to explore the current solution. We test this idea on both POS tagging and dependency parsing and show that backtracking is an effective means to fight against error propagation.
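A simplified view of greedy transition-based decoding with an explicit backtrack action is given below. The state and policy interfaces are hypothetical, and in the paper the policy is trained with reinforcement learning rather than hand-written.

```python
def parse_with_backtracking(initial_state, policy, max_steps: int = 200):
    """Hedged sketch of decoding with a BACKTRACK action.
    `state` is assumed to expose is_final() and apply(action); `policy` is assumed
    to expose best_action(state, forbidden) returning a transition or "BACKTRACK"."""
    history = [(initial_state, set())]      # (state, actions already tried at this state)
    for _ in range(max_steps):
        state, tried = history[-1]
        if state.is_final():
            break
        action = policy.best_action(state, forbidden=tried)
        if action == "BACKTRACK" and len(history) > 1:
            history.pop()                    # undo the last transition and try another
            continue
        tried.add(action)
        history.append((state.apply(action), set()))
    return history[-1][0]
```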
{"title":"Dependency Parsing with Backtracking using Deep Reinforcement Learning","authors":"Franck Dary, M. Petit, Alexis Nasr","doi":"10.1162/tacl_a_00496","DOIUrl":"https://doi.org/10.1162/tacl_a_00496","url":null,"abstract":"Abstract Greedy algorithms for NLP such as transition-based parsing are prone to error propagation. One way to overcome this problem is to allow the algorithm to backtrack and explore an alternative solution in cases where new evidence contradicts the solution explored so far. In order to implement such a behavior, we use reinforcement learning and let the algorithm backtrack in cases where such an action gets a better reward than continuing to explore the current solution. We test this idea on both POS tagging and dependency parsing and show that backtracking is an effective means to fight against error propagation.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"10 1","pages":"888-903"},"PeriodicalIF":10.9,"publicationDate":"2022-06-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46573267","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 0
DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon
IF 10.9 | CAS Tier 1, Computer Science | Q2, Computer Science, Artificial Intelligence | Pub Date: 2022-06-22 | DOI: 10.1162/tacl_a_00505
Robin Algayres, Tristan Ricoul, Julien Karadayi, Hugo Laurenccon, Salah Zaiem, Abdel-rahman Mohamed, Benoît Sagot, E. Dupoux
Abstract Finding word boundaries in continuous speech is challenging as there is little or no equivalent of a ‘space’ delimiter between words. Popular Bayesian non-parametric models for text segmentation (Goldwater et al., 2006, 2009) use a Dirichlet process to jointly segment sentences and build a lexicon of word types. We introduce DP-Parse, which uses similar principles but only relies on an instance lexicon of word tokens, avoiding the clustering errors that arise with a lexicon of word types. On the Zero Resource Speech Benchmark 2017, our model sets a new speech segmentation state-of-the-art in 5 languages. The algorithm monotonically improves with better input representations, achieving yet higher scores when fed with weakly supervised inputs. Despite lacking a type lexicon, DP-Parse can be pipelined to a language model and learn semantic and syntactic representations as assessed by a new spoken word embedding benchmark. 1
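The contrast with a type lexicon can be illustrated with a toy scorer over stored token instances; this is my own simplified picture of an instance lexicon, not the DP-Parse algorithm itself.

```python
import numpy as np

def segment_score(candidate_embedding: np.ndarray, instance_lexicon: list, k: int = 5) -> float:
    """Hedged sketch (not DP-Parse): score a candidate word segment by cosine
    similarity to its k nearest stored token instances, with no clustering into types."""
    if not instance_lexicon:
        return 0.0
    lex = np.stack(instance_lexicon)                               # (n, d) stored instances
    cand = candidate_embedding / (np.linalg.norm(candidate_embedding) + 1e-8)
    lex = lex / (np.linalg.norm(lex, axis=1, keepdims=True) + 1e-8)
    sims = lex @ cand
    return float(np.sort(sims)[-k:].mean())
```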
{"title":"DP-Parse: Finding Word Boundaries from Raw Speech with an Instance Lexicon","authors":"Robin Algayres, Tristan Ricoul, Julien Karadayi, Hugo Laurenccon, Salah Zaiem, Abdel-rahman Mohamed, Benoît Sagot, E. Dupoux","doi":"10.1162/tacl_a_00505","DOIUrl":"https://doi.org/10.1162/tacl_a_00505","url":null,"abstract":"Abstract Finding word boundaries in continuous speech is challenging as there is little or no equivalent of a ‘space’ delimiter between words. Popular Bayesian non-parametric models for text segmentation (Goldwater et al., 2006, 2009) use a Dirichlet process to jointly segment sentences and build a lexicon of word types. We introduce DP-Parse, which uses similar principles but only relies on an instance lexicon of word tokens, avoiding the clustering errors that arise with a lexicon of word types. On the Zero Resource Speech Benchmark 2017, our model sets a new speech segmentation state-of-the-art in 5 languages. The algorithm monotonically improves with better input representations, achieving yet higher scores when fed with weakly supervised inputs. Despite lacking a type lexicon, DP-Parse can be pipelined to a language model and learn semantic and syntactic representations as assessed by a new spoken word embedding benchmark. 1","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"10 1","pages":"1051-1065"},"PeriodicalIF":10.9,"publicationDate":"2022-06-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49380439","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 6
Questions Are All You Need to Train a Dense Passage Retriever
IF 10.9 | CAS Tier 1, Computer Science | Q2, Computer Science, Artificial Intelligence | Pub Date: 2022-06-21 | DOI: 10.1162/tacl_a_00564
Devendra Singh Sachan, M. Lewis, Dani Yogatama, Luke Zettlemoyer, J. Pineau, M. Zaheer
We introduce ART, a new corpus-level autoencoding approach for training dense retrieval models that does not require any labeled training data. Dense retrieval is a central challenge for open-domain tasks, such as Open QA, where state-of-the-art methods typically require large supervised datasets with custom hard-negative mining and denoising of positive examples. ART, in contrast, only requires access to unpaired inputs and outputs (e.g., questions and potential answer passages). It uses a new passage-retrieval autoencoding scheme, where (1) an input question is used to retrieve a set of evidence passages, and (2) the passages are then used to compute the probability of reconstructing the original question. Training for retrieval based on question reconstruction enables effective unsupervised learning of both passage and question encoders, which can be later incorporated into complete Open QA systems without any further finetuning. Extensive experiments demonstrate that ART obtains state-of-the-art results on multiple QA retrieval benchmarks with only generic initialization from a pre-trained language model, removing the need for labeled data and task-specific losses.1 Our code and model checkpoints are available at: https://github.com/DevSinghSachan/art.
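The autoencoding objective can be pictured as training the retriever to match a soft distribution over retrieved passages given by how well each passage lets a fixed language model reconstruct the question. The interfaces below are hypothetical stand-ins, not the released ART code.

```python
import torch
import torch.nn.functional as F

def question_reconstruction_loss(question, passages, retriever, frozen_lm):
    """Hedged sketch of an ART-style update for one question.
    retriever.score -> (k,) relevance logits over k retrieved passages;
    frozen_lm.log_p_question -> (k,) log-likelihoods of reconstructing the question."""
    retrieval_logits = retriever.score(question, passages)             # trainable retriever
    with torch.no_grad():
        recon_logprobs = frozen_lm.log_p_question(question, passages)  # fixed scorer
    soft_targets = F.softmax(recon_logprobs, dim=-1)
    return F.kl_div(F.log_softmax(retrieval_logits, dim=-1),
                    soft_targets, reduction="sum")
```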
{"title":"Questions Are All You Need to Train a Dense Passage Retriever","authors":"Devendra Singh Sachan, M. Lewis, Dani Yogatama, Luke Zettlemoyer, J. Pineau, M. Zaheer","doi":"10.1162/tacl_a_00564","DOIUrl":"https://doi.org/10.1162/tacl_a_00564","url":null,"abstract":"We introduce ART, a new corpus-level autoencoding approach for training dense retrieval models that does not require any labeled training data. Dense retrieval is a central challenge for open-domain tasks, such as Open QA, where state-of-the-art methods typically require large supervised datasets with custom hard-negative mining and denoising of positive examples. ART, in contrast, only requires access to unpaired inputs and outputs (e.g., questions and potential answer passages). It uses a new passage-retrieval autoencoding scheme, where (1) an input question is used to retrieve a set of evidence passages, and (2) the passages are then used to compute the probability of reconstructing the original question. Training for retrieval based on question reconstruction enables effective unsupervised learning of both passage and question encoders, which can be later incorporated into complete Open QA systems without any further finetuning. Extensive experiments demonstrate that ART obtains state-of-the-art results on multiple QA retrieval benchmarks with only generic initialization from a pre-trained language model, removing the need for labeled data and task-specific losses.1 Our code and model checkpoints are available at: https://github.com/DevSinghSachan/art.","PeriodicalId":33559,"journal":{"name":"Transactions of the Association for Computational Linguistics","volume":"11 1","pages":"600-616"},"PeriodicalIF":10.9,"publicationDate":"2022-06-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"43642220","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
引用次数: 22