Computational Linguistics最新文献_第2页

Gradual Modifications and Abrupt Replacements: Two Stochastic Lexical Ingredients of Language Evolution 渐进式修饰与突发性替换:语言演化的两种随机词汇成分

IF 9.3 2区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Computational Linguistics

Pub Date : 2023-01-13 DOI: 10.1162/coli_a_00471

M. Pasquini, M. Serva, D. Vergni

The evolution of the vocabulary of a language is characterized by two different random processes: abrupt lexical replacements, when a complete new word emerges to represent a given concept (which was at the basis of the Swadesh foundation of glottochronology in the 1950s), and gradual lexical modifications that progressively alter words over the centuries, considered here in detail for the first time. The main discriminant between these two processes is their impact on cognacy within a family of languages or dialects, since the former modifies the subsets of cognate terms and the latter does not. The automated cognate detection, which is here performed following a new approach inspired by graph theory, is a key preliminary step that allows us to later measure the effects of the slow modification process. We test our dual approach on the family of Malagasy dialects using a cladistic analysis, which provides strong evidence that lexical replacements and gradual lexical modifications are two random processes that separately drive the evolution of languages.

一门语言词汇的演变以两种不同的随机过程为特征:突然的词汇替换，当一个完整的新词出现来代表一个给定的概念时(这是20世纪50年代斯瓦德什语言年代学的基础)，以及逐渐的词汇修改，在几个世纪里逐渐改变单词，这里第一次详细讨论。这两个过程之间的主要区别在于它们对语言或方言家族中的同源性的影响，因为前者会修改同源术语的子集，而后者则不会。自动同源检测，在这里是按照一种受图论启发的新方法进行的，是一个关键的初步步骤，它允许我们以后测量缓慢修改过程的影响。我们使用词根分析对马达加斯加方言家族的双重方法进行了测试，该分析提供了强有力的证据，表明词汇替换和逐渐的词汇修改是两个随机过程，分别推动了语言的进化。

引用次数: 1

Conversational AI: Dialogue Systems, Conversational Agents, and Chatbots by Michael McTear 对话人工智能：对话系统、对话代理和聊天机器人

IF 9.3 2区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Computational Linguistics

Pub Date : 2023-01-11 DOI: 10.1162/coli_r_00470

Olga Seminck

This book has appeared in the series Synthesis Lectures on Human Language Technologies: monographs from 50 up to 150 pages about specific topics subjects in computational linguistics. The intended audience of the book are researchers and graduate students in NLP, AI, and related fields. I define myself as a computational linguist; my review is from a perspective of a “random” computational linguistics researcher wanting to learn more about this topic or looking for a good guide to teach a course on dialogue systems. I found the book very easy to read and interesting and therefore I believe that McTear fully achieved his purpose to write “a readable introduction to the various concepts, issues and technologies of Conversational AI.” He succeeds remarkably well in staying on the right level of technical details, never losing the purpose of giving an overview, and the reader does not get lost in numerous details about specific algorithms. Additionally, for people who are experts in Conversational AI, the book could still be very useful because its bibliography is exceptionally complete: a very large number of early works and recent studies are cited and commented through the whole book. The book is well structured into six chapters. After an introduction, there are two chapters about specific types of dialogue systems: rule-based systems (Chapter 2) and statistical systems (Chapter 3). This is followed by a chapter about evaluation methods (Chapter 4), after which the more recent neural end-to-end systems are reviewed (Chapter 5). The book ends with a chapter on various challenges and future directions for the research on Conversational AI (Chapter 6). I found that it was meaningful to distinguish the three types of dialogue systems: rule-based systems, statistical but modular systems, and end-to-end neural systems. It might, at first, seem strange that the topic on system evaluation methods is placed between the chapter about modular statistical dialogue systems and neural end-to-end systems, but as a reader, I believe that the discussion about system evaluation comes around at the right place in the book, because it helps to better understand the difference between modular and sequence to sequence systems. In this review, I will discuss the chapters one by one in the same order as they appear in the book.

这本书出现在《人类语言技术综合讲座》系列中：50到150页的关于计算语言学特定主题的专著。这本书的目标读者是NLP、人工智能和相关领域的研究人员和研究生。我把自己定义为一个计算语言学家；我的评论是从一个“随机”计算语言学研究人员的角度出发的，他想了解更多关于这个主题的信息，或者寻找一个很好的指导来教授对话系统课程。我觉得这本书读起来很容易，也很有趣，因此我相信McTear完全达到了他的目的，写了一本“对话人工智能的各种概念、问题和技术的可读介绍”。他非常成功地保持了正确的技术细节水平，从未失去概述的目的，并且读者不会迷失在关于特定算法的许多细节中。此外，对于会话人工智能专家来说，这本书仍然非常有用，因为它的参考书目非常完整：整本书都引用和评论了大量早期作品和最近的研究。这本书分为六章。在介绍之后，关于具体类型的对话系统分为两章：基于规则的系统（第2章）和统计系统（第3章）。接下来是关于评估方法的一章（第4章），之后回顾了最近的神经端到端系统（第5章）。本书最后一章介绍了会话人工智能研究的各种挑战和未来方向（第6章）。我发现区分三种类型的对话系统是有意义的：基于规则的系统、统计但模块化的系统和端到端的神经系统。起初，关于系统评估方法的话题被放在关于模块化统计对话系统和神经端到端系统的章节之间，这可能看起来很奇怪，但作为一名读者，我相信关于系统评估的讨论在书中的正确位置，因为它有助于更好地理解模块化系统和序列到序列系统之间的区别。在这篇综述中，我将按照书中出现的顺序逐一讨论这些章节。

{"title":"Conversational AI: Dialogue Systems, Conversational Agents, and Chatbots by Michael McTear","authors":"Olga Seminck","doi":"10.1162/coli_r_00470","DOIUrl":"https://doi.org/10.1162/coli_r_00470","url":null,"abstract":"This book has appeared in the series Synthesis Lectures on Human Language Technologies: monographs from 50 up to 150 pages about specific topics subjects in computational linguistics. The intended audience of the book are researchers and graduate students in NLP, AI, and related fields. I define myself as a computational linguist; my review is from a perspective of a “random” computational linguistics researcher wanting to learn more about this topic or looking for a good guide to teach a course on dialogue systems. I found the book very easy to read and interesting and therefore I believe that McTear fully achieved his purpose to write “a readable introduction to the various concepts, issues and technologies of Conversational AI.” He succeeds remarkably well in staying on the right level of technical details, never losing the purpose of giving an overview, and the reader does not get lost in numerous details about specific algorithms. Additionally, for people who are experts in Conversational AI, the book could still be very useful because its bibliography is exceptionally complete: a very large number of early works and recent studies are cited and commented through the whole book. The book is well structured into six chapters. After an introduction, there are two chapters about specific types of dialogue systems: rule-based systems (Chapter 2) and statistical systems (Chapter 3). This is followed by a chapter about evaluation methods (Chapter 4), after which the more recent neural end-to-end systems are reviewed (Chapter 5). The book ends with a chapter on various challenges and future directions for the research on Conversational AI (Chapter 6). I found that it was meaningful to distinguish the three types of dialogue systems: rule-based systems, statistical but modular systems, and end-to-end neural systems. It might, at first, seem strange that the topic on system evaluation methods is placed between the chapter about modular statistical dialogue systems and neural end-to-end systems, but as a reader, I believe that the discussion about system evaluation comes around at the right place in the book, because it helps to better understand the difference between modular and sequence to sequence systems. In this review, I will discuss the chapters one by one in the same order as they appear in the book.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"49 1","pages":"257-259"},"PeriodicalIF":9.3,"publicationDate":"2023-01-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44657355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 1

From Word Types to Tokens and Back: A Survey of Approaches to Word Meaning Representation and Interpretation 从字型到符号再到词型:词义表示与解释方法综述

IF 9.3 2区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Computational Linguistics

Pub Date : 2022-12-16 DOI: 10.1162/coli_a_00474

Marianna Apidianaki

Vector-based word representation paradigms situate lexical meaning at different levels of abstraction. Distributional and static embedding models generate a single vector per word type, which is an aggregate across the instances of the word in a corpus. Contextual language models, on the contrary, directly capture the meaning of individual word instances. The goal of this survey is to provide an overview of word meaning representation methods, and of the strategies that have been proposed for improving the quality of the generated vectors. These often involve injecting external knowledge about lexical semantic relationships, or refining the vectors to describe different senses. The survey also covers recent approaches for obtaining word type-level representations from token-level ones, and for combining static and contextualized representations. Special focus is given to probing and interpretation studies aimed at discovering the lexical semantic knowledge that is encoded in contextualized representations. The challenges posed by this exploration have motivated the interest towards static embedding derivation from contextualized embeddings, and for methods aimed at improving the similarity estimates that can be drawn from the space of contextual language models.

基于向量的单词表示范式将词汇意义置于不同的抽象层次。分布式和静态嵌入模型为每个单词类型生成一个向量，该向量是语料库中单词实例的集合。相反，语境语言模型直接捕捉单个单词实例的含义。这项调查的目的是概述词义表示方法，以及为提高生成向量的质量而提出的策略。这些通常涉及注入关于词汇语义关系的外部知识，或者精炼向量来描述不同的感觉。该调查还涵盖了从表征级表征中获得词类型级表征以及将静态表征和上下文化表征相结合的最新方法。特别关注的是探究和解释研究，旨在发现语境化表征中编码的词汇语义知识。这一探索带来的挑战激发了人们对静态嵌入的兴趣，即从上下文嵌入中派生静态嵌入，以及对旨在改进上下文语言模型空间中的相似性估计的方法的兴趣。

{"title":"From Word Types to Tokens and Back: A Survey of Approaches to Word Meaning Representation and Interpretation","authors":"Marianna Apidianaki","doi":"10.1162/coli_a_00474","DOIUrl":"https://doi.org/10.1162/coli_a_00474","url":null,"abstract":"Vector-based word representation paradigms situate lexical meaning at different levels of abstraction. Distributional and static embedding models generate a single vector per word type, which is an aggregate across the instances of the word in a corpus. Contextual language models, on the contrary, directly capture the meaning of individual word instances. The goal of this survey is to provide an overview of word meaning representation methods, and of the strategies that have been proposed for improving the quality of the generated vectors. These often involve injecting external knowledge about lexical semantic relationships, or refining the vectors to describe different senses. The survey also covers recent approaches for obtaining word type-level representations from token-level ones, and for combining static and contextualized representations. Special focus is given to probing and interpretation studies aimed at discovering the lexical semantic knowledge that is encoded in contextualized representations. The challenges posed by this exploration have motivated the interest towards static embedding derivation from contextualized embeddings, and for methods aimed at improving the similarity estimates that can be drawn from the space of contextual language models.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"49 1","pages":"465-523"},"PeriodicalIF":9.3,"publicationDate":"2022-12-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49266755","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 7

A Metrological Perspective on Reproducibility in NLP* NLP再现性的计量学视角*

IF 9.3 2区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Computational Linguistics

Pub Date : 2022-12-01 DOI: 10.1162/coli_a_00448

Anya Belz

Abstract Reproducibility has become an increasingly debated topic in NLP and ML over recent years, but so far, no commonly accepted definitions of even basic terms or concepts have emerged. The range of different definitions proposed within NLP/ML not only do not agree with each other, they are also not aligned with standard scientific definitions. This article examines the standard definitions of repeatability and reproducibility provided by the meta-science of metrology, and explores what they imply in terms of how to assess reproducibility, and what adopting them would mean for reproducibility assessment in NLP/ML. It turns out the standard definitions lead directly to a method for assessing reproducibility in quantified terms that renders results from reproduction studies comparable across multiple reproductions of the same original study, as well as reproductions of different original studies. The article considers where this method sits in relation to other aspects of NLP work one might wish to assess in the context of reproducibility.

近年来，可重复性已经成为NLP和ML中越来越有争议的话题，但到目前为止，即使是基本的术语或概念也没有普遍接受的定义。NLP/ML中提出的不同定义的范围不仅彼此不一致，而且与标准的科学定义也不一致。本文考察了计量学元科学提供的可重复性和可重复性的标准定义，并探讨了它们在如何评估可重复性方面的含义，以及采用它们对NLP/ML中的可重复性评估意味着什么。事实证明，标准定义直接导致了一种量化评估再现性的方法，这种方法使得再现研究的结果在同一原始研究的多次再现中具有可比性，以及不同原始研究的再现。本文考虑了这种方法与NLP工作的其他方面的关系，人们可能希望在可重复性的背景下进行评估。

引用次数: 5

Erratum: Annotation Curricula to Implicitly Train Non-Expert Annotators 勘误表：隐性培养非专家注释者的注释课程

IF 9.3 2区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Computational Linguistics

Pub Date : 2022-12-01 DOI: 10.1162/coli_x_00469

Ji-Ung Lee, Jan-Christoph Klie, Iryna Gurevych

Abstract The authors of this work (“Annotation Curricula to Implicitly Train Non-Expert Annotators” by Ji-Ung Lee, Jan-Christoph Klie, and Iryna Gurevych in Computational Linguistics 48:2 https://doi.org/10.1162/coli_a_00436) discovered an incorrect inequality symbol in section 5.3 (page 360). The paper stated that the differences in the annotation times for the control instances result in a p-value of 0.200 which is smaller than 0.05 (p = 0.200 < 0.05). As 0.200 is of course larger than 0.05, the correct inequality symbol is p = 0.200 > 0.05, which is in line with the conclusion that follows in the text. The paper has been updated accordingly.

摘要这项工作的作者（Ji Ung Lee、Jan Christoph Klie和Iryna Gurevych在计算语言学中的“隐含训练非专家注释者的注释课程”48:2https://doi.org/10.1162/coli_a_00436)在第5.3节（第360页）中发现了一个不正确的不等式符号。本文指出，控制实例注释时间的差异导致p值为0.200，小于0.05（p=0.200<0.05）。由于0.200当然大于0.05，因此正确的不等式符号为p=0.200>0.05，这与文中的结论一致。该文件已作了相应的更新。

引用次数: 0

Information Theory–based Compositional Distributional Semantics 基于信息论的组合分布语义

IF 9.3 2区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Computational Linguistics

Pub Date : 2022-12-01 DOI: 10.1162/coli_a_00454

Enrique Amigó, Alejandro Ariza-Casabona, V. Fresno, M. A. Martí

Abstract In the context of text representation, Compositional Distributional Semantics models aim to fuse the Distributional Hypothesis and the Principle of Compositionality. Text embedding is based on co-ocurrence distributions and the representations are in turn combined by compositional functions taking into account the text structure. However, the theoretical basis of compositional functions is still an open issue. In this article we define and study the notion of Information Theory–based Compositional Distributional Semantics (ICDS): (i) We first establish formal properties for embedding, composition, and similarity functions based on Shannon’s Information Theory; (ii) we analyze the existing approaches under this prism, checking whether or not they comply with the established desirable properties; (iii) we propose two parameterizable composition and similarity functions that generalize traditional approaches while fulfilling the formal properties; and finally (iv) we perform an empirical study on several textual similarity datasets that include sentences with a high and low lexical overlap, and on the similarity between words and their description. Our theoretical analysis and empirical results show that fulfilling formal properties affects positively the accuracy of text representation models in terms of correspondence (isometry) between the embedding and meaning spaces.

摘要在文本表征的背景下，组合分布语义模型旨在融合分布假设和组合原则。文本嵌入是基于共存在分布的，而表示又通过考虑文本结构的组合函数进行组合。然而，组成函数的理论基础仍然是一个悬而未决的问题。在本文中，我们定义并研究了基于信息论的组合分布语义（ICDS）的概念：（i）我们首先基于香农信息论建立了嵌入、组合和相似函数的形式性质；（ii）我们在这个棱镜下分析现有的方法，检查它们是否符合既定的理想性质；（iii）我们提出了两个可参数化的组合和相似函数，它们在满足形式性质的同时推广了传统方法；最后（iv）我们对几个文本相似性数据集进行了实证研究，这些数据集包括具有高和低词汇重叠的句子，以及单词之间的相似性及其描述。我们的理论分析和实证结果表明，在嵌入空间和意义空间之间的对应（等距）方面，实现形式属性对文本表示模型的准确性产生了积极影响。

{"title":"Information Theory–based Compositional Distributional Semantics","authors":"Enrique Amigó, Alejandro Ariza-Casabona, V. Fresno, M. A. Martí","doi":"10.1162/coli_a_00454","DOIUrl":"https://doi.org/10.1162/coli_a_00454","url":null,"abstract":"Abstract In the context of text representation, Compositional Distributional Semantics models aim to fuse the Distributional Hypothesis and the Principle of Compositionality. Text embedding is based on co-ocurrence distributions and the representations are in turn combined by compositional functions taking into account the text structure. However, the theoretical basis of compositional functions is still an open issue. In this article we define and study the notion of Information Theory–based Compositional Distributional Semantics (ICDS): (i) We first establish formal properties for embedding, composition, and similarity functions based on Shannon’s Information Theory; (ii) we analyze the existing approaches under this prism, checking whether or not they comply with the established desirable properties; (iii) we propose two parameterizable composition and similarity functions that generalize traditional approaches while fulfilling the formal properties; and finally (iv) we perform an empirical study on several textual similarity datasets that include sentences with a high and low lexical overlap, and on the similarity between words and their description. Our theoretical analysis and empirical results show that fulfilling formal properties affects positively the accuracy of text representation models in terms of correspondence (isometry) between the embedding and meaning spaces.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":" ","pages":"907-948"},"PeriodicalIF":9.3,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49351690","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Nucleus Composition in Transition-based Dependency Parsing 基于转换的依赖解析中的核组合

IF 9.3 2区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Computational Linguistics

Pub Date : 2022-12-01 DOI: 10.1162/coli_a_00450

Joakim Nivre, A. Basirat, Luise Durlich, A. Moss

Abstract Dependency-based approaches to syntactic analysis assume that syntactic structure can be analyzed in terms of binary asymmetric dependency relations holding between elementary syntactic units. Computational models for dependency parsing almost universally assume that an elementary syntactic unit is a word, while the influential theory of Lucien Tesnière instead posits a more abstract notion of nucleus, which may be realized as one or more words. In this article, we investigate the effect of enriching computational parsing models with a concept of nucleus inspired by Tesnière. We begin by reviewing how the concept of nucleus can be defined in the framework of Universal Dependencies, which has become the de facto standard for training and evaluating supervised dependency parsers, and explaining how composition functions can be used to make neural transition-based dependency parsers aware of the nuclei thus defined. We then perform an extensive experimental study, using data from 20 languages to assess the impact of nucleus composition across languages with different typological characteristics, and utilizing a variety of analytical tools including ablation, linear mixed-effects models, diagnostic classifiers, and dimensionality reduction. The analysis reveals that nucleus composition gives small but consistent improvements in parsing accuracy for most languages, and that the improvement mainly concerns the analysis of main predicates, nominal dependents, clausal dependents, and coordination structures. Significant factors explaining the rate of improvement across languages include entropy in coordination structures and frequency of certain function words, in particular determiners. Analysis using dimensionality reduction and diagnostic classifiers suggests that nucleus composition increases the similarity of vectors representing nuclei of the same syntactic type.

基于依赖的句法分析方法假设句法结构可以根据基本句法单元之间的二元非对称依赖关系进行分析。依赖关系分析的计算模型几乎普遍假定一个基本句法单位是一个词，而吕西安·特斯尼 (Lucien tesni)颇具影响力的理论则假设了一个更抽象的核概念，它可以被实现为一个或多个词。在本文中，我们研究了由tesni启发的核概念丰富计算解析模型的效果。我们首先回顾如何在通用依赖的框架中定义核的概念，它已经成为训练和评估监督依赖解析器的事实标准，并解释如何使用组合函数使基于神经转换的依赖解析器意识到因此定义的核。然后，我们进行了一项广泛的实验研究，使用来自20种语言的数据来评估具有不同类型特征的语言的核组成的影响，并利用各种分析工具，包括消融、线性混合效应模型、诊断分类器和降维。分析表明，对于大多数语言来说，核组成在解析精度方面提供了微小但一致的改进，并且这种改进主要涉及对主要谓词、名义依赖项、小句依赖项和协调结构的分析。解释不同语言的进步速度的重要因素包括配位结构的熵和某些虚词的频率，特别是限定词。使用降维和诊断分类器的分析表明，核组成增加了表示相同句法类型核的向量的相似性。

{"title":"Nucleus Composition in Transition-based Dependency Parsing","authors":"Joakim Nivre, A. Basirat, Luise Durlich, A. Moss","doi":"10.1162/coli_a_00450","DOIUrl":"https://doi.org/10.1162/coli_a_00450","url":null,"abstract":"Abstract Dependency-based approaches to syntactic analysis assume that syntactic structure can be analyzed in terms of binary asymmetric dependency relations holding between elementary syntactic units. Computational models for dependency parsing almost universally assume that an elementary syntactic unit is a word, while the influential theory of Lucien Tesnière instead posits a more abstract notion of nucleus, which may be realized as one or more words. In this article, we investigate the effect of enriching computational parsing models with a concept of nucleus inspired by Tesnière. We begin by reviewing how the concept of nucleus can be defined in the framework of Universal Dependencies, which has become the de facto standard for training and evaluating supervised dependency parsers, and explaining how composition functions can be used to make neural transition-based dependency parsers aware of the nuclei thus defined. We then perform an extensive experimental study, using data from 20 languages to assess the impact of nucleus composition across languages with different typological characteristics, and utilizing a variety of analytical tools including ablation, linear mixed-effects models, diagnostic classifiers, and dimensionality reduction. The analysis reveals that nucleus composition gives small but consistent improvements in parsing accuracy for most languages, and that the improvement mainly concerns the analysis of main predicates, nominal dependents, clausal dependents, and coordination structures. Significant factors explaining the rate of improvement across languages include entropy in coordination structures and frequency of certain function words, in particular determiners. Analysis using dimensionality reduction and diagnostic classifiers suggests that nucleus composition increases the similarity of vectors representing nuclei of the same syntactic type.","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"48 1","pages":"849-886"},"PeriodicalIF":9.3,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47076502","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 2

Pretrained Transformers for Text Ranking: BERT and Beyond 预训练的文本排名转换器：BERT及其后

IF 9.3 2区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Computational Linguistics

Pub Date : 2022-11-07 DOI: 10.1162/coli_r_00468

S. Verberne

Text ranking takes a central place in Information Retrieval (IR), with Web search as its best-known application. More generally, text ranking models are applicable to any Natural Language Processing (NLP) task in which relevance of information plays a role, from filtering and recommendation applications to question answering and semantic similarity comparisons. Since the rise of BERT in 2019, Transformer models have become the most used and studied architectures in both NLP and IR, and they have been applied to basically any task in our research fields—including text ranking. In a fast-changing research context, it can be challenging to keep lecture materials up to date. Lecturers in NLP are grateful for Dan Jurafsky and James Martin for yearly updating the 3rd edition of their textbook, making Speech and Language Processing the most comprehensive, modern textbook for NLP. The IR field is less fortunate, still relying on older textbooks, extended with a collection of recent materials that address neural models. The textbook Pretrained Transformers for Text Ranking: BERT and Beyond by Jimmy Lin, Rodrigo Nogueira, and Andrew Yates is a great effort to collect the recent developments in the use of Transformers for text ranking. The introduction of the book is well-scoped with clear guidance for the reader about topics that are out of scope (such as user aspects). This is followed by an excellent history section, stating for example:

文本排序在信息检索(Information Retrieval, IR)中占有中心地位，其中Web搜索是其最著名的应用。更一般地说，文本排序模型适用于任何自然语言处理(NLP)任务，其中信息的相关性起着作用，从过滤和推荐应用到问答和语义相似性比较。自2019年BERT兴起以来，Transformer模型已成为NLP和IR中使用最多和研究最多的架构，它们已被应用于我们研究领域的基本任何任务，包括文本排名。在瞬息万变的研究环境中，使讲座材料保持最新可能是一项挑战。NLP的讲师感谢Dan Jurafsky和James Martin每年更新他们的第三版教科书，使语音和语言处理成为最全面，最现代的NLP教科书。红外领域就没那么幸运了，仍然依赖于旧的教科书，并扩展了一些关于神经模型的最新材料。由Jimmy Lin、Rodrigo Nogueira和Andrew Yates编写的教科书《用于文本排序的预训练变形金刚:BERT and Beyond》是收集使用变形金刚进行文本排序的最新发展的一项巨大努力。这本书的引言有很好的范围，对超出范围的主题(例如用户方面)为读者提供了明确的指导。接下来是一个优秀的历史部分，例如:

{"title":"Pretrained Transformers for Text Ranking: BERT and Beyond","authors":"S. Verberne","doi":"10.1162/coli_r_00468","DOIUrl":"https://doi.org/10.1162/coli_r_00468","url":null,"abstract":"Text ranking takes a central place in Information Retrieval (IR), with Web search as its best-known application. More generally, text ranking models are applicable to any Natural Language Processing (NLP) task in which relevance of information plays a role, from filtering and recommendation applications to question answering and semantic similarity comparisons. Since the rise of BERT in 2019, Transformer models have become the most used and studied architectures in both NLP and IR, and they have been applied to basically any task in our research fields—including text ranking. In a fast-changing research context, it can be challenging to keep lecture materials up to date. Lecturers in NLP are grateful for Dan Jurafsky and James Martin for yearly updating the 3rd edition of their textbook, making Speech and Language Processing the most comprehensive, modern textbook for NLP. The IR field is less fortunate, still relying on older textbooks, extended with a collection of recent materials that address neural models. The textbook Pretrained Transformers for Text Ranking: BERT and Beyond by Jimmy Lin, Rodrigo Nogueira, and Andrew Yates is a great effort to collect the recent developments in the use of Transformers for text ranking. The introduction of the book is well-scoped with clear guidance for the reader about topics that are out of scope (such as user aspects). This is followed by an excellent history section, stating for example:","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"49 1","pages":"253-255"},"PeriodicalIF":9.3,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"46576476","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 3

Validity, Reliability, and Significance: Empirical Methods for NLP and Data Science 效度、信度与显著性:NLP与数据科学的实证方法

IF 9.3 2区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Computational Linguistics

Pub Date : 2022-11-07 DOI: 10.1162/coli_r_00467

Richard Futrell

When we come up with a new model in NLP and machine learning more generally, we usually look at some performance metric (one number), compare it against the same performance metric for a strong baseline model (one number), and if the new model gets a better number, we mark it in bold and declare it the winner. For anyone with a background in statistics or a field where conclusions must be drawn on the basis of noisy data, this procedure is frankly shocking. Suppose model A gets a BLEU score one point higher than model B: Is that difference reliable? If you used a slightly different dataset for training and evaluation, would that one point difference still hold? Would the difference even survive running the same models on the same datasets but with different random seeds? In fields such as psychology and biology, it is standard to answer such questions using standardized statistical procedures to make sure that differences of interest are larger than some quantification of measurement noise. Making a claim based on a bare difference of two numbers is unthinkable. Yet statistical procedures remain rare in the evaluation of NLP models, whose performance metrics are arguably just as noisy. To these objections, NLP practitioners can respond that they have faithfully followed the hallowed train-(dev-)test split paradigm. As long as proper test set discipline has been followed, the theory goes, the evaluation is secure: By testing on held-out data, we can be sure that our models are performing well in a way that is independent of random accidents of the training data, and by testing on that data only once, we guard against making claims based on differences that would not replicate if we ran the models again. But does the train-test split paradigm really guard against all problems of validity and reliability? Into this situation comes the book under review, Validity, Reliability, and Significance: Empirical Methods for NLP and Data Science, by Stefan Riezler and Michael Hagmann. The authors argue that the train-test split paradigm does not in fact insulate NLP from problems relating to the validity and reliability of its models, their features, and their performance metrics. They present numerous case studies to prove their point, and advocate and teach standard statistical methods as the solution, with rich examples

当我们在NLP和机器学习中提出一个新模型时，我们通常会查看一些性能指标（一个数字），将其与强基线模型的相同性能指标进行比较（一个数据），如果新模型得到了更好的数字，我们会用粗体标记它，并宣布它是赢家。对于任何有统计学背景或必须根据嘈杂数据得出结论的人来说，这个过程坦率地说是令人震惊的。假设模型A的BLEU得分比模型B高一分：这种差异可靠吗？如果你使用一个略有不同的数据集进行训练和评估，那么这一点的差异还会成立吗？在相同的数据集上运行相同的模型，但使用不同的随机种子，这种差异还会存在吗？在心理学和生物学等领域，标准的做法是使用标准化的统计程序来回答这些问题，以确保感兴趣的差异大于测量噪声的某些量化。基于两个数字的明显差异提出索赔是不可想象的。然而，统计程序在NLP模型的评估中仍然很少见，其性能指标可以说也同样嘈杂。对于这些反对意见，NLP从业者可以回应说，他们忠实地遵循了神圣的训练（开发）测试分裂范式。理论认为，只要遵循了适当的测试集规则，评估就是安全的：通过对保留的数据进行测试，我们可以确保我们的模型以一种独立于训练数据的随机事故的方式表现良好，并且通过只对该数据进行一次测试，我们防止基于差异提出索赔，如果我们再次运行模型，这些差异将不会复制。但是，列车测试分流范式真的能防止所有的有效性和可靠性问题吗？Stefan Riezler和Michael Hagmann的《有效性、可靠性和意义：NLP和数据科学的经验方法》一书正是在这种情况下出版的。作者认为，训练-测试分离范式实际上并没有将NLP与与其模型的有效性和可靠性、其特征和性能指标相关的问题隔离开来。他们提出了大量的案例研究来证明他们的观点，并提倡和教授标准的统计方法作为解决方案，并提供了丰富的例子

{"title":"Validity, Reliability, and Significance: Empirical Methods for NLP and Data Science","authors":"Richard Futrell","doi":"10.1162/coli_r_00467","DOIUrl":"https://doi.org/10.1162/coli_r_00467","url":null,"abstract":"When we come up with a new model in NLP and machine learning more generally, we usually look at some performance metric (one number), compare it against the same performance metric for a strong baseline model (one number), and if the new model gets a better number, we mark it in bold and declare it the winner. For anyone with a background in statistics or a field where conclusions must be drawn on the basis of noisy data, this procedure is frankly shocking. Suppose model A gets a BLEU score one point higher than model B: Is that difference reliable? If you used a slightly different dataset for training and evaluation, would that one point difference still hold? Would the difference even survive running the same models on the same datasets but with different random seeds? In fields such as psychology and biology, it is standard to answer such questions using standardized statistical procedures to make sure that differences of interest are larger than some quantification of measurement noise. Making a claim based on a bare difference of two numbers is unthinkable. Yet statistical procedures remain rare in the evaluation of NLP models, whose performance metrics are arguably just as noisy. To these objections, NLP practitioners can respond that they have faithfully followed the hallowed train-(dev-)test split paradigm. As long as proper test set discipline has been followed, the theory goes, the evaluation is secure: By testing on held-out data, we can be sure that our models are performing well in a way that is independent of random accidents of the training data, and by testing on that data only once, we guard against making claims based on differences that would not replicate if we ran the models again. But does the train-test split paradigm really guard against all problems of validity and reliability? Into this situation comes the book under review, Validity, Reliability, and Significance: Empirical Methods for NLP and Data Science, by Stefan Riezler and Michael Hagmann. The authors argue that the train-test split paradigm does not in fact insulate NLP from problems relating to the validity and reliability of its models, their features, and their performance metrics. They present numerous case studies to prove their point, and advocate and teach standard statistical methods as the solution, with rich examples","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"49 1","pages":"249-251"},"PeriodicalIF":9.3,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47568619","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0

Finite-State Text Processing 有限状态文本处理

IF 9.3 2区计算机科学 Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

Computational Linguistics

Pub Date : 2022-11-07 DOI: 10.1162/coli_r_00466

Aniello De Santo

The rise in popularity of neural network methods in computational linguistics has led to a richness of valuable books on the topic. On the other hand, there is arguably a shortage of recent materials on foundational computational linguistics methods like finite-state technologies. This is unfortunate, as finite-state approaches not only still find much use in applications for speech and text processing (the core focus of this book), but seem also to be valuable in interpreting and improving neural methods. Moreover, the study of finite-state machines (and their corresponding formal languages) is still proving insightful in theoretical linguistics analyses. In this sense, this book by Gorman and Sproat is a welcome, refreshing contribution aimed at a variety of readers. The book is organized in eight main chapters, and can be conceptually divided into two parts. The first half of the book serves as an introduction to core concepts in formal language and automata theory (Chapter 1), the basic design principles of the Python library used through the book (Chapter 2), and a variety of finite-state algorithms (Chapters 3 and 4). The rest of the book exemplifies the formal notions of the previous chapters with practical applications of finite-state technologies to linguistic problems like morphophonological analysis and text normalization (Chapter 5, 6, 7). The last chapter (Chapter 8) presents an interesting discussion of future steps from a variety of perspectives, connecting finite-state methods to current trends in the field. In what follows, I discuss the contents of each chapter in more detail. Chapter 1 provides an accessible introduction to formal languages and automata theory, starting with a concise but helpful historical review of the development of finite-state technologies and their ties to linguistics. The chapter balances intuitive explanations of technical concepts without sacrificing formal rigor, in particular in the presentation of finite-state automata/transducers and their formal definitions. Weighted finite-state automata play a prominent role in the book and the authors introduce them algebraically through the notion of semiring (Pin 1997). This is a welcome choice that makes the book somewhat unique among introductory/application-oriented materials on these topics. Unsurprisingly, this section is the densest part of the chapter and it could have benefitted from an explanation of the intuition behind the connection between wellformedness of a string, recognition by an automaton, and paths over semirings— especially considering how relevant such concepts become when the book transitions

{"title":"Finite-State Text Processing","authors":"Aniello De Santo","doi":"10.1162/coli_r_00466","DOIUrl":"https://doi.org/10.1162/coli_r_00466","url":null,"abstract":"The rise in popularity of neural network methods in computational linguistics has led to a richness of valuable books on the topic. On the other hand, there is arguably a shortage of recent materials on foundational computational linguistics methods like finite-state technologies. This is unfortunate, as finite-state approaches not only still find much use in applications for speech and text processing (the core focus of this book), but seem also to be valuable in interpreting and improving neural methods. Moreover, the study of finite-state machines (and their corresponding formal languages) is still proving insightful in theoretical linguistics analyses. In this sense, this book by Gorman and Sproat is a welcome, refreshing contribution aimed at a variety of readers. The book is organized in eight main chapters, and can be conceptually divided into two parts. The first half of the book serves as an introduction to core concepts in formal language and automata theory (Chapter 1), the basic design principles of the Python library used through the book (Chapter 2), and a variety of finite-state algorithms (Chapters 3 and 4). The rest of the book exemplifies the formal notions of the previous chapters with practical applications of finite-state technologies to linguistic problems like morphophonological analysis and text normalization (Chapter 5, 6, 7). The last chapter (Chapter 8) presents an interesting discussion of future steps from a variety of perspectives, connecting finite-state methods to current trends in the field. In what follows, I discuss the contents of each chapter in more detail. Chapter 1 provides an accessible introduction to formal languages and automata theory, starting with a concise but helpful historical review of the development of finite-state technologies and their ties to linguistics. The chapter balances intuitive explanations of technical concepts without sacrificing formal rigor, in particular in the presentation of finite-state automata/transducers and their formal definitions. Weighted finite-state automata play a prominent role in the book and the authors introduce them algebraically through the notion of semiring (Pin 1997). This is a welcome choice that makes the book somewhat unique among introductory/application-oriented materials on these topics. Unsurprisingly, this section is the densest part of the chapter and it could have benefitted from an explanation of the intuition behind the connection between wellformedness of a string, recognition by an automaton, and paths over semirings— especially considering how relevant such concepts become when the book transitions","PeriodicalId":55229,"journal":{"name":"Computational Linguistics","volume":"49 1","pages":"245-247"},"PeriodicalIF":9.3,"publicationDate":"2022-11-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"64496956","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}

引用次数: 0