
Transactions of the Association for Computational Linguistics: Latest Publications

Introduction to Mathematical Language Processing: Informal Proofs, Word Problems, and Supporting Tasks
CAS Q1 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-01-01 DOI: 10.1162/tacl_a_00594
Jordan Meadows, André Freitas
Abstract: Automating discovery in mathematics and science will require sophisticated methods of information extraction and abstract reasoning, including models that can convincingly process relationships between mathematical elements and natural language, to produce problem solutions of real-world value. We analyze mathematical language processing methods across five strategic sub-areas (identifier-definition extraction, formula retrieval, natural language premise selection, math word problem solving, and informal theorem proving) from recent years, highlighting prevailing methodologies, existing limitations, overarching trends, and promising avenues for future research.
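Of the five sub-areas surveyed, natural language premise selection is the simplest to illustrate in code: given a conjecture stated in natural language, candidate premises are ranked by predicted usefulness. The sketch below is a generic embedding-similarity ranker, not a method from the survey; the sentence-transformers model and the toy premise list are assumptions for illustration.

```python
from sentence_transformers import SentenceTransformer, util

# Illustrative premise selection as embedding-based ranking:
# score candidate premises against a conjecture by cosine similarity.
model = SentenceTransformer("all-MiniLM-L6-v2")  # assumed off-the-shelf encoder

conjecture = "The sum of two even integers is even."
premises = [
    "An integer n is even if n = 2k for some integer k.",
    "The product of two odd integers is odd.",
    "Addition of integers is commutative.",
]

scores = util.cos_sim(model.encode(conjecture, convert_to_tensor=True),
                      model.encode(premises, convert_to_tensor=True))[0]

for premise, score in sorted(zip(premises, scores.tolist()), key=lambda p: -p[1]):
    print(f"{score:.3f}  {premise}")
```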
Citations: 1
T3L: Translate-and-Test Transfer Learning for Cross-Lingual Text Classification
CAS Q1 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-01-01 DOI: 10.1162/tacl_a_00593
Inigo Jauregi Unanue, Gholamreza Haffari, Massimo Piccardi
Abstract: Cross-lingual text classification leverages text classifiers trained in a high-resource language to perform text classification in other languages with no or minimal fine-tuning (zero-/few-shot cross-lingual transfer). Nowadays, cross-lingual text classifiers are typically built on large-scale, multilingual language models (LMs) pretrained on a variety of languages of interest. However, the performance of these models varies significantly across languages and classification tasks, suggesting that the superposition of the language modelling and classification tasks is not always effective. For this reason, in this paper we propose revisiting the classic “translate-and-test” pipeline to neatly separate the translation and classification stages. The proposed approach couples 1) a neural machine translator translating from the targeted language to a high-resource language, with 2) a text classifier trained in the high-resource language, but the neural machine translator generates “soft” translations to permit end-to-end backpropagation during fine-tuning of the pipeline. Extensive experiments have been carried out over three cross-lingual text classification datasets (XNLI, MLDoc, and MultiEURLEX), with the results showing that the proposed approach has significantly improved performance over a competitive baseline.
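The pivotal detail is the “soft” translation: rather than committing to discrete target tokens, each decoding step's distribution over the target vocabulary is mapped to an expected embedding for the downstream classifier, which keeps the whole pipeline differentiable. A minimal sketch of that coupling, with placeholder tensors standing in for the actual translator and classifier (not the paper's implementation):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Sketch of a "soft translation": each decoding step keeps its probability
# distribution over the target vocabulary and is mapped to an expected embedding,
# so the classification loss can backpropagate into the translator.
vocab_size, embed_dim, num_classes, seq_len = 32000, 512, 3, 20

decoder_logits = torch.randn(1, seq_len, vocab_size, requires_grad=True)  # stand-in for MT decoder output
classifier_embedding = nn.Embedding(vocab_size, embed_dim)
classifier_head = nn.Linear(embed_dim, num_classes)

probs = torch.softmax(decoder_logits, dim=-1)           # (1, seq_len, vocab)
soft_embeds = probs @ classifier_embedding.weight       # expected embedding per decoding step
logits = classifier_head(soft_embeds.mean(dim=1))       # crude mean pooling for the sketch

loss = F.cross_entropy(logits, torch.tensor([1]))
loss.backward()
print(decoder_logits.grad.abs().sum().item() > 0)       # True: gradient reaches the "translator"
```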
Citations: 0
DMDD: A Large-Scale Dataset for Dataset Mentions Detection
CAS Q1 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2023-01-01 DOI: 10.1162/tacl_a_00592
Huitong Pan, Qi Zhang, Eduard Dragut, Cornelia Caragea, Longin Jan Latecki
Abstract: The recognition of dataset names is a critical task for automatic information extraction in scientific literature, enabling researchers to understand and identify research opportunities. However, existing corpora for dataset mention detection are limited in size and naming diversity. In this paper, we introduce the Dataset Mentions Detection Dataset (DMDD), the largest publicly available corpus for this task. DMDD consists of the DMDD main corpus, comprising 31,219 scientific articles with over 449,000 dataset mentions weakly annotated in the format of in-text spans, and an evaluation set, which comprises 450 scientific articles manually annotated for evaluation purposes. We use DMDD to establish baseline performance for dataset mention detection and linking. By analyzing the performance of various models on DMDD, we are able to identify open problems in dataset mention detection. We invite the community to use our dataset as a challenge to develop novel dataset mention detection models.
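Since DMDD annotates dataset mentions as in-text spans, a natural baseline treats the task as token-level sequence labeling. The sketch below shows only the BIO encoding such a tagger would consume; the label set, example sentence, and span indices are hypothetical and not taken from the corpus.

```python
from dataclasses import dataclass

# Illustrative BIO encoding for dataset-mention spans.
LABELS = ["O", "B-DATASET", "I-DATASET"]  # assumed label set

@dataclass
class Example:
    tokens: list
    labels: list

sentence = "We evaluate our model on SQuAD and on the CoNLL-2003 corpus .".split()
spans = [(5, 6), (9, 10)]  # hypothetical token ranges covering "SQuAD" and "CoNLL-2003"

labels = ["O"] * len(sentence)
for start, end in spans:
    labels[start] = "B-DATASET"
    for i in range(start + 1, end):
        labels[i] = "I-DATASET"

example = Example(tokens=sentence, labels=labels)
for tok, lab in zip(example.tokens, example.labels):
    print(f"{tok}\t{lab}")
```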
Citations: 2
Dubbing in Practice: A Large Scale Study of Human Localization With Insights for Automatic Dubbing
IF 10.9 CAS Q1 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2022-12-23 DOI: 10.1162/tacl_a_00551
William Brannon, Yogesh Virkar, Brian Thompson
We investigate how humans perform the task of dubbing video content from one language into another, leveraging a novel corpus of 319.57 hours of video from 54 professionally produced titles. This is the first such large-scale study we are aware of. The results challenge a number of assumptions commonly made in both qualitative literature on human dubbing and machine-learning literature on automatic dubbing, arguing for the importance of vocal naturalness and translation quality over commonly emphasized isometric (character length) and lip-sync constraints, and for a more qualified view of the importance of isochronic (timing) constraints. We also find substantial influence of the source-side audio on human dubs through channels other than the words of the translation, pointing to the need for research on ways to preserve speech characteristics, as well as transfer of semantic properties such as emphasis and emotion, in automatic dubbing systems.
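The isometric (character length) and isochronic (timing) constraints discussed above can be made concrete as simple ratios between a source segment and its dub. The sketch below computes both for a toy pair; the data structure and example values are assumptions for illustration, not measurements from the corpus.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    text: str
    start: float  # seconds
    end: float

def isometric_ratio(src: Segment, dub: Segment) -> float:
    """Character-length ratio of the dub relative to the source."""
    return len(dub.text) / max(len(src.text), 1)

def isochrony_overlap(src: Segment, dub: Segment) -> float:
    """Fraction of the source interval covered by the dub interval."""
    overlap = max(0.0, min(src.end, dub.end) - max(src.start, dub.start))
    return overlap / max(src.end - src.start, 1e-6)

src = Segment("I never said that.", 12.0, 13.4)
dub = Segment("Das habe ich nie gesagt.", 12.1, 13.5)

print(f"isometric ratio: {isometric_ratio(src, dub):.2f}")     # ~1.0 means near-isometric
print(f"isochrony overlap: {isochrony_overlap(src, dub):.2f}")
```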
Citations: 8
Why Does Surprisal From Larger Transformer-Based Language Models Provide a Poorer Fit to Human Reading Times?
IF 10.9 CAS Q1 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2022-12-23 DOI: 10.1162/tacl_a_00548
Byung-Doh Oh, William Schuler
This work presents a linguistic analysis into why larger Transformer-based pre-trained language models with more parameters and lower perplexity nonetheless yield surprisal estimates that are less predictive of human reading times. First, regression analyses show a strictly monotonic, positive log-linear relationship between perplexity and fit to reading times for the more recently released five GPT-Neo variants and eight OPT variants on two separate datasets, replicating earlier results limited to just GPT-2 (Oh et al., 2022). Subsequently, analysis of residual errors reveals a systematic deviation of the larger variants, such as underpredicting reading times of named entities and making compensatory overpredictions for reading times of function words such as modals and conjunctions. These results suggest that the propensity of larger Transformer-based models to ‘memorize’ sequences during training makes their surprisal estimates diverge from humanlike expectations, which warrants caution in using pre-trained language models to study human language processing.
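Surprisal here is the negative log probability a language model assigns to each token given its left context. A minimal sketch of computing it with an off-the-shelf causal LM through Hugging Face transformers follows; the gpt2 checkpoint and the example sentence are placeholders, and the paper's GPT-Neo and OPT variants would be loaded the same way.

```python
import math
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Per-token surprisal, -log2 P(w_t | w_<t), from a causal language model.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

enc = tokenizer("The editor compared model surprisal to human reading times.", return_tensors="pt")
ids = enc["input_ids"][0]

with torch.no_grad():
    log_probs = torch.log_softmax(model(**enc).logits, dim=-1)

# Token 0 has no left context in this simple setup, so start at position 1.
for t in range(1, ids.size(0)):
    surprisal = -log_probs[0, t - 1, ids[t]].item() / math.log(2)
    print(f"{tokenizer.decode(ids[t]):>15s}  {surprisal:6.2f} bits")
```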
Citations: 30
Assessing the Capacity of Transformer to Abstract Syntactic Representations: A Contrastive Analysis Based on Long-distance Agreement
IF 10.9 CAS Q1 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2022-12-08 DOI: 10.1162/tacl_a_00531
Bingzhi Li, Guillaume Wisniewski, Benoît Crabbé
Many studies have shown that transformers are able to predict subject-verb agreement, demonstrating their ability to uncover an abstract representation of the sentence in an unsupervised way. Recently, Li et al. (2021) found that transformers were also able to predict the object-past participle agreement in French, the modeling of which in formal grammar is fundamentally different from that of subject-verb agreement and relies on a movement and an anaphora resolution. To better understand transformers’ internal working, we propose to contrast how they handle these two kinds of agreement. Using probing and counterfactual analysis methods, our experiments on French agreements show that (i) the agreement task suffers from several confounders that partially question the conclusions drawn so far and (ii) transformers handle subject-verb and object-past participle agreements in a way that is consistent with their modeling in theoretical linguistics.
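Agreement is usually probed with minimal pairs: a model that tracks the subject across intervening material should assign higher probability to the agreeing verb form. A minimal sketch of that comparison with a causal LM, using an English pair for readability (the paper studies French object-past-participle agreement and additionally uses probing and counterfactual interventions beyond this simple check):

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Minimal agreement probe: compare the probability of a grammatical vs.
# ungrammatical continuation after a long-distance dependency.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prefix = "The keys to the old wooden cabinet"
candidates = [" are", " is"]  # grammatical vs. ungrammatical verb form

def continuation_logprob(prefix: str, continuation: str) -> float:
    ids = tokenizer(prefix + continuation, return_tensors="pt")["input_ids"]
    n_prefix = tokenizer(prefix, return_tensors="pt")["input_ids"].size(1)
    with torch.no_grad():
        log_probs = torch.log_softmax(model(input_ids=ids).logits, dim=-1)
    # Sum the log-probabilities of the continuation tokens given their left context.
    return sum(log_probs[0, t - 1, ids[0, t]].item() for t in range(n_prefix, ids.size(1)))

for cand in candidates:
    print(repr(cand), round(continuation_logprob(prefix, cand), 3))
```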
Citations: 3
The Emergence of Argument Structure in Artificial Languages
IF 10.9 CAS Q1 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2022-12-01 DOI: 10.1162/tacl_a_00524
Tom Bosc, Pascal Vincent
Abstract: Computational approaches to the study of language emergence can help us understand how natural languages are shaped by cognitive and sociocultural factors. Previous work focused on tasks where agents refer to a single entity. In contrast, we study how agents predicate, that is, how they express that some relation holds between several entities. We introduce a setup where agents talk about a variable number of entities that can be partially observed by the listener. In the presence of a least-effort pressure, they tend to discuss only entities that are not observed by the listener. Thus we can obtain artificial phrases that denote a single entity, as well as artificial sentences that denote several entities. In natural languages, if we ignore the verb, phrases are usually concatenated, either in a specific order or by adding case markers to form sentences. Our setup allows us to quantify how much this holds in emergent languages using a metric we call concatenability. We also measure transitivity, which quantifies the importance of word order. We demonstrate the usefulness of this new setup and metrics for studying factors that influence argument structure. We compare agents having access to input representations structured into pre-segmented objects with properties, versus unstructured representations. Our results indicate that the awareness of object structure yields a more natural sentence organization.
Citations: 0
Coreference Resolution through a seq2seq Transition-Based System
IF 10.9 CAS Q1 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2022-11-22 DOI: 10.1162/tacl_a_00543
Bernd Bohnet, Chris Alberti, Michael Collins
Most recent coreference resolution systems use search algorithms over possible spans to identify mentions and resolve coreference. We instead present a coreference resolution system that uses a text-to-text (seq2seq) paradigm to predict mentions and links jointly. We implement the coreference system as a transition system and use multilingual T5 as an underlying language model. We obtain state-of-the-art accuracy on the CoNLL-2012 datasets with 83.3 F1-score for English (a 2.3 higher F1-score than previous work [Dobrovolskii, 2021]) using only CoNLL data for training, 68.5 F1-score for Arabic (+4.1 higher than previous work), and 74.3 F1-score for Chinese (+5.3). In addition we use the SemEval-2010 data sets for experiments in the zero-shot setting, a few-shot setting, and supervised setting using all available training data. We obtain substantially higher zero-shot F1-scores for 3 out of 4 languages than previous approaches and significantly exceed previous supervised state-of-the-art results for all five tested languages. We provide the code and models as open source.1
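In a text-to-text formulation, the model reads a passage and emits a structured string from which coreference links can be parsed. The sketch below shows one hypothetical input/output encoding and the standard seq2seq loss over it with multilingual T5; the format is invented for illustration and is not the transition system or training scheme used in the paper.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Hypothetical text-to-text encoding for coreference resolution; the
# "mention -> antecedent" target format is invented for illustration only.
tokenizer = AutoTokenizer.from_pretrained("google/mt5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")

source = "resolve coreference: Anna met Marie . She thanked her for the gift ."
target = "She -> Anna ; her -> Marie"  # what a fine-tuned model would be trained to emit

batch = tokenizer(source, text_target=target, return_tensors="pt")
loss = model(**batch).loss  # standard seq2seq cross-entropy on the target string
print(float(loss))
```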
Citations: 6
MACSum: Controllable Summarization with Mixed Attributes
IF 10.9 CAS Q1 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2022-11-09 DOI: 10.1162/tacl_a_00575
Yusen Zhang, Yang Liu, Ziyi Yang, Yuwei Fang, Yulong Chen, Dragomir R. Radev, Chenguang Zhu, Michael Zeng, Rui Zhang
Abstract: Controllable summarization allows users to generate customized summaries with specified attributes. However, due to the lack of designated annotations of controlled summaries, existing work has to craft pseudo datasets by adapting generic summarization benchmarks. Furthermore, most research focuses on controlling single attributes individually (e.g., a short summary or a highly abstractive summary) rather than controlling a mix of attributes together (e.g., a short and highly abstractive summary). In this paper, we propose MACSum, the first human-annotated summarization dataset for controlling mixed attributes. It contains source texts from two domains, news articles and dialogues, with human-annotated summaries controlled by five designed attributes (Length, Extractiveness, Specificity, Topic, and Speaker). We propose two simple and effective parameter-efficient approaches for the new task of mixed controllable summarization based on hard prompt tuning and soft prefix tuning. Results and analysis demonstrate that hard prompt models yield the best performance on most metrics and human evaluations. However, mixed-attribute control is still challenging for summarization tasks. Our dataset and code are available at https://github.com/psunlpgroup/MACSum.
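Hard prompt tuning here amounts to expressing the desired attribute values as plain tokens prepended to the source text, so a standard summarizer can condition on them, while soft prefix tuning instead learns continuous prefix vectors. A minimal sketch of building such a hard-prompted input over the five MACSum attributes; the exact template wording is an assumption, not the paper's:

```python
# Assumed hard-prompt template over MACSum's five control attributes.
ATTRIBUTES = ["length", "extractiveness", "specificity", "topic", "speaker"]

def build_hard_prompt(source_text: str, controls: dict) -> str:
    """Prepend attribute/value pairs as plain text so a seq2seq summarizer can condition on them."""
    parts = [f"{attr}: {controls[attr]}" for attr in ATTRIBUTES if attr in controls]
    return " | ".join(parts) + " || " + source_text

controls = {"length": "short", "extractiveness": "high", "topic": "budget negotiations"}
article = "The two parties met on Tuesday to discuss the budget ..."
print(build_hard_prompt(article, controls))
# length: short | extractiveness: high | topic: budget negotiations || The two parties met ...
```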
Citations: 6
An End-to-End Contrastive Self-Supervised Learning Framework for Language Understanding
IF 10.9 CAS Q1 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2022-11-01 DOI: 10.1162/tacl_a_00521
Hongchao Fang, P. Xie
Abstract: Self-supervised learning (SSL) methods such as Word2vec, BERT, and GPT have shown great effectiveness in language understanding. Contrastive learning, as a recent SSL approach, has attracted increasing attention in NLP. Contrastive learning learns data representations by predicting whether two augmented data instances are generated from the same original data example. Previous contrastive learning methods perform data augmentation and contrastive learning separately. As a result, the augmented data may not be optimal for contrastive learning. To address this problem, we propose a four-level optimization framework that performs data augmentation and contrastive learning end-to-end, to enable the augmented data to be tailored to the contrastive learning task. This framework consists of four learning stages, including training machine translation models for sentence augmentation, pretraining a text encoder using contrastive learning, finetuning a text classification model, and updating weights of translation data by minimizing the validation loss of the classification model, which are performed in a unified way. Experiments on datasets in the GLUE benchmark (Wang et al., 2018a) and on datasets used in Gururangan et al. (2020) demonstrate the effectiveness of our method.
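The contrastive component can be summarized by the standard objective: embeddings of two augmentations of the same example are pulled together and pushed away from the other examples in the batch. The sketch below implements that InfoNCE-style loss in isolation; it does not reproduce the paper's four-level optimization over translator, encoder, and classifier.

```python
import torch
import torch.nn.functional as F

def info_nce(z1: torch.Tensor, z2: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """Contrastive loss: z1[i] and z2[i] are embeddings of two augmentations of example i."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature          # (batch, batch) similarity matrix
    targets = torch.arange(z1.size(0))          # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

# Toy usage with random "sentence embeddings" standing in for encoder outputs.
batch, dim = 8, 128
z1, z2 = torch.randn(batch, dim), torch.randn(batch, dim)
print(info_nce(z1, z2).item())
```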
Citations: 1