Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing: latest papers, page 6

Subword-Delimited Downsampling for Better Character-Level Translation

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

Pub Date : 2022-12-02 DOI: 10.48550/arXiv.2212.01304

Lukas Edman, Antonio Toral, Gertjan van Noord

Subword-level models have been the dominant paradigm in NLP. However, character-level models have the benefit of seeing each character individually, providing the model with more detailed information that ultimately could lead to better models. Recent works have shown character-level models to be competitive with subword models, but costly in terms of time and computation. Character-level models with a downsampling component alleviate this, but at the cost of quality, particularly for machine translation. This work analyzes the problems of previous downsampling methods and introduces a novel downsampling method which is informed by subwords. This new downsampling method not only outperforms existing downsampling methods, showing that downsampling characters can be done without sacrificing quality, but also leads to promising performance compared to subword models for translation.
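The abstract does not spell out how the subword-informed downsampling works, so the following is only a minimal sketch of the general idea, assuming that character vectors are mean-pooled within subword boundaries. The function names, the toy one-dimensional vectors, and the fixed-stride baseline are all illustrative assumptions, not the authors' implementation.

```python
from typing import List

Vector = List[float]


def mean_pool(vectors: List[Vector]) -> Vector:
    """Average a non-empty list of equal-length vectors element-wise."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def downsample_by_subword(char_vecs: List[Vector], subwords: List[str]) -> List[Vector]:
    """Pool character vectors into one vector per subword.

    Assumes the subwords concatenate to exactly the character sequence,
    so len(char_vecs) == sum(len(s) for s in subwords).
    """
    if len(char_vecs) != sum(len(s) for s in subwords):
        raise ValueError("subwords must cover the character sequence exactly")
    pooled, start = [], 0
    for piece in subwords:
        end = start + len(piece)
        pooled.append(mean_pool(char_vecs[start:end]))
        start = end
    return pooled


def downsample_fixed(char_vecs: List[Vector], stride: int) -> List[Vector]:
    """Baseline: pool every `stride` characters regardless of word structure."""
    return [mean_pool(char_vecs[i:i + stride]) for i in range(0, len(char_vecs), stride)]


# Toy 1-dimensional "embeddings" for the seven characters of "unhappy".
chars = [[1.0], [3.0], [2.0], [4.0], [6.0], [6.0], [8.0]]
by_subword = downsample_by_subword(chars, ["un", "happy"])
by_stride = downsample_fixed(chars, stride=3)
```

In this toy case the subword-delimited version yields two vectors aligned with "un" and "happy", while the fixed-stride baseline yields three vectors whose boundaries cut across the morphemes.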

Citations: 4

Semantic Role Labeling Meets Definition Modeling: Using Natural Language to Describe Predicate-Argument Structures

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

Pub Date : 2022-12-02 DOI: 10.48550/arXiv.2212.01094

Simone Conia, Edoardo Barba, Alessandro Sciré, Roberto Navigli

One of the common traits of past and present approaches for Semantic Role Labeling (SRL) is that they rely upon discrete labels drawn from a predefined linguistic inventory to classify predicate senses and their arguments. However, we argue this need not be the case. In this paper, we present an approach that leverages Definition Modeling to introduce a generalized formulation of SRL as the task of describing predicate-argument structures using natural language definitions instead of discrete labels. Our novel formulation takes a first step towards placing interpretability and flexibility foremost, and yet our experiments and analyses on PropBank-style and FrameNet-style, dependency-based and span-based SRL also demonstrate that a flexible model with an interpretable output does not necessarily come at the expense of performance. We release our software for research purposes at https://github.com/SapienzaNLP/dsrl.

Citations: 2

NarraSum: A Large-Scale Dataset for Abstractive Narrative Summarization

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

Pub Date : 2022-12-02 DOI: 10.48550/arXiv.2212.01476

Chao Zhao, Faeze Brahman, Kaiqiang Song, Wenlin Yao, Dian Yu, Snigdha Chaturvedi

Narrative summarization aims to produce a distilled version of a narrative to describe its most salient events and characters. Summarizing a narrative is challenging as it requires an understanding of event causality and character behaviors. To encourage research in this direction, we propose NarraSum, a large-scale narrative summarization dataset. It contains 122K narrative documents, which are collected from plot descriptions of movies and TV episodes with diverse genres, and their corresponding abstractive summaries. Experiments show that there is a large performance gap between humans and the state-of-the-art summarization models on NarraSum. We hope that this dataset will promote future research in summarization, as well as broader studies of natural language understanding and generation. The dataset is available at https://github.com/zhaochaocs/narrasum.

Citations: 1

Dynamic Augmentation Data Selection for Few-shot Text Classification.

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

Pub Date : 2022-12-01

Guangliang Liu, Owen Yuan, Lifeng Jin, Jiayu Zhou

Data augmentation has been a popular method for fine-tuning pre-trained language models to increase model robustness and performance. Whether augmentation data comes from modifying gold training data (in-sample augmentation) or is harvested from general-domain unlabeled data (out-of-sample augmentation), the quality of such data is the key to successful fine-tuning. In this paper, we propose a dynamic data selection method that selects effective augmentation data from different augmentation sources according to the model's learning stage, by identifying a set of augmentation samples that optimally facilitates the learning process of the current model. The method first filters out augmentation samples with noisy pseudo labels through a curriculum learning strategy, then estimates the effectiveness of the retained augmentation data by its influence scores on the current model at every update, allowing the data selection process to be tightly tailored to the model parameters. The two-stage augmentation strategy considers in-sample augmentation and out-of-sample augmentation in different learning stages. Experiments with both kinds of augmentation data on a variety of sentence classification tasks show that our method outperforms strong baselines, demonstrating its effectiveness. Analysis confirms the dynamic nature of data effectiveness and the importance of model learning stages in the use of augmentation data.
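The two steps described in the abstract (filter noisy pseudo labels, then rank what remains by influence on the current model) can be sketched roughly as follows. This is a toy illustration under stated assumptions: the confidence threshold, the dictionary layout, and the first-order influence estimate (a dot product between a sample gradient and a validation gradient) are hypothetical stand-ins, not the procedure from the paper.

```python
from typing import Dict, List


def curriculum_filter(samples: List[Dict], threshold: float) -> List[Dict]:
    """Stage 1: keep samples whose pseudo-label confidence meets the threshold."""
    return [s for s in samples if s["confidence"] >= threshold]


def influence_score(sample_grad: List[float], val_grad: List[float]) -> float:
    """Hypothetical first-order influence estimate: the dot product of the
    sample's gradient with the gradient of a validation loss. A positive
    value suggests a step on this sample would also reduce validation loss."""
    return sum(g * v for g, v in zip(sample_grad, val_grad))


def select_augmentation(samples: List[Dict], val_grad: List[float],
                        threshold: float, k: int) -> List[str]:
    """Stage 2: rank the filtered samples by influence and keep the top k ids."""
    kept = curriculum_filter(samples, threshold)
    ranked = sorted(kept, key=lambda s: influence_score(s["grad"], val_grad), reverse=True)
    return [s["id"] for s in ranked[:k]]


pool = [
    {"id": "a", "confidence": 0.95, "grad": [0.2, 0.1]},
    {"id": "b", "confidence": 0.40, "grad": [0.9, 0.9]},  # dropped: noisy pseudo label
    {"id": "c", "confidence": 0.90, "grad": [-0.3, 0.0]},
    {"id": "d", "confidence": 0.85, "grad": [0.5, 0.4]},
]
selected = select_augmentation(pool, val_grad=[1.0, 1.0], threshold=0.8, k=2)
```

Because the abstract says effectiveness is estimated at every update, a loop around `select_augmentation` would recompute the gradients after each parameter update, so the selected subset can change as training progresses.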

Pages: 4870-4881. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10097500/pdf/nihms-1888098.pdf

Citations: 0

Analogical Math Word Problems Solving with Enhanced Problem-Solution Association

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

Pub Date : 2022-12-01 DOI: 10.48550/arXiv.2212.00837

Zhenwen Liang, Jipeng Zhang, Xiangliang Zhang

Math word problem (MWP) solving is an important question-answering task that requires human-like reasoning ability. Analogical reasoning has long been used in mathematical education, as it enables students to apply common relational structures of mathematical situations to solve new problems. In this paper, we propose to build a novel MWP solver by leveraging analogical MWPs, which advances the solver's generalization ability across different kinds of MWPs. The key idea, named analogy identification, is to associate analogical MWP pairs in a latent space, i.e., to encode an MWP close to another analogical MWP while keeping it away from non-analogical ones. Moreover, a solution discriminator is integrated into the MWP solver to enhance the association between an MWP and its true solution. The evaluation results verify that our proposed analogical learning strategy improves the performance of MWP-BERT on Math23k beyond that of the state-of-the-art model Generate2Rank, with 5 times fewer parameters in the encoder. We also find that our model has a stronger generalization ability in solving difficult MWPs due to analogical learning from easy MWPs.
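Encoding an MWP close to an analogical one and away from non-analogical ones is the kind of objective commonly expressed with a triplet margin loss. The sketch below uses that loss purely as an illustration; the abstract does not state which loss the authors use, and the two-dimensional vectors and the margin value are made-up examples.

```python
import math
from typing import List


def euclidean(u: List[float], v: List[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor: List[float], analog: List[float],
                 non_analog: List[float], margin: float = 1.0) -> float:
    """Zero once the analogical problem is closer to the anchor than the
    non-analogical one by at least `margin`; positive otherwise."""
    return max(0.0, euclidean(anchor, analog) - euclidean(anchor, non_analog) + margin)


# Hypothetical latent encodings of four problems.
anchor = [0.0, 0.0]
analog = [0.0, 1.0]
far_non_analog = [3.0, 4.0]
near_non_analog = [0.0, 1.5]

loss_separated = triplet_loss(anchor, analog, far_non_analog)
loss_violating = triplet_loss(anchor, analog, near_non_analog)
```

The first triplet already satisfies the margin and contributes no loss; the second does not, so minimizing the loss would push the non-analogical problem further from the anchor.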

Pages: 9454-9464

Citations: 3

IRRGN: An Implicit Relational Reasoning Graph Network for Multi-turn Response Selection

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

Pub Date : 2022-12-01 DOI: 10.48550/arXiv.2212.00482

Jing-Hui Deng, Hengwei Dai, Xuewei Guo, Yuanchen Ju, Wei Peng

The task of response selection in multi-turn dialogue is to find the best option from all candidates. In order to improve the reasoning ability of the model, previous studies pay more attention to using explicit algorithms to model the dependencies between utterances, which are deterministic, limited and inflexible. In addition, few studies consider differences between the options before and after reasoning. In this paper, we propose an Implicit Relational Reasoning Graph Network to address these issues, which consists of the Utterance Relational Reasoner (URR) and the Option Dual Comparator (ODC). URR aims to implicitly extract dependencies between utterances, as well as between utterances and options, and to reason over them with relational graph convolutional networks. ODC focuses on perceiving the differences between the options through dual comparison, which can eliminate the interference of the noise options. Experimental results on two multi-turn dialogue reasoning benchmark datasets, MuTual and MuTualplus, show that our method significantly improves over four pre-trained language model baselines and achieves state-of-the-art performance. The model surpasses human performance for the first time on the MuTual dataset.
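For readers unfamiliar with relational graph convolution, a single layer can be sketched in plain Python as below. This follows the generic R-GCN update (one weight matrix per relation, per-relation normalization, a self-loop term, then ReLU); the relation names `utt-utt` and `utt-opt`, the toy features, and the weights are assumptions for illustration and do not reflect how IRRGN actually builds its graph.

```python
from typing import Dict, List, Tuple

Matrix = List[List[float]]
Vector = List[float]


def matvec(w: Matrix, x: Vector) -> Vector:
    """Multiply matrix w by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def rgcn_layer(h: List[Vector], edges: List[Tuple[int, int, str]],
               w_rel: Dict[str, Matrix], w_self: Matrix) -> List[Vector]:
    """One relational graph convolution step followed by ReLU.

    edges are (source, target, relation) triples. Each target node adds
    W_rel * h[source] for every incoming edge, divided by the number of
    incoming edges of that relation, on top of a self-loop term.
    """
    out = [matvec(w_self, hi) for hi in h]
    counts: Dict[Tuple[int, str], int] = {}
    for _, dst, rel in edges:
        counts[(dst, rel)] = counts.get((dst, rel), 0) + 1
    for src, dst, rel in edges:
        msg = matvec(w_rel[rel], h[src])  # messages use the layer input, not `out`
        norm = counts[(dst, rel)]
        out[dst] = [o + m / norm for o, m in zip(out[dst], msg)]
    return [[max(0.0, x) for x in row] for row in out]


identity = [[1.0, 0.0], [0.0, 1.0]]
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # nodes 0, 1: utterances; node 2: option
graph = [(0, 1, "utt-utt"), (0, 2, "utt-opt"), (1, 2, "utt-opt")]
weights = {"utt-utt": [[2.0, 0.0], [0.0, 2.0]], "utt-opt": identity}
updated = rgcn_layer(features, graph, weights, w_self=identity)
```

Keeping a separate weight matrix per relation is what lets utterance-to-utterance and utterance-to-option dependencies be transformed differently before they are aggregated.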

Pages: 8529-8541

Citations: 0

Dynamic Augmentation Data Selection for Few-shot Text Classification

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

Pub Date : 2022-12-01 DOI: 10.18653/v1/2022.findings-emnlp.356

Guangliang Liu, Lifeng Jin, Owen Yuan, Jiayu Zhou

Data augmentation has been a popular method for fine-tuning pre-trained language models to increase model robustness and performance. Whether augmentation data comes from modifying gold training data (in-sample augmentation) or is harvested from general-domain unlabeled data (out-of-sample augmentation), the quality of such data is the key to successful fine-tuning. In this paper, we propose a dynamic data selection method that selects effective augmentation data from different augmentation sources according to the model's learning stage, by identifying a set of augmentation samples that optimally facilitates the learning process of the current model. The method first filters out augmentation samples with noisy pseudo labels through a curriculum learning strategy, then estimates the effectiveness of the retained augmentation data by its influence scores on the current model at every update, allowing the data selection process to be tightly tailored to the model parameters. The two-stage augmentation strategy considers in-sample augmentation and out-of-sample augmentation in different learning stages. Experiments with both kinds of augmentation data on a variety of sentence classification tasks show that our method outperforms strong baselines, demonstrating its effectiveness. Analysis confirms the dynamic nature of data effectiveness and the importance of model learning stages in the use of augmentation data.

Pages: 4870-4881

Citations: 0

MedJEx: A Medical Jargon Extraction Model with Wiki's Hyperlink Span and Contextualized Masked Language Model Score.

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

Pub Date : 2022-12-01

Sunjae Kwon, Zonghai Yao, Harmon S Jordan, David A Levy, Brian Corner, Hong Yu

This paper proposes a new natural language processing (NLP) application for identifying medical jargon terms in electronic health record (EHR) notes that are potentially difficult for patients to comprehend. We first present a novel and publicly available dataset with expert-annotated medical jargon terms from 18K+ EHR note sentences (MedJ). Then, we introduce a novel medical jargon extraction (MedJEx) model which has been shown to outperform existing state-of-the-art NLP models. First, MedJEx improved its overall performance when it was trained on an auxiliary Wikipedia hyperlink span dataset, where hyperlink spans provide additional Wikipedia articles to explain the spans (or terms), and then fine-tuned on the annotated MedJ data. Second, we found that a contextualized masked language model score was beneficial for detecting domain-specific unfamiliar jargon terms. Moreover, our results show that training on the auxiliary Wikipedia hyperlink span datasets improved performance on six out of eight biomedical named entity recognition benchmark datasets. Both MedJ and MedJEx are publicly available.
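The abstract does not define the contextualized masked language model score. One plausible reading, sketched here as an assumption, is the average negative log-probability a masked language model assigns to a term's tokens when each is masked in context, so that terms the model finds hard to predict score higher. `toy_prob` and `TOY_PROBS` are hypothetical stand-ins for a real model, and the probabilities are invented.

```python
import math
from typing import Callable, List


def masked_lm_score(tokens: List[str], start: int, end: int,
                    prob_fn: Callable[[List[str], int, str], float]) -> float:
    """Average negative log-probability of the tokens in [start, end) when
    each one is masked in turn. Higher means less predictable from context."""
    total = 0.0
    for i in range(start, end):
        masked = tokens[:i] + ["[MASK]"] + tokens[i + 1:]
        total += -math.log(prob_fn(masked, i, tokens[i]))
    return total / (end - start)


# Hypothetical stand-in for a real masked language model: it ignores the
# context and returns made-up probabilities for the original token.
TOY_PROBS = {"pain": 0.5, "dyspnea": 0.01}


def toy_prob(masked: List[str], position: int, original: str) -> float:
    return TOY_PROBS.get(original, 0.1)


sentence = ["patient", "reports", "dyspnea", "and", "pain"]
score_jargon = masked_lm_score(sentence, 2, 3, toy_prob)
score_common = masked_lm_score(sentence, 4, 5, toy_prob)
```

With a real model, `prob_fn` would run the masked sentence through the network and read off the probability of the original token at the masked position.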

Pages: 11733-11751. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10129059/pdf/nihms-1843448.pdf

Citations: 0

Focus! Relevant and Sufficient Context Selection for News Image Captioning

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

Pub Date : 2022-12-01 DOI: 10.48550/arXiv.2212.00843

Mingyang Zhou, Grace Luo, Anna Rohrbach, Zhou Yu

News Image Captioning requires describing an image by leveraging additional context from a news article. Previous works only coarsely leverage the article to extract the necessary context, which makes it challenging for models to identify relevant events and named entities. In our paper, we first demonstrate that by combining more fine-grained context that captures the key named entities (obtained via an oracle) and the global context that summarizes the news, we can dramatically improve the model's ability to generate accurate news captions. This raises the question of how to automatically extract such key entities from an image. We propose to use the pre-trained vision and language retrieval model CLIP to localize the visually grounded entities in the news article and then capture the non-visual entities via an open relation extraction model. Our experiments demonstrate that by simply selecting a better context from the article, we can significantly improve the performance of existing models and achieve new state-of-the-art performance on multiple benchmarks.
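Localizing visually grounded entities with a retrieval model amounts to ranking candidate entities from the article by their similarity to the image. The sketch below shows only that ranking step, assuming embeddings are already available; the three-dimensional vectors and entity names are invented, and in practice both sides would come from CLIP's image and text encoders.

```python
import math
from typing import Dict, List


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def rank_entities(image_vec: List[float], entity_vecs: Dict[str, List[float]],
                  top_k: int) -> List[str]:
    """Return the top_k entity names most similar to the image embedding."""
    ranked = sorted(entity_vecs,
                    key=lambda name: cosine(image_vec, entity_vecs[name]),
                    reverse=True)
    return ranked[:top_k]


# Hypothetical embeddings for one image and three entities from its article.
image = [0.9, 0.1, 0.0]
entities = {
    "Jane Doe": [0.8, 0.2, 0.1],
    "Springfield": [0.1, 0.9, 0.2],
    "2015": [0.0, 0.1, 0.9],
}
visual_entities = rank_entities(image, entities, top_k=1)
```

Entities that rank low here (dates, for instance) are the non-visual ones that the abstract says are recovered separately through open relation extraction.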

Pages: 6078-6088

Citations: 2

MedCLIP: Contrastive Learning from Unpaired Medical Images and Text.

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

Pub Date : 2022-12-01 DOI: 10.18653/v1/2022.emnlp-main.256

Zifeng Wang, Zhenbang Wu, Dinesh Agarwal, Jimeng Sun

Existing vision-text contrastive learning methods such as CLIP (Radford et al., 2021) aim to match paired image and caption embeddings while pushing others apart, which improves representation transferability and supports zero-shot prediction. However, medical image-text datasets are orders of magnitude smaller than the general image and caption datasets from the internet. Moreover, previous methods encounter many false negatives, i.e., images and reports from separate patients probably carry the same semantics but are wrongly treated as negatives. In this paper, we decouple images and texts for multimodal contrastive learning, thus scaling the usable training data combinatorially at low cost. We also propose to replace the InfoNCE loss with a semantic matching loss based on medical knowledge to eliminate false negatives in contrastive learning. We show that MedCLIP is a simple yet effective framework: it outperforms state-of-the-art methods on zero-shot prediction, supervised classification, and image-text retrieval. Surprisingly, we observe that with only 20K pre-training data, MedCLIP outperforms the state-of-the-art method (using ≈200K data).
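The difference between a one-hot contrastive target and a semantic matching target can be illustrated with a small sketch. It assumes each image and each text carries a vector of clinical labels, builds soft targets from label similarity, and takes cross-entropy against the predicted image-to-text distribution. The label vectors, logits, single (image-to-text) direction, and absence of a temperature are simplifying assumptions, not the exact loss from the paper.

```python
import math
from typing import List


def softmax(xs: List[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def semantic_matching_loss(logits: List[List[float]], img_labels: List[List[float]],
                           txt_labels: List[List[float]]) -> float:
    """Cross-entropy between predicted image-to-text distributions and soft
    targets derived from label similarity, instead of one-hot pair targets."""
    loss = 0.0
    for i, row in enumerate(logits):
        target = softmax([cosine(img_labels[i], t) for t in txt_labels])
        pred = softmax(row)
        loss -= sum(t * math.log(p) for t, p in zip(target, pred))
    return loss / len(logits)


# Two images and three unpaired texts; texts 0 and 1 describe the same finding.
image_labels = [[1.0, 0.0], [0.0, 1.0]]
text_labels = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]

logits_shared = [[2.0, 2.0, 0.0], [0.0, 0.0, 2.0]]   # image 0 matches both texts
logits_one_hot = [[4.0, 0.0, 0.0], [0.0, 0.0, 2.0]]  # image 0 rejects text 1
loss_shared = semantic_matching_loss(logits_shared, image_labels, text_labels)
loss_one_hot = semantic_matching_loss(logits_one_hot, image_labels, text_labels)
```

Because the targets come from labels rather than from pairing, the number of images and texts need not match, which is the decoupling the abstract refers to; and a model that treats the semantically identical text 1 as a negative for image 0 is penalized rather than rewarded.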

Pages: 3876-3887. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11323634/pdf/

Citations: 0