Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing: latest articles

Knowledge Injected Prompt Based Fine-tuning for Multi-label Few-shot ICD Coding.

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

Pub Date : 2022-12-01

Zhichao Yang, Shufan Wang, Bhanu Pratap Singh Rawat, Avijit Mitra, Hong Yu

Automatic International Classification of Diseases (ICD) coding aims to assign multiple ICD codes to a medical note with an average length of 3,000+ tokens. This task is challenging due to the high-dimensional space of multi-label assignment (tens of thousands of ICD codes) and the long-tail challenge: only a few codes (common diseases) are frequently assigned, while most codes (rare diseases) are infrequently assigned. This study addresses the long-tail challenge by adapting a prompt-based fine-tuning technique with label semantics, which has been shown to be effective in the few-shot setting. To further enhance performance in the medical domain, we propose a knowledge-enhanced Longformer that injects three types of domain-specific knowledge (hierarchy, synonym, and abbreviation) through additional pretraining with contrastive learning. Experiments on MIMIC-III-full, a benchmark dataset for code assignment, show that our proposed method outperforms the previous state-of-the-art method by 14.5% in macro F1 (from 10.3 to 11.8, P<0.001). To further test our model in the few-shot setting, we created a new rare-disease coding dataset, MIMIC-III-rare50, on which our model improves macro F1 from 17.1 to 30.4 and micro F1 from 17.2 to 32.6 compared to the previous method.
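
The abstract above reports both macro and micro F1, and the difference between the two is what makes the long tail visible. The following is a minimal sketch of the two averages, not the authors' evaluation code; the label names and the true-positive/false-positive/false-negative counts are hypothetical and chosen only to show that poorly predicted rare codes pull macro F1 down while barely moving micro F1. The last line recomputes the relative gain from the rounded scores quoted in the abstract.

```python
def f1(tp, fp, fn):
    """F1 from raw counts; returns 0.0 when a code is never predicted or assigned."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(counts):
    """counts maps each label to a (tp, fp, fn) tuple."""
    # Macro: average the per-label F1 scores, so every code weighs the same.
    macro = sum(f1(*c) for c in counts.values()) / len(counts)
    # Micro: pool the counts first, so frequent codes dominate.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return f1(tp, fp, fn), macro


# Hypothetical counts: one frequent code predicted well, two rare codes predicted poorly.
counts = {
    "common_code": (90, 10, 10),
    "rare_code_a": (1, 0, 4),
    "rare_code_b": (0, 1, 5),
}
micro, macro = micro_macro_f1(counts)
print(f"micro F1 = {micro:.3f}, macro F1 = {macro:.3f}")

# Relative macro-F1 gain from the rounded scores quoted in the abstract (10.3 to 11.8).
relative_gain = (11.8 - 10.3) / 10.3
print(f"relative gain = {relative_gain:.1%}")
```

Because macro F1 weighs every code equally, it is the more sensitive of the two metrics to rare codes, which is why the abstract leads with it.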

Volume : 2022 Pages : 1767-1781 PDF : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9958514/pdf/nihms-1875185.pdf

Citations: 0

That's the Wrong Lung! Evaluating and Improving the Interpretability of Unsupervised Multimodal Encoders for Medical Data.

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

Pub Date : 2022-12-01

Denis Jered McInerney, Geoffrey Young, Jan-Willem van de Meent, Byron C Wallace

Pretraining multimodal models on Electronic Health Records (EHRs) provides a means of learning representations that can transfer to downstream tasks with minimal supervision. Recent multimodal models induce soft local alignments between image regions and sentences. This is of particular interest in the medical domain, where alignments might highlight regions in an image relevant to specific phenomena described in free-text. While past work has suggested that attention "heatmaps" can be interpreted in this manner, there has been little evaluation of such alignments. We compare alignments from a state-of-the-art multimodal (image and text) model for EHR with human annotations that link image regions to sentences. Our main finding is that the text has an often weak or unintuitive influence on attention; alignments do not consistently reflect basic anatomical information. Moreover, synthetic modifications - such as substituting "left" for "right" - do not substantially influence highlights. Simple techniques such as allowing the model to opt out of attending to the image and few-shot finetuning show promise in terms of their ability to improve alignments with very little or no supervision. We make our code and checkpoints open-source.

Volume : 2022 Pages : 3626-3648 PDF : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10124183/pdf/nihms-1890274.pdf

Citations: 0

PromptEHR: Conditional Electronic Healthcare Records Generation with Prompt Learning.

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

Pub Date : 2022-12-01 DOI: 10.18653/v1/2022.emnlp-main.185

Zifeng Wang, Jimeng Sun

Accessing longitudinal multimodal Electronic Healthcare Records (EHRs) is challenging due to privacy concerns, which hinders the use of ML for healthcare applications. Synthetic EHR generation bypasses the need to share sensitive real patient records. However, existing methods generate single-modal EHRs by unconditional generation or by longitudinal inference, which offers little flexibility and produces unrealistic EHRs. In this work, we propose to formulate EHR generation as a text-to-text translation task by language models (LMs), which enables highly flexible event imputation during generation. We also design prompt learning to control the generation conditioned on numerical and categorical demographic features. We evaluate synthetic EHR quality with two perplexity measures accounting for their longitudinal pattern (longitudinal imputation perplexity, lpl) and the connections across modalities (cross-modality imputation perplexity, mpl). Moreover, we utilize two adversaries, membership and attribute inference attacks, for privacy-preserving evaluation. Experiments on MIMIC-III data demonstrate the superiority of our methods on realistic EHR generation (53.1% decrease of lpl and 45.3% decrease of mpl on average compared to the best baselines) with low privacy risks.
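
Both lpl and mpl in the abstract above are perplexity measures. As a reminder of what a decrease in perplexity means, here is a minimal sketch of the standard definition (the exponential of the average negative log-probability). It does not reproduce the paper's lpl or mpl, which additionally specify which events are masked and imputed; the probabilities below are hypothetical.

```python
import math


def perplexity(probs):
    """exp of the mean negative log-probability the model gave to the held-out items."""
    nll = -sum(math.log(p) for p in probs) / len(probs)
    return math.exp(nll)


# Hypothetical probabilities a model assigns to three held-out medical events.
confident = perplexity([0.5, 0.5, 0.5])
uncertain = perplexity([0.1, 0.1, 0.1])
print(confident, uncertain)
```

Lower perplexity means the model finds the held-out events less surprising, so the reported decreases indicate records that are better predicted.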

Volume : 2022 Pages : 2873-2885 PDF : https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11824924/pdf/

Citations: 0

Biomedical NER for the Enterprise with Distillated BERN2 and the Kazu Framework

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

Pub Date : 2022-12-01 DOI: 10.48550/arXiv.2212.00223

Wonjin Yoon, Richard Jackson, Elliot Ford, V. Poroshin, Jaewoo Kang

In order to assist the drug discovery/development process, pharmaceutical companies often apply biomedical NER and linking techniques over internal and public corpora. Decades of study in the field of BioNLP have produced a plethora of algorithms, systems and datasets. However, our experience has been that no single open source system meets all the requirements of a modern pharmaceutical company. In this work, we describe these requirements according to our experience of the industry, and present Kazu, a highly extensible, scalable open source framework designed to support BioNLP for the pharmaceutical sector. Kazu is built around a computationally efficient version of the BERN2 NER model (TinyBERN2), and subsequently wraps several other BioNLP technologies into one coherent system. The Kazu framework is open source: https://github.com/AstraZeneca/KAZU

Citations: 1

Quadapter: Adapter for GPT-2 Quantization

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

Pub Date : 2022-11-30 DOI: 10.48550/arXiv.2211.16912

Minseop Park, J. You, Markus Nagel, Simyung Chang

Transformer language models such as GPT-2 are difficult to quantize because of outliers in activations leading to a large quantization error. To adapt to the error, one must use quantization-aware training, which entails a fine-tuning process based on the dataset and the training pipeline identical to those for the original model. Pretrained language models, however, often do not grant access to their datasets and training pipelines, forcing us to rely on arbitrary ones for fine-tuning. In that case, it is observed that quantization-aware training overfits the model to the fine-tuning data. For quantization without overfitting, we introduce a quantization adapter (Quadapter), a small set of parameters that are learned to make activations quantization-friendly by scaling them channel-wise. It keeps the model parameters unchanged. By applying our method to the challenging task of quantizing GPT-2, we demonstrate that it effectively prevents the overfitting and improves the quantization performance.
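
The effect of channel-wise scaling on outlier-heavy activations can be illustrated with a small sketch. This is not the Quadapter implementation: Quadapter learns its scaling parameters, whereas the sketch below substitutes a fixed heuristic (each channel's maximum magnitude) and assumes the scaling is undone after quantization. The activation values and all function names are hypothetical.

```python
def fake_quantize(x, step, bits=8):
    """Symmetric uniform quantization of one value, returned in floating point."""
    qmax = 2 ** (bits - 1) - 1
    q = max(-qmax, min(qmax, round(x / step)))
    return q * step


def per_tensor_error(acts, bits=8):
    """Total absolute error when one step size is shared by all channels."""
    qmax = 2 ** (bits - 1) - 1
    step = max(abs(v) for row in acts for v in row) / qmax
    return sum(abs(v - fake_quantize(v, step, bits)) for row in acts for v in row)


def channel_scaled_error(acts, bits=8):
    """Total absolute error when each channel is rescaled before quantization."""
    qmax = 2 ** (bits - 1) - 1
    n_channels = len(acts[0])
    # Heuristic stand-in for learned scales: the largest magnitude in each channel.
    scales = [max(abs(row[c]) for row in acts) or 1.0 for c in range(n_channels)]
    scaled = [[row[c] / scales[c] for c in range(n_channels)] for row in acts]
    step = max(abs(v) for row in scaled for v in row) / qmax
    error = 0.0
    for row, scaled_row in zip(acts, scaled):
        for c in range(n_channels):
            # Undo the channel scale after quantization to compare in the original units.
            restored = fake_quantize(scaled_row[c], step, bits) * scales[c]
            error += abs(row[c] - restored)
    return error


# Hypothetical activations: rows are tokens, columns are channels; the last channel is an outlier.
acts = [
    [0.11, -0.42, 60.0],
    [0.35, 0.27, -55.0],
    [-0.23, 0.08, 58.0],
]
print(per_tensor_error(acts), channel_scaled_error(acts))
```

With a single shared step size, the outlier channel sets the quantization grid and the small-magnitude channels are rounded coarsely; rescaling each channel first lets every channel use the full integer range.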

Citations: 5

Learning Label Modular Prompts for Text Classification in the Wild

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

Pub Date : 2022-11-30 DOI: 10.48550/arXiv.2211.17142

Hailin Chen, Amrita Saha, Shafiq R. Joty, Steven C. H. Hoi

Machine learning models usually assume i.i.d. data during training and testing, but data and tasks in the real world often change over time. To emulate the transient nature of the real world, we propose a challenging but practical task: text classification in-the-wild, which introduces different non-stationary training/testing stages. Decomposing a complex task into modular components can enable robust generalisation under such non-stationary environments. However, current modular approaches in NLP do not take advantage of recent advances in parameter-efficient tuning of pretrained language models. To close this gap, we propose ModularPrompt, a label-modular prompt tuning framework for text classification tasks. In ModularPrompt, the input prompt consists of a sequence of soft label prompts, each encoding modular knowledge related to the corresponding class label. In two of the most formidable settings, ModularPrompt outperforms relevant baselines by a large margin, demonstrating strong generalisation ability. We also conduct comprehensive analysis to validate whether the learned prompts satisfy properties of a modular representation.
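
The composition described in the abstract, an input prompt built from one soft prompt per class label, can be sketched as plain list concatenation. This is only an illustration of the idea, not the ModularPrompt code: the embedding size, prompt length, label names, random initialisation, and function names are all assumptions, and no training is shown.

```python
import random

EMBED_DIM = 4   # hypothetical embedding size
PROMPT_LEN = 2  # hypothetical number of soft tokens per label


def init_label_prompt(rng):
    """One soft label prompt: PROMPT_LEN trainable vectors (randomly initialised here)."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(EMBED_DIM)] for _ in range(PROMPT_LEN)]


def build_input(label_prompts, active_labels, token_embeddings):
    """Concatenate the prompts of the labels in play, then the token embeddings."""
    prompt = []
    for label in active_labels:
        prompt.extend(label_prompts[label])
    return prompt + token_embeddings


rng = random.Random(0)
label_prompts = {name: init_label_prompt(rng) for name in ["sports", "politics", "science"]}
tokens = [[0.0] * EMBED_DIM for _ in range(5)]  # placeholder embeddings for a 5-token text

# Different stages can use different label sets by reusing the same per-label prompts.
stage_one = build_input(label_prompts, ["sports", "politics"], tokens)
stage_two = build_input(label_prompts, ["politics", "science"], tokens)
print(len(stage_one), len(stage_two))
```

Because each label owns its own prompt, a later stage with a different label set reuses the prompts it shares with earlier stages instead of relearning a single monolithic prompt.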

Volume : 19 1 Pages : 1677-1690

Citations: 1

Camelira: An Arabic Multi-Dialect Morphological Disambiguator

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

Pub Date : 2022-11-30 DOI: 10.48550/arXiv.2211.16807

Ossama Obeid, Go Inoue, Nizar Habash

We present Camelira, a web-based Arabic multi-dialect morphological disambiguation tool that covers four major variants of Arabic: Modern Standard Arabic, Egyptian, Gulf, and Levantine. Camelira offers a user-friendly web interface that allows researchers and language learners to explore various linguistic information, such as part-of-speech, morphological features, and lemmas. Our system also provides an option to automatically choose an appropriate dialect-specific disambiguator based on the prediction of a dialect identification component. Camelira is publicly accessible at http://camelira.camel-lab.com.

Citations: 3

Open Relation and Event Type Discovery with Type Abstraction

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

Pub Date : 2022-11-30 DOI: 10.48550/arXiv.2212.00178

Sha Li, Heng Ji, Jiawei Han

Conventional “closed-world” information extraction (IE) approaches rely on human ontologies to define the scope for extraction. As a result, such approaches fall short when applied to new domains. This calls for systems that can automatically infer new types from given corpora, a task which we refer to as type discovery. To tackle this problem, we introduce the idea of type abstraction, where the model is prompted to generalize and name the type. Then we use the similarity between inferred names to induce clusters. Observing that this abstraction-based representation is often complementary to the entity/trigger token representation, we set up these two representations as two views and design our model as a co-training framework. Our experiments on multiple relation extraction and event extraction datasets consistently show the advantage of our type abstraction approach.

Citations: 4

Towards Generalized Open Information Extraction

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

Pub Date : 2022-11-29 DOI: 10.48550/arXiv.2211.15987

Yu Bowen, Zhenyu Zhang, Jingyang Li, Haiyang Yu, Tingwen Liu, Jianguo Sun, Yongbin Li, Bin Wang

Open Information Extraction (OpenIE) facilitates the open-domain discovery of textual facts. However, the prevailing solutions evaluate OpenIE models on in-domain test sets aside from the training corpus, which certainly violates the initial task principle of domain-independence. In this paper, we propose to advance OpenIE towards a more realistic scenario: generalizing over unseen target domains with different data distributions from the source training domains, termed Generalized OpenIE. For this purpose, we first introduce GLOBE, a large-scale human-annotated multi-domain OpenIE benchmark, to examine the robustness of recent OpenIE models to domain shifts; the relative performance degradation of up to 70% highlights the challenges of generalized OpenIE. Then, we propose DragonIE, which explores a minimalist graph expression of textual facts, the directed acyclic graph, to improve OpenIE generalization. Extensive experiments demonstrate that DragonIE beats the previous methods in both in-domain and out-of-domain settings by as much as 6.0% absolute F1, but there is still ample room for improvement.

Volume : 45 1 Pages : 1439-1453

Citations: 0

Abstract Visual Reasoning with Tangram Shapes

Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing

Pub Date : 2022-11-29 DOI: 10.48550/arXiv.2211.16492

Anya Ji, Noriyuki Kojima, N. Rush, Alane Suhr, Wai Keen Vong, Robert D. Hawkins, Yoav Artzi

We introduce KiloGram, a resource for studying abstract visual reasoning in humans and machines. Drawing on the history of tangram puzzles as stimuli in cognitive science, we build a richly annotated dataset that, with >1k distinct stimuli, is orders of magnitude larger and more diverse than prior resources. It is both visually and linguistically richer, moving beyond whole shape descriptions to include segmentation maps and part labels. We use this resource to evaluate the abstract visual reasoning capacities of recent multi-modal models. We observe that pre-trained weights demonstrate limited abstract reasoning, which dramatically improves with fine-tuning. We also observe that explicitly describing parts aids abstract reasoning for both humans and models, especially when jointly encoding the linguistic and visual inputs.

Citations: 17
