
Proceedings of the Conference on Empirical Methods in Natural Language Processing — Latest Publications

Dynamic Augmentation Data Selection for Few-shot Text Classification
Guangliang Liu, Lifeng Jin, Owen Yuan, Jiayu Zhou
Data augmentation is a popular method for fine-tuning pre-trained language models to increase robustness and performance. Whether augmentation data comes from modifying gold training data (in-sample augmentation) or is harvested from general-domain unlabeled data (out-of-sample augmentation), its quality is the key to successful fine-tuning. In this paper, we propose a dynamic data selection method that selects effective augmentation data from different augmentation sources according to the model's learning stage, by identifying a set of augmentation samples that best facilitates the learning process of the current model. The method first filters out augmentation samples with noisy pseudo labels through a curriculum learning strategy, then estimates the effectiveness of the retained augmentation data via its influence scores on the current model at every update, so that the data selection process is tightly tailored to the model parameters. A two-stage augmentation strategy applies in-sample and out-of-sample augmentation at different learning stages. Experiments with both kinds of augmentation data on a variety of sentence classification tasks show that our method outperforms strong baselines. Analysis confirms that data effectiveness is dynamic and that model learning stages matter when utilizing augmentation data.
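The influence-score selection the abstract describes can be approximated with gradient alignment: an augmentation sample helps the current model if its loss gradient points in the same direction as the gradient of the loss on held-out gold data. Below is a minimal PyTorch sketch of that idea; the dot-product approximation, the per-sample loop, and all names are illustrative rather than the paper's exact formulation.

```python
import torch

def flat_grad(loss, params):
    """Flatten the gradient of `loss` w.r.t. `params` into one vector."""
    grads = torch.autograd.grad(loss, params, retain_graph=True)
    return torch.cat([g.reshape(-1) for g in grads])

def select_augmentation(model, loss_fn, dev_batch, candidates, k):
    """Keep the k candidates whose gradients align best with the dev-loss
    gradient of the *current* model, so selection tracks the learning stage."""
    params = [p for p in model.parameters() if p.requires_grad]
    dev_x, dev_y = dev_batch
    dev_g = flat_grad(loss_fn(model(dev_x), dev_y), params)
    scores = []
    for x, y in candidates:  # one (input, pseudo-label) pair at a time
        g = flat_grad(loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)), params)
        scores.append(torch.dot(dev_g, g).item())  # higher = more helpful now
    top = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k]
    return [candidates[i] for i in top]
```

Because the score depends on the current parameters, it would be recomputed at every update, matching the abstract's claim that data effectiveness is dynamic.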
{"title":"Dynamic Augmentation Data Selection for Few-shot Text Classification","authors":"Guangliang Liu, Lifeng Jin, Owen Yuan, Jiayu Zhou","doi":"10.18653/v1/2022.findings-emnlp.356","DOIUrl":"https://doi.org/10.18653/v1/2022.findings-emnlp.356","url":null,"abstract":"Data augmentation has been a popular method for fine-tuning pre-trained language models to increase model robustness and performance. With augmentation data coming from modifying gold train data (in-sample augmentation) or being harvested from general domain unlabeled data (out-of-sample augmentation), the quality of such data is the key to successful fine-tuning. In this paper, we propose a dynamic data selection method to select effective augmentation data from different augmentation sources according to the model's learning stage, by identifying a set of augmentation samples that optimally facilitates the learning process of the most current model. The method firstly filters out augmentation samples with noisy pseudo labels through a curriculum learning strategy, then estimates the effectiveness of reserved augmentation data by its influence scores on the current model at every update, allowing the data selection process tightly tailored to model parameters. And the two-stage augmentation strategy considers in-sample augmentation and out-of-sample augmentation in different learning stages. Experiments with both kinds of augmentation data on a variety of sentence classification tasks show that our method outperforms strong baselines, proving the effectiveness of our method. Analysis confirms the dynamic nature of the data effectiveness and the importance of model learning stages in utilization of augmentation data.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"176 5 1","pages":"4870-4881"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"75520104","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
MedJEx: A Medical Jargon Extraction Model with Wiki's Hyperlink Span and Contextualized Masked Language Model Score.
Sunjae Kwon, Zonghai Yao, Harmon S Jordan, David A Levy, Brian Corner, Hong Yu

This paper proposes a new natural language processing (NLP) application for identifying medical jargon terms potentially difficult for patients to comprehend in electronic health record (EHR) notes. We first present a novel and publicly available dataset with expert-annotated medical jargon terms from 18K+ EHR note sentences (MedJ). Then, we introduce a novel medical jargon extraction (MedJEx) model, which has been shown to outperform existing state-of-the-art NLP models. First, MedJEx improved overall performance when it was trained on an auxiliary Wikipedia hyperlink span dataset, where hyperlink spans provide additional Wikipedia articles to explain the spans (or terms), and then fine-tuned on the annotated MedJ data. Second, we found that a contextualized masked language model score was beneficial for detecting domain-specific unfamiliar jargon terms. Moreover, our results show that training on the auxiliary Wikipedia hyperlink span datasets improved six out of eight biomedical named entity recognition benchmark datasets. Both MedJ and MedJEx are publicly available.

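The contextualized masked-language-model score the abstract mentions can be illustrated as a pseudo-log-likelihood: mask each token of a candidate term in its sentence context and average the log-probability the MLM assigns to the true tokens, with low scores flagging unfamiliar jargon. The sketch below is one plausible reading of that idea, assuming a vanilla BERT checkpoint; the paper's exact scoring function may differ.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

@torch.no_grad()
def mlm_span_score(sentence: str, span: str) -> float:
    """Average log-probability the MLM assigns to the span's tokens,
    each masked one at a time in its sentence context."""
    enc = tok(sentence, return_tensors="pt")
    span_ids = tok(span, add_special_tokens=False)["input_ids"]
    ids = enc["input_ids"][0].tolist()
    # locate the span's token positions in the sentence (first occurrence)
    start = next(i for i in range(len(ids)) if ids[i:i + len(span_ids)] == span_ids)
    total = 0.0
    for offset, true_id in enumerate(span_ids):
        masked = enc["input_ids"].clone()
        masked[0, start + offset] = tok.mask_token_id
        logits = mlm(input_ids=masked, attention_mask=enc["attention_mask"]).logits
        logp = torch.log_softmax(logits[0, start + offset], dim=-1)
        total += logp[true_id].item()
    return total / len(span_ids)  # low score -> likely unfamiliar jargon

print(mlm_span_score("The patient shows bilateral pleural effusion.", "pleural effusion"))
```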
{"title":"MedJEx: A Medical Jargon Extraction Model with Wiki's Hyperlink Span and Contextualized Masked Language Model Score.","authors":"Sunjae Kwon,&nbsp;Zonghai Yao,&nbsp;Harmon S Jordan,&nbsp;David A Levy,&nbsp;Brian Corner,&nbsp;Hong Yu","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>This paper proposes a new natural language processing (NLP) application for identifying medical jargon terms potentially difficult for patients to comprehend from electronic health record (EHR) notes. We first present a novel and publicly available dataset with expert-annotated medical jargon terms from 18K+ EHR note sentences (<i>MedJ</i>). Then, we introduce a novel medical jargon extraction (<i>MedJEx</i>) model which has been shown to outperform existing state-of-the-art NLP models. First, MedJEx improved the overall performance when it was trained on an auxiliary Wikipedia hyperlink span dataset, where hyperlink spans provide additional Wikipedia articles to explain the spans (or terms), and then fine-tuned on the annotated MedJ data. Secondly, we found that a contextualized masked language model score was beneficial for detecting domain-specific unfamiliar jargon terms. Moreover, our results show that training on the auxiliary Wikipedia hyperlink span datasets improved six out of eight biomedical named entity recognition benchmark datasets. Both MedJ and MedJEx are publicly available.</p>","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"2022 ","pages":"11733-11751"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10129059/pdf/nihms-1843448.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9384374","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Focus! Relevant and Sufficient Context Selection for News Image Captioning
Mingyang Zhou, Grace Luo, Anna Rohrbach, Zhou Yu
News Image Captioning requires describing an image by leveraging additional context from a news article. Previous works only coarsely leverage the article to extract the necessary context, which makes it challenging for models to identify relevant events and named entities. In our paper, we first demonstrate that by combining fine-grained context that captures the key named entities (obtained via an oracle) with global context that summarizes the news, we can dramatically improve the model's ability to generate accurate news captions. This raises the question: how can such key entities be extracted from an image automatically? We propose to use the pre-trained vision-and-language retrieval model CLIP to localize the visually grounded entities in the news article, and then capture the non-visual entities via an open relation extraction model. Our experiments demonstrate that by simply selecting a better context from the article, we can significantly improve the performance of existing models and achieve new state-of-the-art results on multiple benchmarks.
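The CLIP-based localization step can be illustrated by scoring candidate entity strings from the article against the image and keeping the best-matching ones. A minimal sketch using the public Hugging Face CLIP checkpoint follows; the entity list, file name, and top-k cut-off are illustrative, and the paper's actual grounding procedure may be richer.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32").eval()
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

@torch.no_grad()
def visually_grounded(image_path: str, entities: list, top_k: int = 3):
    """Rank candidate article entities by CLIP image-text similarity."""
    image = Image.open(image_path)
    inputs = processor(text=entities, images=image,
                       return_tensors="pt", padding=True)
    sims = model(**inputs).logits_per_image[0]              # one score per entity
    order = sims.argsort(descending=True)[:top_k].tolist()  # best matches first
    return [(entities[i], sims[i].item()) for i in order]

# e.g. candidate entities extracted from the article by an NER tagger
print(visually_grounded("news_photo.jpg",
                        ["Angela Merkel", "Berlin", "the Bundestag", "2021"]))
```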
{"title":"Focus! Relevant and Sufficient Context Selection for News Image Captioning","authors":"Mingyang Zhou, Grace Luo, Anna Rohrbach, Zhou Yu","doi":"10.48550/arXiv.2212.00843","DOIUrl":"https://doi.org/10.48550/arXiv.2212.00843","url":null,"abstract":"News Image Captioning requires describing an image by leveraging additional context from a news article. Previous works only coarsely leverage the article to extract the necessary context, which makes it challenging for models to identify relevant events and named entities. In our paper, we first demonstrate that by combining more fine-grained context that captures the key named entities (obtained via an oracle) and the global context that summarizes the news, we can dramatically improve the model's ability to generate accurate news captions. This begs the question, how to automatically extract such key entities from an image? We propose to use the pre-trained vision and language retrieval model CLIP to localize the visually grounded entities in the news article and then capture the non-visual entities via an open relation extraction model. Our experiments demonstrate that by simply selecting a better context from the article, we can significantly improve the performance of existing models and achieve new state-of-the-art performance on multiple benchmarks.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"141 1","pages":"6078-6088"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86247318","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
MedCLIP: Contrastive Learning from Unpaired Medical Images and Text.
Zifeng Wang, Zhenbang Wu, Dinesh Agarwal, Jimeng Sun

Existing vision-text contrastive learning methods like CLIP (Radford et al., 2021) aim to match paired image and caption embeddings while pushing others apart, which improves representation transferability and supports zero-shot prediction. However, medical image-text datasets are orders of magnitude smaller than the general image-caption datasets available on the internet. Moreover, previous methods encounter many false negatives: images and reports from separate patients may carry the same semantics yet are wrongly treated as negatives. In this paper, we decouple images and texts for multimodal contrastive learning, scaling the usable training data combinatorially at low cost. We also propose to replace the InfoNCE loss with a semantic matching loss based on medical knowledge to eliminate false negatives in contrastive learning. We show that MedCLIP is a simple yet effective framework: it outperforms state-of-the-art methods on zero-shot prediction, supervised classification, and image-text retrieval. Surprisingly, we observe that with only 20K pre-training examples, MedCLIP wins over the state-of-the-art method (which uses ≈200K examples).

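The proposed replacement of InfoNCE can be sketched as a soft-target cross-entropy: instead of one-hot targets that treat every off-diagonal image-report pair as a negative, the targets are derived from an external semantic-similarity matrix, so same-semantics pairs from different patients are not pushed apart. A minimal PyTorch sketch follows; the soft-label construction via softmax is an assumption, not necessarily the paper's exact recipe.

```python
import torch
import torch.nn.functional as F

def semantic_matching_loss(img_emb, txt_emb, sem_sim, temperature=0.07):
    """Cross-entropy between the model's image-text similarity matrix and
    soft targets built from knowledge-based semantic similarity.

    img_emb, txt_emb: [N, D] L2-normalized embeddings
    sem_sim:          [N, N] semantic similarity of each image/text pair
    """
    logits = img_emb @ txt_emb.t() / temperature        # model's predicted matches
    targets = F.softmax(sem_sim / temperature, dim=-1)  # soft labels, rows sum to 1
    loss_i2t = -(targets * F.log_softmax(logits, dim=-1)).sum(-1).mean()
    loss_t2i = -(targets.t() * F.log_softmax(logits.t(), dim=-1)).sum(-1).mean()
    return 0.5 * (loss_i2t + loss_t2i)
```

With a one-hot `sem_sim` this reduces to the usual symmetric InfoNCE, which makes the role of the knowledge-derived soft targets easy to see.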
{"title":"MedCLIP: Contrastive Learning from Unpaired Medical Images and Text.","authors":"Zifeng Wang, Zhenbang Wu, Dinesh Agarwal, Jimeng Sun","doi":"10.18653/v1/2022.emnlp-main.256","DOIUrl":"10.18653/v1/2022.emnlp-main.256","url":null,"abstract":"<p><p>Existing vision-text contrastive learning like CLIP (Radford et al., 2021) aims to match the paired image and caption embeddings while pushing others apart, which improves representation transferability and supports zero-shot prediction. However, medical image-text datasets are orders of magnitude below the general images and captions from the internet. Moreover, previous methods encounter many false negatives, i.e., images and reports from separate patients probably carry the same semantics but are wrongly treated as negatives. In this paper, we decouple images and texts for multimodal contrastive learning thus scaling the usable training data in a combinatorial magnitude with low cost. We also propose to replace the InfoNCE loss with semantic matching loss based on medical knowledge to eliminate false negatives in contrastive learning. We prove that MedCLIP is a simple yet effective framework: it outperforms state-of-the-art methods on zero-shot prediction, supervised classification, and image-text retrieval. Surprisingly, we observe that with only 20K pre-training data, MedCLIP wins over the state-of-the-art method (using ≈200K data).</p>","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"2022 ","pages":"3876-3887"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11323634/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141984150","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Knowledge Injected Prompt Based Fine-tuning for Multi-label Few-shot ICD Coding.
Zhichao Yang, Shufan Wang, Bhanu Pratap Singh Rawat, Avijit Mitra, Hong Yu

Automatic International Classification of Diseases (ICD) coding aims to assign multiple ICD codes to a medical note with an average length of 3,000+ tokens. This task is challenging due to the high-dimensional space of multi-label assignment (tens of thousands of ICD codes) and the long-tail challenge: a few codes (common diseases) are frequently assigned while most codes (rare diseases) are assigned infrequently. This study addresses the long-tail challenge by adapting a prompt-based fine-tuning technique with label semantics, which has been shown to be effective under the few-shot setting. To further enhance performance in the medical domain, we propose a knowledge-enhanced longformer that injects three kinds of domain-specific knowledge: hierarchy, synonyms, and abbreviations, with additional pretraining using contrastive learning. Experiments on MIMIC-III-full, a benchmark dataset for code assignment, show that our proposed method outperforms the previous state-of-the-art method by 14.5% in macro F1 (from 10.3 to 11.8, P<0.001). To further test our model in the few-shot setting, we created a new rare-disease coding dataset, MIMIC-III-rare50, on which our model improves macro F1 from 17.1 to 30.4 and micro F1 from 17.2 to 32.6 compared to the previous method.

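One plausible way to realize prompt-based fine-tuning with label semantics for multi-label coding is a per-code yes/no cloze: verbalize each code's description, append a mask, and compare the MLM's "yes" vs. "no" logits. The sketch below stands in a vanilla BERT for the paper's knowledge-enhanced longformer; the prompt wording, checkpoint, and code descriptions are all illustrative assumptions.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# The paper uses a knowledge-enhanced Longformer; vanilla BERT stands in here.
tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()
YES, NO = tok.convert_tokens_to_ids("yes"), tok.convert_tokens_to_ids("no")

@torch.no_grad()
def code_scores(note: str, code_descriptions: dict) -> dict:
    """Score each ICD code independently via a yes/no cloze prompt."""
    scores = {}
    for code, desc in code_descriptions.items():
        prompt = f"{note} This note describes {desc}: {tok.mask_token}."
        enc = tok(prompt, return_tensors="pt", truncation=True)
        pos = (enc["input_ids"][0] == tok.mask_token_id).nonzero()[0, 0]
        logits = mlm(**enc).logits[0, pos]
        # probability mass on "yes" vs "no" = per-code multi-label decision
        scores[code] = torch.softmax(logits[[YES, NO]], dim=-1)[0].item()
    return scores

print(code_scores("Patient admitted with chest pain and elevated troponin.",
                  {"I21.9": "acute myocardial infarction"}))
```

Because each code is scored against its own label text rather than an output index, rare codes can be predicted without any dedicated classification weights, which is what makes the prompt formulation attractive in the few-shot regime.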
{"title":"Knowledge Injected Prompt Based Fine-tuning for Multi-label Few-shot ICD Coding.","authors":"Zhichao Yang, Shufan Wang, Bhanu Pratap Singh Rawat, Avijit Mitra, Hong Yu","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Automatic International Classification of Diseases (ICD) coding aims to assign multiple ICD codes to a medical note with average length of 3,000+ tokens. This task is challenging due to a high-dimensional space of multi-label assignment (tens of thousands of ICD codes) and the long-tail challenge: only a few codes (common diseases) are frequently assigned while most codes (rare diseases) are infrequently assigned. This study addresses the long-tail challenge by adapting a prompt-based fine-tuning technique with label semantics, which has been shown to be effective under few-shot setting. To further enhance the performance in medical domain, we propose a knowledge-enhanced longformer by injecting three domain-specific knowledge: hierarchy, synonym, and abbreviation with additional pretraining using contrastive learning. Experiments on MIMIC-III-full, a benchmark dataset of code assignment, show that our proposed method outperforms previous state-of-the-art method in 14.5% in marco F1 (from 10.3 to 11.8, P<0.001). To further test our model on few-shot setting, we created a new rare diseases coding dataset, MIMIC-III-rare50, on which our model improves marco F1 from 17.1 to 30.4 and micro F1 from 17.2 to 32.6 compared to previous method.</p>","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"2022 ","pages":"1767-1781"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9958514/pdf/nihms-1875185.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"10860154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Beyond Additive Fusion: Learning Non-Additive Multimodal Interactions.
Torsten Wörtwein, Lisa B Sheeber, Nicholas Allen, Jeffrey F Cohn, Louis-Philippe Morency

Multimodal fusion addresses the problem of analyzing spoken words in the multimodal context, including visual expressions and prosodic cues. Even when multimodal models lead to performance improvements, it is often unclear whether bimodal and trimodal interactions are learned or whether modalities are processed independently of each other. We propose Multimodal Residual Optimization (MRO) to separate unimodal, bimodal, and trimodal interactions in a multimodal model. This improves interpretability as the multimodal interaction can be quantified. Inspired by Occam's razor, the main intuition of MRO is that (simpler) unimodal contributions should be learned before learning (more complex) bimodal and trimodal interactions. For example, bimodal predictions should learn to correct the mistakes (residuals) of unimodal predictions, thereby letting the bimodal predictions focus on the remaining bimodal interactions. Empirically, we observe that MRO successfully separates unimodal, bimodal, and trimodal interactions while not degrading predictive performance. We complement our empirical results with a human perception study and observe that MRO learns multimodal interactions that align with human judgments.

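The residual idea can be made concrete in a few lines: unimodal terms predict first, and bimodal terms are trained on what remains after detaching the unimodal prediction, so higher-order terms cannot re-learn lower-order effects. A minimal PyTorch sketch follows, assuming linear heads and omitting the trimodal stage and the paper's training schedule.

```python
import torch
import torch.nn as nn

class MROHead(nn.Module):
    """Unimodal terms predict first; bimodal terms fit what is left over.
    Detaching the lower-order prediction stops gradients from letting the
    bimodal terms absorb unimodal effects."""

    def __init__(self, dims, out_dim=1):
        super().__init__()
        d_t, d_a, d_v = dims  # text, audio, visual feature sizes
        self.f_t = nn.Linear(d_t, out_dim)
        self.f_a = nn.Linear(d_a, out_dim)
        self.f_v = nn.Linear(d_v, out_dim)
        self.f_ta = nn.Linear(d_t + d_a, out_dim)
        self.f_tv = nn.Linear(d_t + d_v, out_dim)
        self.f_av = nn.Linear(d_a + d_v, out_dim)

    def forward(self, t, a, v):
        y_uni = self.f_t(t) + self.f_a(a) + self.f_v(v)
        base = y_uni.detach()  # bimodal terms only see the residual task
        y_bi = base + self.f_ta(torch.cat([t, a], -1)) \
                    + self.f_tv(torch.cat([t, v], -1)) \
                    + self.f_av(torch.cat([a, v], -1))
        return y_uni, y_bi  # supervise y_uni first, then y_bi on the same labels
```

The magnitude of each bimodal term then directly quantifies the learned interaction, which is the interpretability benefit the abstract claims.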
{"title":"Beyond Additive Fusion: Learning Non-Additive Multimodal Interactions.","authors":"Torsten Wörtwein, Lisa B Sheeber, Nicholas Allen, Jeffrey F Cohn, Louis-Philippe Morency","doi":"10.18653/v1/2022.findings-emnlp.344","DOIUrl":"10.18653/v1/2022.findings-emnlp.344","url":null,"abstract":"<p><p>Multimodal fusion addresses the problem of analyzing spoken words in the multimodal context, including visual expressions and prosodic cues. Even when multimodal models lead to performance improvements, it is often unclear whether bimodal and trimodal interactions are learned or whether modalities are processed independently of each other. We propose Multimodal Residual Optimization (MRO) to separate unimodal, bimodal, and trimodal interactions in a multimodal model. This improves interpretability as the multimodal interaction can be quantified. Inspired by Occam's razor, the main intuition of MRO is that (simpler) unimodal contributions should be learned before learning (more complex) bimodal and trimodal interactions. For example, bimodal predictions should learn to correct the mistakes (residuals) of unimodal predictions, thereby letting the bimodal predictions focus on the remaining bimodal interactions. Empirically, we observe that MRO successfully separates unimodal, bimodal, and trimodal interactions while not degrading predictive performance. We complement our empirical results with a human perception study and observe that MRO learns multimodal interactions that align with human judgments.</p>","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"58 1","pages":"4681-4696"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12665182/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"80522280","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
That's the Wrong Lung! Evaluating and Improving the Interpretability of Unsupervised Multimodal Encoders for Medical Data.
Denis Jered McInerney, Geoffrey Young, Jan-Willem van de Meent, Byron C Wallace

Pretraining multimodal models on Electronic Health Records (EHRs) provides a means of learning representations that can transfer to downstream tasks with minimal supervision. Recent multimodal models induce soft local alignments between image regions and sentences. This is of particular interest in the medical domain, where alignments might highlight regions in an image relevant to specific phenomena described in free-text. While past work has suggested that attention "heatmaps" can be interpreted in this manner, there has been little evaluation of such alignments. We compare alignments from a state-of-the-art multimodal (image and text) model for EHR with human annotations that link image regions to sentences. Our main finding is that the text has an often weak or unintuitive influence on attention; alignments do not consistently reflect basic anatomical information. Moreover, synthetic modifications - such as substituting "left" for "right" - do not substantially influence highlights. Simple techniques such as allowing the model to opt out of attending to the image and few-shot finetuning show promise in terms of their ability to improve alignments with very little or no supervision. We make our code and checkpoints open-source.

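The "opt out of attending to the image" technique mentioned in the abstract can be sketched by appending one learned null slot to the image regions, giving cross-attention somewhere harmless to put probability mass when the text is not visually grounded. The module below is an illustrative reading of that idea under those assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class OptOutCrossAttention(nn.Module):
    """Cross-attention from text to image regions, plus one learned
    'null' region the model can attend to instead of any real region."""

    def __init__(self, d_model: int, n_heads: int = 8):
        super().__init__()
        self.null_region = nn.Parameter(torch.zeros(1, 1, d_model))
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, text_states, image_regions):
        # [B, R, D] image regions -> [B, R+1, D] with the null slot appended
        null = self.null_region.expand(image_regions.size(0), -1, -1)
        kv = torch.cat([image_regions, null], dim=1)
        out, weights = self.attn(text_states, kv, kv)
        # weights[..., -1] = how much each text token opted out of the image
        return out, weights[..., -1]
```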
{"title":"That's the Wrong Lung! Evaluating and Improving the Interpretability of Unsupervised Multimodal Encoders for Medical Data.","authors":"Denis Jered McInerney,&nbsp;Geoffrey Young,&nbsp;Jan-Willem van de Meent,&nbsp;Byron C Wallace","doi":"","DOIUrl":"","url":null,"abstract":"<p><p>Pretraining multimodal models on Electronic Health Records (EHRs) provides a means of learning representations that can transfer to downstream tasks with minimal supervision. Recent multimodal models induce soft local alignments between image regions and sentences. This is of particular interest in the medical domain, where alignments might highlight regions in an image relevant to specific phenomena described in free-text. While past work has suggested that attention \"heatmaps\" can be interpreted in this manner, there has been little evaluation of such alignments. We compare alignments from a state-of-the-art multimodal (image and text) model for EHR with human annotations that link image regions to sentences. Our main finding is that the text has an often weak or unintuitive influence on attention; alignments do not consistently reflect basic anatomical information. Moreover, synthetic modifications - such as substituting \"left\" for \"right\" - do not substantially influence highlights. Simple techniques such as allowing the model to opt out of attending to the image and few-shot finetuning show promise in terms of their ability to improve alignments with very little or no supervision. We make our code and checkpoints open-source.</p>","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"2022 ","pages":"3626-3648"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10124183/pdf/nihms-1890274.pdf","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"9384389","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
PromptEHR: Conditional Electronic Healthcare Records Generation with Prompt Learning.
Zifeng Wang, Jimeng Sun

Accessing longitudinal multimodal Electronic Healthcare Records (EHRs) is challenging due to privacy concerns, which hinders the use of ML for healthcare applications. Synthetic EHR generation bypasses the need to share sensitive real patient records. However, existing methods generate single-modal EHRs by unconditional generation or by longitudinal inference, which lacks flexibility and yields unrealistic EHRs. In this work, we propose to formulate EHR generation as a text-to-text translation task for language models (LMs), which allows highly flexible event imputation during generation. We also design prompt learning to condition the generation on numerical and categorical demographic features. We evaluate synthetic EHR quality with two perplexity measures accounting for the longitudinal pattern (longitudinal imputation perplexity, lpl) and the connections across modalities (cross-modality imputation perplexity, mpl). Moreover, we utilize two adversaries, membership and attribute inference attacks, for privacy-preserving evaluation. Experiments on MIMIC-III data demonstrate the superiority of our method for realistic EHR generation (a 53.1% decrease in lpl and a 45.3% decrease in mpl on average compared to the best baselines) with low privacy risks.

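The text-to-text formulation can be sketched by serializing demographics and visit history into one source string with a mask where the missing visit should go, then letting a seq2seq LM fill it in. The sketch below uses an off-the-shelf BART checkpoint as a stand-in (it would need the paper's fine-tuning to produce realistic codes); the serialization format is an assumption.

```python
from transformers import BartForConditionalGeneration, BartTokenizer

tok = BartTokenizer.from_pretrained("facebook/bart-base")
lm = BartForConditionalGeneration.from_pretrained("facebook/bart-base").eval()

def impute_visit(demographics: dict, prior_visits: list) -> str:
    """Serialize demographic prompts + visit history into one source string
    and let the seq2seq LM generate (impute) the masked visit as text."""
    prompt = "; ".join(f"{k}: {v}" for k, v in demographics.items())
    source = f"{prompt} | " + " | ".join(prior_visits) + f" | {tok.mask_token}"
    ids = tok(source, return_tensors="pt").input_ids
    out = lm.generate(ids, max_new_tokens=40, num_beams=4)
    return tok.decode(out[0], skip_special_tokens=True)

print(impute_visit({"age": 63, "gender": "F"},
                   ["visit1: I10 E11.9", "visit2: I10 N18.3"]))
```

Because the mask can be placed at any position in the serialized record, the same model supports imputing past, future, or cross-modal events, which is the flexibility the abstract highlights.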
{"title":"PromptEHR: Conditional Electronic Healthcare Records Generation with Prompt Learning.","authors":"Zifeng Wang, Jimeng Sun","doi":"10.18653/v1/2022.emnlp-main.185","DOIUrl":"10.18653/v1/2022.emnlp-main.185","url":null,"abstract":"<p><p>Accessing longitudinal multimodal Electronic Healthcare Records (EHRs) is challenging due to privacy concerns, which hinders the use of ML for healthcare applications. Synthetic EHRs generation bypasses the need to share sensitive real patient records. However, existing methods generate single-modal EHRs by unconditional generation or by longitudinal inference, which falls short of low flexibility and makes unrealistic EHRs. In this work, we propose to formulate EHRs generation as a text-to-text translation task by language models (LMs), which suffices to highly flexible event imputation during generation. We also design prompt learning to control the generation conditioned by numerical and categorical demographic features. We evaluate synthetic EHRs quality by two perplexity measures accounting for their longitudinal pattern (longitudinal imputation perplexity, lpl) and the connections cross modalities (cross-modality imputation perplexity, mpl). Moreover, we utilize two adversaries: membership and attribute inference attacks for privacy-preserving evaluation. Experiments on MIMIC-III data demonstrate the superiority of our methods on realistic EHRs generation (53.1% decrease of lpl and 45.3% decrease of mpl on average compared to the best baselines) with low privacy risks.</p>","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"2022 ","pages":"2873-2885"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11824924/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143415859","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Biomedical NER for the Enterprise with Distillated BERN2 and the Kazu Framework
Wonjin Yoon, Richard Jackson, Elliot Ford, V. Poroshin, Jaewoo Kang
To assist the drug discovery/development process, pharmaceutical companies often apply biomedical NER and linking techniques over internal and public corpora. Decades of study in the field of BioNLP have produced a plethora of algorithms, systems, and datasets. However, our experience has been that no single open-source system meets all the requirements of a modern pharmaceutical company. In this work, we describe these requirements according to our experience of the industry, and present Kazu, a highly extensible, scalable open-source framework designed to support BioNLP for the pharmaceutical sector. Kazu is built around a computationally efficient version of the BERN2 NER model (TinyBERN2) and wraps several other BioNLP technologies into one coherent system. The Kazu framework is open-sourced: https://github.com/AstraZeneca/KAZU
{"title":"Biomedical NER for the Enterprise with Distillated BERN2 and the Kazu Framework","authors":"Wonjin Yoon, Richard Jackson, Elliot Ford, V. Poroshin, Jaewoo Kang","doi":"10.48550/arXiv.2212.00223","DOIUrl":"https://doi.org/10.48550/arXiv.2212.00223","url":null,"abstract":"In order to assist the drug discovery/development process, pharmaceutical companies often apply biomedical NER and linking techniques over internal and public corpora. Decades of study of the field of BioNLP has produced a plethora of algorithms, systems and datasets. However, our experience has been that no single open source system meets all the requirements of a modern pharmaceutical company. In this work, we describe these requirements according to our experience of the industry, and present Kazu, a highly extensible, scalable open source framework designed to support BioNLP for the pharmaceutical sector. Kazu is a built around a computationally efficient version of the BERN2 NER model (TinyBERN2), and subsequently wraps several other BioNLP technologies into one coherent system. KAZU framework is open-sourced: https://github.com/AstraZeneca/KAZU","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"1 1","pages":"619-626"},"PeriodicalIF":0.0,"publicationDate":"2022-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83089284","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Quadapter: Adapter for GPT-2 Quantization
Minseop Park, J. You, Markus Nagel, Simyung Chang
Transformer language models such as GPT-2 are difficult to quantize because outliers in their activations lead to large quantization error. To adapt to the error, one must use quantization-aware training, which entails a fine-tuning process based on a dataset and training pipeline identical to those of the original model. Pretrained language models, however, often do not grant access to their datasets and training pipelines, forcing us to rely on arbitrary ones for fine-tuning. In that case, quantization-aware training is observed to overfit the model to the fine-tuning data. To quantize without overfitting, we introduce a quantization adapter (Quadapter), a small set of parameters that are learned to make activations quantization-friendly by scaling them channel-wise, while keeping the model parameters unchanged. By applying our method to the challenging task of quantizing GPT-2, we demonstrate that it effectively prevents overfitting and improves quantization performance.
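The core mechanism can be sketched as a learned channel-wise scale applied before activation quantization and undone after it: outlier channels are compressed onto the quantization grid while the scale and its inverse cancel in full precision, so the underlying model function (and its weights) stays unchanged. The quantizer below is a generic straight-through fake-quant stand-in, not the paper's exact setup.

```python
import torch
import torch.nn as nn

def fake_quant(x, num_bits=8):
    """Simulate uniform per-tensor quantization with a straight-through estimator."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = x.abs().amax().clamp(min=1e-8) / qmax
    q = (x / scale).round().clamp(-qmax - 1, qmax)
    return x + ((q * scale) - x).detach()  # forward: quantized; backward: identity

class Quadapter(nn.Module):
    """Channel-wise scale alpha applied before activation quantization and
    inverted after it. Large alpha on outlier channels shrinks them so the
    shared quantization grid fits the bulk of the activations, while
    alpha * (1/alpha) keeps the full-precision function unchanged."""

    def __init__(self, num_channels: int):
        super().__init__()
        self.alpha = nn.Parameter(torch.ones(num_channels))  # only trained params

    def forward(self, x):  # x: [..., num_channels]
        return fake_quant(x / self.alpha) * self.alpha
```

Only `alpha` receives gradients (via the quantization-error term), which matches the abstract's point that the model parameters themselves are left untouched, limiting the capacity available for overfitting the arbitrary fine-tuning data.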
{"title":"Quadapter: Adapter for GPT-2 Quantization","authors":"Minseop Park, J. You, Markus Nagel, Simyung Chang","doi":"10.48550/arXiv.2211.16912","DOIUrl":"https://doi.org/10.48550/arXiv.2211.16912","url":null,"abstract":"Transformer language models such as GPT-2 are difficult to quantize because of outliers in activations leading to a large quantization error. To adapt to the error, one must use quantization-aware training, which entails a fine-tuning process based on the dataset and the training pipeline identical to those for the original model. Pretrained language models, however, often do not grant access to their datasets and training pipelines, forcing us to rely on arbitrary ones for fine-tuning. In that case, it is observed that quantization-aware training overfits the model to the fine-tuning data. For quantization without overfitting, we introduce a quantization adapter (Quadapter), a small set of parameters that are learned to make activations quantization-friendly by scaling them channel-wise. It keeps the model parameters unchanged. By applying our method to the challenging task of quantizing GPT-2, we demonstrate that it effectively prevents the overfitting and improves the quantization performance.","PeriodicalId":74540,"journal":{"name":"Proceedings of the Conference on Empirical Methods in Natural Language Processing. Conference on Empirical Methods in Natural Language Processing","volume":"66 1","pages":"2510-2517"},"PeriodicalIF":0.0,"publicationDate":"2022-11-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"73526076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5