Proceedings of the conference. Association for Computational Linguistics. North American Chapter. Meeting: Latest Articles

Translational NLP: A New Paradigm and General Principles for Natural Language Processing Research

Proceedings of the conference. Association for Computational Linguistics. North American Chapter. Meeting

Pub Date : 2021-04-16 DOI: 10.18653/V1/2021.NAACL-MAIN.325

Denis Newman-Griffis, J. Lehman, C. Rosé, H. Hochheiser

Natural language processing (NLP) research combines the study of universal principles, through basic science, with applied science targeting specific use cases and settings. However, the process of exchange between basic NLP and applications is often assumed to emerge naturally, resulting in many innovations going unapplied and many important questions left unstudied. We describe a new paradigm of Translational NLP, which aims to structure and facilitate the processes by which basic and applied NLP research inform one another. Translational NLP thus presents a third research paradigm, focused on understanding the challenges posed by application needs and how these challenges can drive innovation in basic science and technology design. We show that many significant advances in NLP research have emerged from the intersection of basic principles with application needs, and present a conceptual framework outlining the stakeholders and key questions in translational research. Our framework provides a roadmap for developing Translational NLP as a dedicated research area, and identifies general translational principles to facilitate exchange between basic and applied research.

Citations: 11

Paragraph-level Simplification of Medical Texts

Pub Date : 2021-04-12 DOI: 10.18653/V1/2021.NAACL-MAIN.395

Ashwin Devaraj, I. Marshall, Byron C. Wallace, J. Li

We consider the problem of learning to simplify medical texts. This is important because most reliable, up-to-date information in biomedicine is dense with jargon and thus practically inaccessible to the lay audience. Furthermore, manual simplification does not scale to the rapidly growing body of biomedical literature, motivating the need for automated approaches. Unfortunately, there are no large-scale resources available for this task. In this work we introduce a new corpus of parallel texts in English comprising technical and lay summaries of all published evidence pertaining to different clinical topics. We then propose a new metric based on likelihood scores from a masked language model pretrained on scientific texts. We show that this automated measure better differentiates between technical and lay summaries than existing heuristics. We introduce and evaluate baseline encoder-decoder Transformer models for simplification and propose a novel augmentation to these in which we explicitly penalize the decoder for producing “jargon” terms; we find that this yields improvements over baselines in terms of readability.

Citations: 47

TextEssence: A Tool for Interactive Analysis of Semantic Shifts Between Corpora

Pub Date : 2021-03-19 DOI: 10.18653/v1/2021.naacl-demos.13

Denis Newman-Griffis, Venkatesh Sivaraman, Adam Perer, E. Fosler-Lussier, H. Hochheiser

Embeddings of words and concepts capture syntactic and semantic regularities of language; however, they have seen limited use as tools to study characteristics of different corpora and how they relate to one another. We introduce TextEssence, an interactive system designed to enable comparative analysis of corpora using embeddings. TextEssence includes visual, neighbor-based, and similarity-based modes of embedding analysis in a lightweight, web-based interface. We further propose a new measure of embedding confidence based on nearest neighborhood overlap, to assist in identifying high-quality embeddings for corpus analysis. A case study on COVID-19 scientific literature illustrates the utility of the system. TextEssence can be found at https://textessence.github.io.
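The abstract above mentions a confidence measure based on nearest-neighborhood overlap. A minimal sketch of one way such a measure could work is given here, assuming several embedding replicates trained on the same corpus and sharing a vocabulary. The function names, the cosine-based neighbor ranking, the choice of k, and the toy vectors are all illustrative assumptions, not the definition used by TextEssence.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def nearest_neighbors(embeddings, word, k):
    """Return the set of the k words most similar to `word` in one replicate."""
    target = embeddings[word]
    others = [w for w in embeddings if w != word]
    others.sort(key=lambda w: cosine(target, embeddings[w]), reverse=True)
    return set(others[:k])


def neighbor_overlap_confidence(replicates, word, k=2):
    """Mean pairwise fraction of shared top-k neighbors across replicates.

    Hypothetical formulation: 1.0 means every replicate agrees on the
    neighborhood of `word`; 0.0 means no neighbors are shared.
    """
    neighbor_sets = [nearest_neighbors(r, word, k) for r in replicates]
    pairs = list(combinations(neighbor_sets, 2))
    return sum(len(a & b) / k for a, b in pairs) / len(pairs)


# Toy replicates (made-up 2-d vectors, for illustration only).
rep_a = {"virus": [1.0, 0.0], "infection": [0.9, 0.1],
         "pathogen": [0.8, 0.3], "market": [0.0, 1.0]}
rep_b = {"virus": [1.0, 0.1], "infection": [0.95, 0.2],
         "pathogen": [0.1, 1.0], "market": [0.7, 0.3]}

print(neighbor_overlap_confidence([rep_a, rep_b], "virus", k=2))
```

In this toy data the two replicates agree on only one of the two nearest neighbors of "virus", so the score is lower than for two identical replicates. A low score of this kind would flag an embedding as unstable before it is used for corpus comparison.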

Citations: 1

Trialstreamer: Mapping and Browsing Medical Evidence in Real-Time.

Pub Date : 2020-07-01 DOI: 10.18653/v1/2020.acl-demos.9

Benjamin E Nye, Ani Nenkova, Iain J Marshall, Byron C Wallace

We introduce Trialstreamer, a living database of clinical trial reports. Here we mainly describe the evidence extraction component; this extracts from biomedical abstracts key pieces of information that clinicians need when appraising the literature, and also the relations between these. Specifically, the system extracts descriptions of trial participants, the treatments compared in each arm (the interventions), and which outcomes were measured. The system then attempts to infer which interventions were reported to work best by determining their relationship with identified trial outcome measures. In addition to summarizing individual trials, these extracted data elements allow automatic synthesis of results across many trials on the same topic. We apply the system at scale to all reports of randomized controlled trials indexed in MEDLINE, powering the automatic generation of evidence maps, which provide a global view of the efficacy of different interventions combining data from all relevant clinical trials on a topic. We make all code and models freely available alongside a demonstration of the web interface.

Citations: 20

Classification of Semantic Paraphasias: Optimization of a Word Embedding Model.

Pub Date : 2019-06-01 DOI: 10.18653/v1/w19-2007

Katy McKinney-Bock, Steven Bedrick

In clinical assessment of people with aphasia, impairment in the ability to recall and produce words for objects (anomia) is assessed using a confrontation naming task, where a target stimulus is viewed and a corresponding label is spoken by the participant. Vector space word embedding models have had initial results in assessing semantic similarity of target-production pairs in order to automate scoring of this task; however, the resulting models are also highly dependent upon training parameters. To select an optimal family of models, we fit a beta regression model to the distribution of performance metrics on a set of 2,880 grid search models and evaluate the resultant first- and second-order effects to explore how parameterization affects model performance. Comparing to SimLex-999, we show that clinical data can be used in an evaluation task with optimal parameter settings comparable to those of standard NLP evaluation datasets.

Citations: 3

Extracting Adverse Drug Event Information with Minimal Engineering.

Pub Date : 2019-06-01 DOI: 10.18653/v1/w19-1903

Timothy Miller, Alon Geva, Dmitriy Dligach

In this paper we describe an evaluation of the potential of classical information extraction methods to extract drug-related attributes, including adverse drug events, and compare to more recently developed neural methods. We use the 2018 N2C2 shared task data as our gold standard data set for training. We train support vector machine classifiers to detect drug and drug attribute spans, and pair these detected entities as training instances for an SVM relation classifier, with both systems using standard features. We compare to baseline neural methods that use standard contextualized embedding representations for entity and relation extraction. The SVM-based system and a neural system obtain comparable results, with the SVM system doing better on concepts and the neural system performing better on relation extraction tasks. The neural system obtains surprisingly strong results compared to the system based on years of research in developing features for information extraction.

Citations: 10

Simplified Neural Unsupervised Domain Adaptation

Pub Date : 2019-05-22 DOI: 10.18653/v1/N19-1039

Timothy Miller

Unsupervised domain adaptation (UDA) is the task of training a statistical model on labeled data from a source domain to achieve better performance on data from a target domain, with access to only unlabeled data in the target domain. Existing state-of-the-art UDA approaches use neural networks to learn representations that are trained to predict the values of a subset of important features called “pivot features” on combined data from the source and target domains. In this work, we show that it is possible to improve on existing neural domain adaptation algorithms by 1) jointly training the representation learner with the task learner; and 2) removing the need for heuristically-selected “pivot features.” Our results show competitive performance with a simpler model.

Citations: 24

Oral-Motor and Lexical Diversity During Naturalistic Conversations in Adults with Autism Spectrum Disorder.

Pub Date : 2018-06-01 DOI: 10.18653/v1/w18-0616

Julia Parish-Morris, Evangelos Sariyanidi, Casey Zampella, G Keith Bartley, Emily Ferguson, Ashley A Pallathra, Leila Bateman, Samantha Plate, Meredith Cola, Juhi Pandey, Edward S Brodkin, Robert T Schultz, Birkan Tunç

Autism spectrum disorder (ASD) is a neurodevelopmental condition characterized by impaired social communication and the presence of restricted, repetitive patterns of behaviors and interests. Prior research suggests that restricted patterns of behavior in ASD may be cross-domain phenomena that are evident in a variety of modalities. Computational studies of language in ASD provide support for the existence of an underlying dimension of restriction that emerges during a conversation. Similar evidence exists for restricted patterns of facial movement. Using tools from computational linguistics, computer vision, and information theory, this study tests whether cognitive-motor restriction can be detected across multiple behavioral domains in adults with ASD during a naturalistic conversation. Our methods identify restricted behavioral patterns, as measured by entropy in word use and mouth movement. Results suggest that adults with ASD produce significantly less diverse mouth movements and words than neurotypical adults, with an increased reliance on repeated patterns in both domains. The diversity values of the two domains are not significantly correlated, suggesting that they provide complementary information.
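The abstract above measures diversity by entropy in word use. As a rough illustration, Shannon entropy over a speaker's word frequencies can be computed as below. This is a minimal sketch: the tokenization, any length normalization, and the exact estimator used in the study are not stated in the abstract, so the function name and the toy token lists are assumptions made for illustration.

```python
import math
from collections import Counter


def word_entropy(tokens):
    """Shannon entropy (in bits) of the empirical word distribution.

    Higher values indicate more diverse word use; a speaker who repeats
    one word throughout has entropy 0.
    """
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Toy examples (made-up token lists, for illustration only).
varied = ["trains", "are", "fast", "today"]
repetitive = ["trains", "trains", "trains", "trains"]

print(word_entropy(varied), word_entropy(repetitive))
```

Because raw entropy grows with the amount of speech, a real comparison between groups would likely need to control for conversation length, for example by computing entropy over fixed-size windows of tokens.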

Citations: 0

Conversational Memory Network for Emotion Recognition in Dyadic Dialogue Videos.

Pub Date : 2018-06-01 DOI: 10.18653/v1/n18-1193

Devamanyu Hazarika, Soujanya Poria, Amir Zadeh, Erik Cambria, Louis-Philippe Morency, Roger Zimmermann

Emotion recognition in conversations is crucial for the development of empathetic machines. Present methods mostly ignore the role of inter-speaker dependency relations while classifying emotions in conversations. In this paper, we address recognizing utterance-level emotions in dyadic conversational videos. We propose a deep neural framework, termed conversational memory network, which leverages contextual information from the conversation history. The framework takes a multimodal approach comprising audio, visual and textual features with gated recurrent units to model past utterances of each speaker into memories. Such memories are then merged using attention-based hops to capture inter-speaker dependencies. Experiments show an accuracy improvement of 3-4% over the state of the art.
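The abstract above describes merging speaker memories with attention-based hops. The sketch below shows a generic single memory-network hop: a query vector attends over stored memory vectors and is updated with the weighted read. It is an assumption-laden illustration of the general mechanism, not the architecture from the paper; the function names, the dot-product scoring, and the additive query update are hypothetical choices, and the learned GRU encoders and multimodal features are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_hop(query, memories):
    """One attention read over memories, followed by a query update.

    query:    vector for the current utterance
    memories: one vector per past utterance (same dimensionality)
    Returns the updated query and the attention weights.
    """
    # Score each memory by its dot product with the query.
    scores = [sum(q * m for q, m in zip(query, mem)) for mem in memories]
    weights = softmax(scores)
    # Weighted sum of memories is the content read from the history.
    read = [sum(w * mem[i] for w, mem in zip(weights, memories))
            for i in range(len(query))]
    # Additive update so that a further hop can refine the query.
    updated = [q + r for q, r in zip(query, read)]
    return updated, weights


# Toy example (made-up 2-d vectors, for illustration only).
history = [[1.0, 0.0], [1.0, 0.0]]
new_query, attn = attention_hop([0.0, 0.0], history)
print(new_query, attn)
```

Running several hops in sequence, each starting from the previous updated query, is the usual way such networks aggregate context from a longer history.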

Citations: 282

Syntactic Patterns Improve Information Extraction for Medical Search.

Pub Date : 2018-06-01

Roma Patel, Yinfei Yang, Iain Marshall, Ani Nenkova, Byron C Wallace

Medical professionals search the published literature by specifying the type of patients, the medical intervention(s) and the outcome measure(s) of interest. In this paper we demonstrate how features encoding syntactic patterns improve the performance of state-of-the-art sequence tagging models (both linear and neural) for information extraction of these medically relevant categories. We present an analysis of the type of patterns exploited, and the semantic space induced for these, i.e., the distributed representations learned for identified multi-token patterns. We show that these learned representations differ substantially from those of the constituent unigrams, suggesting that the patterns capture contextual information that is otherwise lost.

Citations: 0
