
Proceedings of the conference. Association for Computational Linguistics. North American Chapter. Meeting: latest articles

Classification of Semantic Paraphasias: Optimization of a Word Embedding Model.
Katy McKinney-Bock, Steven Bedrick

In clinical assessment of people with aphasia, impairment in the ability to recall and produce words for objects (anomia) is assessed using a confrontation naming task, where a target stimulus is viewed and a corresponding label is spoken by the participant. Vector space word embedding models have produced initial results in assessing semantic similarity of target-production pairs in order to automate scoring of this task; however, the resulting models are also highly dependent upon training parameters. To select an optimal family of models, we fit a beta regression model to the distribution of performance metrics on a set of 2,880 grid search models and evaluate the resultant first- and second-order effects to explore how parameterization affects model performance. Comparing against SimLex-999, we show that clinical data can be used in an evaluation task, with optimal parameter settings comparable to those of standard NLP evaluation datasets.

DOI: 10.18653/v1/w19-2007 (https://doi.org/10.18653/v1/w19-2007). Volume 2019 RepEval, pages 52-62. Published 2019-06-01. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10328545/pdf/nihms-1908531.pdf
Citations: 3
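
The abstract above mentions scoring target-production pairs by their semantic similarity in a vector space. A minimal sketch of that idea follows, assuming cosine similarity as the measure (the abstract does not name the exact measure). The three-dimensional vectors and the names `toy_embeddings` and `score_pair` are hypothetical and invented for illustration; they are not taken from the paper.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings; a real model would have hundreds of
# dimensions learned from a corpus.
toy_embeddings = {
    "dog": [0.9, 0.1, 0.0],
    "cat": [0.8, 0.2, 0.1],
    "hammer": [0.0, 0.1, 0.9],
}


def score_pair(target, production, embeddings=toy_embeddings):
    """Similarity between the target word and the word the participant said."""
    return cosine_similarity(embeddings[target], embeddings[production])
```

With these toy vectors, a semantically related production ("cat" for the target "dog") scores higher than an unrelated one ("hammer"), which is the property an automated scorer of this kind would rely on.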
Extracting Adverse Drug Event Information with Minimal Engineering.
Timothy Miller, Alon Geva, Dmitriy Dligach

In this paper we describe an evaluation of the potential of classical information extraction methods to extract drug-related attributes, including adverse drug events, and compare to more recently developed neural methods. We use the 2018 N2C2 shared task data as our gold standard data set for training. We train support vector machine classifiers to detect drug and drug attribute spans, and pair these detected entities as training instances for an SVM relation classifier, with both systems using standard features. We compare to baseline neural methods that use standard contextualized embedding representations for entity and relation extraction. The SVM-based system and a neural system obtain comparable results, with the SVM system doing better on concepts and the neural system performing better on relation extraction tasks. The neural system obtains surprisingly strong results compared to the system based on years of research in developing features for information extraction.

DOI: 10.18653/v1/w19-1903 (https://doi.org/10.18653/v1/w19-1903). Volume 2019, pages 22-27. Published 2019-06-01. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8140592/pdf/nihms-1035507.pdf
Citations: 10
Simplified Neural Unsupervised Domain Adaptation
Timothy Miller
Unsupervised domain adaptation (UDA) is the task of training a statistical model on labeled data from a source domain to achieve better performance on data from a target domain, with access to only unlabeled data in the target domain. Existing state-of-the-art UDA approaches use neural networks to learn representations that are trained to predict the values of a subset of important features called “pivot features” on combined data from the source and target domains. In this work, we show that it is possible to improve on existing neural domain adaptation algorithms by 1) jointly training the representation learner with the task learner; and 2) removing the need for heuristically-selected “pivot features.” Our results show competitive performance with a simpler model.
DOI: 10.18653/v1/N19-1039 (https://doi.org/10.18653/v1/N19-1039). Volume as listed: 1 1, pages 414-419. Published 2019-05-22.
Citations: 24
Oral-Motor and Lexical Diversity During Naturalistic Conversations in Adults with Autism Spectrum Disorder.
Julia Parish-Morris, Evangelos Sariyanidi, Casey Zampella, G Keith Bartley, Emily Ferguson, Ashley A Pallathra, Leila Bateman, Samantha Plate, Meredith Cola, Juhi Pandey, Edward S Brodkin, Robert T Schultz, Birkan Tunç

Autism spectrum disorder (ASD) is a neurodevelopmental condition characterized by impaired social communication and the presence of restricted, repetitive patterns of behaviors and interests. Prior research suggests that restricted patterns of behavior in ASD may be cross-domain phenomena that are evident in a variety of modalities. Computational studies of language in ASD provide support for the existence of an underlying dimension of restriction that emerges during a conversation. Similar evidence exists for restricted patterns of facial movement. Using tools from computational linguistics, computer vision, and information theory, this study tests whether cognitive-motor restriction can be detected across multiple behavioral domains in adults with ASD during a naturalistic conversation. Our methods identify restricted behavioral patterns, as measured by entropy in word use and mouth movement. Results suggest that adults with ASD produce significantly less diverse mouth movements and words than neurotypical adults, with an increased reliance on repeated patterns in both domains. The diversity values of the two domains are not significantly correlated, suggesting that they provide complementary information.

DOI: 10.18653/v1/w18-0616. Volume 2018, pages 147-157. Published 2018-06-01. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC7558464/pdf/nihms-985188.pdf
Citations: 0
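
The abstract above measures restricted behavior through entropy in word use. As a rough illustration only, the sketch below computes plain Shannon entropy over a speaker's word frequencies; the paper's exact estimator, tokenization, and normalization are not given here, so the function name `word_entropy` and the whitespace tokenization are assumptions.

```python
import math
from collections import Counter


def word_entropy(tokens):
    """Shannon entropy (in bits) of the empirical word distribution."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("need at least one token")
    # H = -sum(p * log2(p)) over the observed word types.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Assumed preprocessing: lowercase and split on whitespace.
repetitive = "yes yes yes yes yes yes".split()
varied = "we talked about trains music hiking".split()
```

Lower entropy means the speaker relies on fewer distinct words: here `word_entropy(repetitive)` is lower than `word_entropy(varied)`, matching the direction of the effect the abstract reports for repeated patterns.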
Conversational Memory Network for Emotion Recognition in Dyadic Dialogue Videos.
Devamanyu Hazarika, Soujanya Poria, Amir Zadeh, Erik Cambria, Louis-Philippe Morency, Roger Zimmermann

Emotion recognition in conversations is crucial for the development of empathetic machines. Present methods mostly ignore the role of inter-speaker dependency relations while classifying emotions in conversations. In this paper, we address recognizing utterance-level emotions in dyadic conversational videos. We propose a deep neural framework, termed conversational memory network, which leverages contextual information from the conversation history. The framework takes a multimodal approach comprising audio, visual and textual features with gated recurrent units to model past utterances of each speaker into memories. Such memories are then merged using attention-based hops to capture inter-speaker dependencies. Experiments show an accuracy improvement of 3-4% over the state of the art.

DOI: 10.18653/v1/n18-1193 (https://doi.org/10.18653/v1/n18-1193). Volume 2018, pages 2122-2132. Published 2018-06-01.
Citations: 282
Syntactic Patterns Improve Information Extraction for Medical Search.
Roma Patel, Yinfei Yang, Iain Marshall, Ani Nenkova, Byron C Wallace

Medical professionals search the published literature by specifying the type of patients, the medical intervention(s) and the outcome measure(s) of interest. In this paper we demonstrate how features encoding syntactic patterns improve the performance of state-of-the-art sequence tagging models (both linear and neural) for information extraction of these medically relevant categories. We present an analysis of the type of patterns exploited, and the semantic space induced for these, i.e., the distributed representations learned for identified multi-token patterns. We show that these learned representations differ substantially from those of the constituent unigrams, suggesting that the patterns capture contextual information that is otherwise lost.

Volume 2018 Short Paper, pages 371-377. Published 2018-06-01. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6174535/pdf/nihms-988061.pdf
Citations: 0
A Multi-Context Character Prediction Model for a Brain-Computer Interface.
Shiran Dudy, Steven Bedrick, Shaobin Xu, David A Smith

Brain-computer interfaces and other augmentative and alternative communication devices introduce language-modeling challenges distinct from other character-entry methods. In particular, the acquired EEG (electroencephalogram) signal is noisier, which, in turn, makes the user intent harder to decipher. In order to adapt to this condition, we propose to maintain ambiguous history for every time step, and to employ, apart from the character language model, word information to produce a more robust prediction system. We present preliminary results that compare this proposed Online-Context Language Model (OCLM) to current algorithms that are used in this type of setting. Evaluations on both perplexity and predictive accuracy demonstrate promising results when dealing with ambiguous histories in order to provide the front end with a distribution over the next character the user might type.

DOI: 10.18653/v1/w18-1210 (https://doi.org/10.18653/v1/w18-1210). Volume 2018, pages 72-77. Published 2018-06-01. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8087439/pdf/nihms-1001613.pdf
Citations: 3
EMR Coding with Semi-Parametric Multi-Head Matching Networks.
Anthony Rios, Ramakanth Kavuluru

Coding EMRs with diagnosis and procedure codes is an indispensable task for billing, secondary data analyses, and monitoring health trends. Both speed and accuracy of coding are critical. While coding errors could lead to more patient-side financial burden and misinterpretation of a patient's well-being, timely coding is also needed to avoid backlogs and additional costs for the healthcare facility. In this paper, we present a new neural network architecture that combines ideas from few-shot learning matching networks, multi-label loss functions, and convolutional neural networks for text classification to significantly outperform other state-of-the-art models. Our evaluations are conducted using a well-known de-identified EMR dataset (MIMIC) with a variety of multi-label performance measures.

DOI: 10.18653/v1/N18-1189 (https://doi.org/10.18653/v1/N18-1189). Volume 2018, pages 2081-2091. Published 2018-06-01. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6105925/pdf/nihms-985153.pdf
Citations: 23
Syntactic Patterns Improve Information Extraction for Medical Search
Roma Patel, Yinfei Yang, I. Marshall, A. Nenkova, Byron C. Wallace
Medical professionals search the published literature by specifying the type of patients, the medical intervention(s) and the outcome measure(s) of interest. In this paper we demonstrate how features encoding syntactic patterns improve the performance of state-of-the-art sequence tagging models (both neural and linear) for information extraction of these medically relevant categories. We present an analysis of the type of patterns exploited and of the semantic space induced for these, i.e., the distributed representations learned for identified multi-token patterns. We show that these learned representations differ substantially from those of the constituent unigrams, suggesting that the patterns capture contextual information that is otherwise lost.
DOI: 10.18653/v1/N18-2060 (https://doi.org/10.18653/v1/N18-2060). Volume as listed: 18 1, pages 371-377. Published 2018-04-01.
Citations: 12
Bidirectional RNN for Medical Event Detection in Electronic Health Records
Abhyuday N. Jagannatha, Hong Yu
Sequence labeling for extraction of medical events and their attributes from unstructured text in Electronic Health Record (EHR) notes is a key step towards semantic understanding of EHRs. It has important applications in health informatics including pharmacovigilance and drug surveillance. The state-of-the-art supervised machine learning models in this domain are based on Conditional Random Fields (CRFs) with features calculated from fixed context windows. In this application, we explore recurrent neural network frameworks and show that they significantly outperform the CRF models.
DOI: 10.18653/v1/N16-1056 (https://doi.org/10.18653/v1/N16-1056). Volume as listed: 40 1, pages 473-482. Published 2016-06-01.
Citations: 271