Workshop on Biomedical Natural Language Processing: Latest Papers
Prediction of Protein Sub-cellular Localization using Information from Texts and Sequences.
Pub Date : 2008-06-19 DOI: 10.3115/1572306.1572324
H. Chun, Chisato Yamasaki, Naomi Saichi, Masayuki Tanaka, T. Hishiki, T. Imanishi, T. Gojobori, Jin-Dong Kim, Junichi Tsujii, T. Takagi
This paper presents a novel prediction approach for protein sub-cellular localization. We have incorporated text and sequence-based approaches.
Citations: 0
Knowledge Sources for Word Sense Disambiguation of Biomedical Text
Pub Date : 2008-06-19 DOI: 10.3115/1572306.1572321
Mark Stevenson, Yikun Guo, R. Gaizauskas, David Martínez
Like text in other domains, biomedical documents contain a range of terms with more than one possible meaning. These ambiguities form a significant obstacle to the automatic processing of biomedical texts. Previous approaches to resolving this problem have made use of a variety of knowledge sources including linguistic information (from the context in which the ambiguous term is used) and domain-specific resources (such as UMLS). In this paper we compare a range of knowledge sources which have been previously used and introduce a novel one: MeSH terms. The best performance is obtained using linguistic features in combination with MeSH terms. Results from our system outperform published results for previously reported systems on a standard test set (the NLM-WSD corpus).
Citations: 20
A Pilot Annotation to Investigate Discourse Connectivity in Biomedical Text
Pub Date : 2008-06-19 DOI: 10.3115/1572306.1572325
Hong Yu, Nadya Frid, S. McRoy, R. Prasad, Alan Lee, A. Joshi
The goal of the Penn Discourse Treebank (PDTB) project is to develop a large-scale corpus, annotated with coherence relations marked by discourse connectives. So far, the PDTB annotation has been applied primarily to news articles. In this study, we tested whether the PDTB guidelines can be adapted to a different genre. We annotated discourse connectives and their arguments in one 4,937-token full-text biomedical article. Two linguist annotators showed an agreement of 85% after simple conventions were added. For the remaining 15% of cases, we found that biomedical domain-specific knowledge is needed to capture the linguistic cues that can be used to resolve inter-annotator disagreement. We found that the two annotators were able to reach an agreement after discussion. Thus our experiments suggest that the PDTB annotation can be adapted to new domains by minimally adjusting the guidelines and by adding some further domain-specific linguistic cues.
Citations: 6
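The pilot annotation entry above reports 85% agreement between two annotators. The abstract does not say how agreement was computed, so the following is only a minimal sketch of one common choice, simple exact-match agreement over the same set of items. The function name `percent_agreement` and the sample labels are hypothetical and are not taken from the paper.

```python
def percent_agreement(labels_a, labels_b):
    """Fraction of items on which two annotators assigned the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both annotators must label the same items")
    if not labels_a:
        raise ValueError("no items to compare")
    # Count positions where the two label sequences match exactly.
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


# Hypothetical labels for five connective tokens (illustrative only).
annotator_1 = ["causal", "temporal", "contrast", "causal", "none"]
annotator_2 = ["causal", "temporal", "contrast", "temporal", "none"]
agreement = percent_agreement(annotator_1, annotator_2)
```

Exact-match agreement does not correct for chance; a chance-corrected statistic may be preferable when the label distribution is skewed.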
Conditional Random Fields and Support Vector Machines for Disorder Named Entity Recognition in Clinical Texts
Pub Date : 2008-06-19 DOI: 10.3115/1572306.1572326
Dingcheng Li, G. Savova, K. Schuler
We present a comparative study of two machine learning methods, Conditional Random Fields (CRFs) and Support Vector Machines (SVMs), for clinical named entity recognition. We explore their applicability to the clinical domain. Evaluation against a set of gold standard named entities shows that CRFs outperform SVMs. The best F-score is 0.86 with CRFs and 0.64 with SVMs, compared to a baseline of 0.60.
Citations: 106
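The CRF versus SVM entry above compares systems by F-score against gold standard entities. As a rough illustration of what such a score measures, the sketch below computes the balanced F-score (harmonic mean of precision and recall) over exact-match entity spans. The exact matching criterion used in the paper is not given in the abstract, so exact span matching, the function name `f_score`, and the sample spans are all assumptions.

```python
def f_score(gold, predicted):
    """Balanced F-score over exact-match entity spans."""
    gold, predicted = set(gold), set(predicted)
    true_pos = len(gold & predicted)
    # Guard against empty sets to avoid division by zero.
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical (start, end, type) spans, illustrative only.
gold_spans = {(0, 2, "disorder"), (5, 6, "disorder"), (9, 11, "disorder")}
pred_spans = {(0, 2, "disorder"), (5, 6, "disorder"), (12, 13, "disorder")}
score = f_score(gold_spans, pred_spans)
```

Here two of three predictions match two of three gold spans, so precision and recall are both 2/3 and so is the F-score.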
CBR-Tagger: a case-based reasoning approach to the gene/protein mention problem
Pub Date : 2008-06-19 DOI: 10.3115/1572306.1572333
M. Neves, M. Chagoyen, J. Carazo, A. Pascual-Montano
This work proposes a case-based classifier to tackle the gene/protein mention problem in biomedical literature. The so-called gene mention problem consists of recognizing gene and protein entities in scientific texts. For each word in the text, a classification process decides whether the term is a gene mention or not. It is based on the selection of the best or most similar case in a base of known and unknown cases. The approach was evaluated on several datasets for different organisms, and the results show the suitability of this approach for the gene mention problem.
Citations: 10
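The CBR-Tagger entry above describes classifying each word by retrieving the most similar stored case. The sketch below shows that retrieval idea in its simplest form and is not the authors' system: the word features, the overlap-count similarity, and the names `word_features` and `most_similar_case` are all hypothetical choices made for illustration.

```python
def word_features(word):
    """Hypothetical surface features for one word."""
    return (
        word[:1].isupper(),                 # starts with a capital letter
        any(ch.isdigit() for ch in word),   # contains a digit
        "-" in word,                        # contains a hyphen
        word[-2:].lower(),                  # two-character suffix
    )


def most_similar_case(word, case_base):
    """Return the stored (features, label) case sharing the most features."""
    target = word_features(word)

    def overlap(case):
        return sum(a == b for a, b in zip(target, case[0]))

    # max() keeps the first case among ties.
    return max(case_base, key=overlap)


# Tiny illustrative case base: (features, is_gene_mention).
case_base = [
    (word_features("BRCA1"), True),
    (word_features("p53"), True),
    (word_features("patients"), False),
    (word_features("the"), False),
]
_, is_gene = most_similar_case("BRCA2", case_base)
```

A new word takes the label of its closest case; a practical case base would be built from annotated corpora and would use context features as well as surface form.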
Adaptive Information Extraction for Complex Biomedical Tasks
Pub Date : 2008-06-19 DOI: 10.3115/1572306.1572339
D. Feng, Gully A. Burns, E. Hovy
Biomedical information extraction tasks are often more complex and contain uncertainty at each step during problem solving processes. We present an adaptive information extraction framework and demonstrate how to explore uncertainty using feedback integration.
Citations: 1
Extracting Clinical Relationships from Patient Narratives
Pub Date : 2008-06-19 DOI: 10.3115/1572306.1572309
A. Roberts, R. Gaizauskas, Mark Hepple
The Clinical E-Science Framework (CLEF) project has built a system to extract clinically significant information from the textual component of medical records, for clinical research, evidence-based healthcare and genotype-meets-phenotype informatics. One part of this system is the identification of relationships between clinically important entities in the text. Typical approaches to relationship extraction in this domain have used full parses, domain-specific grammars, and large knowledge bases encoding domain knowledge. In other areas of biomedical NLP, statistical machine learning approaches are now routinely applied to relationship extraction. We report on the novel application of these statistical techniques to clinical relationships. We describe a supervised machine learning system, trained with a corpus of oncology narratives hand-annotated with clinically important relationships. Various shallow features are extracted from these texts, and used to train statistical classifiers. We compare the suitability of these features for clinical relationship extraction, how extraction varies between inter- and intra-sentential relationships, and examine the amount of training data needed to learn various relationships.
Citations: 66
Raising the Compatibility of Heterogeneous Annotations: A Case Study on
Pub Date : 2008-06-01 DOI: 10.3115/1572306.1572338
Yue Wang, Kazuhiro Yoshida, Jin-Dong Kim, Rune Saetre, Junichi Tsujii
While several corpora claim to have annotations for protein references, the heterogeneity between the annotations is recognized as an obstacle to developing expensive resources in a synergistic way. Here we present a series of experimental results which show the differences between the protein mention annotations made to two corpora, GENIA and AImed.
Citations: 0
ADEQA: A Question Answer based approach for joint ADE-Suspect Extraction using Sequence-To-Sequence Transformers
Pub Date : 2023 DOI: 10.18653/v1/2023.bionlp-1.17
Vinayak Arannil, Tomal Deb, Atanu Roy
Early identification of Adverse Drug Events (ADEs) is critical for taking prompt action when introducing new drugs into the market. Information on these ADEs is available through various unstructured data sources such as clinical study reports, patient health records, and social media posts. Extracting ADEs and the related suspect drugs using machine learning is a challenging task due to the complex linguistic relations between drug-ADE pairs in textual data and the unavailability of large labelled corpora. This paper introduces ADEQA, a question-answer (QA) based approach using quasi-supervised labelled data and sequence-to-sequence transformers to extract ADEs, drug suspects and the relationships between them. Unlike traditional QA models, natural language generation (NLG) based models don't require extensive token-level labelling and thereby reduce the adoption barrier significantly. On a public ADE corpus, we were able to achieve state-of-the-art results with an F1 score of 94% on establishing the relationships between ADEs and the respective suspects.
Citations: 0
Automated Preamble Detection in Dictated Medical Reports
Pub Date : 2017 DOI: 10.18653/v1/W17-2336
Wael Salloum, Greg P. Finley, Erik Edwards, Mark Miller, David Suendermann-Oeft
Dictated medical reports very often feature a preamble containing meta-information about the report such as patient and physician names, location and name of the clinic, date of procedure, and so on. In the medical transcription process, the preamble is usually omitted from the final report, as it contains information already available in the electronic medical record. We present a method which is able to automatically identify preambles in medical dictations. The method makes use of state-of-the-art NLP techniques including word embeddings and Bi-LSTMs and achieves preamble detection performance superior to humans.
Citations: 6