Two-stage Federated Phenotyping and Patient Representation Learning.

Proceedings of the conference. Association for Computational Linguistics. Meeting. Pub Date: 2019-08-01. DOI: 10.18653/v1/W19-5030

Dianbo Liu, Dmitriy Dligach, Timothy Miller

Pages: 283-291. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8072229/pdf/nihms-1063931.pdf

Citations: 58

Abstract

A large percentage of medical information in electronic medical record systems is stored as unstructured text. Manual extraction of information from clinical notes is extremely time-consuming. Natural language processing has been widely used in recent years for automatic information extraction from medical texts. However, algorithms trained on data from a single healthcare provider are error-prone and do not generalize well, owing to the heterogeneity and uniqueness of medical documents. We develop a two-stage federated natural language processing method that enables the use of clinical notes from different hospitals or clinics without moving the data, and demonstrate its performance using obesity and comorbidity phenotyping as the medical task. This approach not only improves the quality of a specific clinical task but also facilitates knowledge progression across the whole healthcare system, which is an essential part of a learning health system. To the best of our knowledge, this is the first application of federated machine learning in clinical NLP.


