Extracting PICO Elements From RCT Abstracts Using 1-2gram Analysis And Multitask Classification

Proceedings of the 3rd International Conference on Medical and Health Informatics. Pub Date: 2019-01-24. DOI: 10.1145/3340037.3340043

X. Yuan, Liao Xiaoli, Liu Shilei, Shi Qinwen, Li Ke

Citations: 3

Abstract

The core of evidence-based medicine is to read and analyze the many papers in the medical literature that address a specific clinical problem and to summarize authoritative answers to that problem. To formulate a clear and focused clinical problem, the PICO framework is commonly adopted, in which each clinical problem is considered to consist of four parts: patient/problem (P), intervention (I), comparison (C) and outcome (O). In this study, we first compared several classification models commonly used in traditional machine learning. Next, we developed a multitask classification model based on a soft-margin SVM, with a specialized feature engineering method that combines 1-2gram (unigram and bigram) analysis with TF-IDF weighting. Finally, we trained and tested several generic models on an open-source data set from BioNLP 2018. The results show that the proposed multitask SVM classification model based on 1-2gram TF-IDF features performs best among the tested models.
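To make the feature step concrete, the following is a minimal sketch of how unigram-plus-bigram TF-IDF features could be computed for abstract sentences. It is not the authors' implementation: the whitespace tokenizer, the `tf * log(N / df)` weighting variant, the function names `one_two_grams` and `tfidf_vectors`, and the example sentences are all assumptions made for illustration. The soft-margin SVM and the multitask setup are not shown.

```python
import math
from collections import Counter


def one_two_grams(text):
    """Return the unigrams and bigrams of a lowercased, whitespace-tokenized sentence."""
    tokens = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams


def tfidf_vectors(sentences):
    """Map each sentence to a sparse {term: weight} dict over its 1-2gram terms.

    Weighting (an assumed, common variant): relative term frequency
    multiplied by log(N / document frequency).
    """
    docs = [Counter(one_two_grams(s)) for s in sentences]
    n_docs = len(docs)
    # Iterating a Counter yields each distinct term once, so this counts
    # the number of sentences that contain each term.
    doc_freq = Counter(term for doc in docs for term in doc)
    vectors = []
    for doc in docs:
        total = sum(doc.values())
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in doc.items()
        })
    return vectors


# Hypothetical sentences, loosely in the style of RCT abstracts.
sentences = [
    "patients received aspirin daily",
    "patients received placebo daily",
    "mortality was lower",
]
vectors = tfidf_vectors(sentences)
```

In this sketch, terms that occur in fewer sentences (such as `aspirin`) receive a higher weight than terms shared across sentences (such as `patients`), and bigrams such as `received aspirin` appear as features alongside single words. The resulting sparse vectors are the kind of input a linear classifier could be trained on for each PICO label.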
