Feature Split-based Information Extraction in the Field of Medicine
Authors: Jing Wan, Huanchun Yan, Xuechao Zhang
Journal: Proceedings. IEEE International Conference on Cloud Computing, volume 15 1, pages 029
Publication date: 2018-02-26
DOI: 10.22323/1.300.0029 (https://doi.org/10.22323/1.300.0029)
Abstract
Symptom information extraction has attracted a growing number of studies in recent years.
Most of this work is based on clinical medical records and focuses only on symptom
entities, which are not sufficient to convey the full symptom information. This paper presents a
feature split-based approach to extracting symptom information from Chinese medicine instruction
texts. In this approach, the symptom information is split into two parts: the symptom subject entity
and the symptom manifestation entity. The main idea is to first recognize the symptom subject
and the symptom manifestation automatically, and then add these two recognition
results as features to the symptom information extraction task. A series of experiments
based on Conditional Random Fields (CRF), a model that many experiments in
the field of medicine have shown to be effective, indicates that the proposed feature split-based
approach achieves higher accuracy and recall in symptom information extraction.
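
The two-stage idea described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the feature names, the toy English tokens, and the lexicon lookup standing in for the first-stage recognizers are all assumptions made for the example. The abstract does not specify the actual feature templates or how the first-stage recognizers are built.

```python
from typing import Dict, List, Set, Union

Feature = Union[str, float, bool]


def stage_one_tags(tokens: List[str], lexicon: Set[str]) -> List[str]:
    """Hypothetical stand-in for a first-stage recognizer.

    Tags a token "B" if it appears in the lexicon, otherwise "O".
    A real system would use a trained sequence labeler here.
    """
    return ["B" if tok in lexicon else "O" for tok in tokens]


def token_features(
    tokens: List[str], subj_tags: List[str], manif_tags: List[str], i: int
) -> Dict[str, Feature]:
    """Build the feature dict for token i, including the split features."""
    feats: Dict[str, Feature] = {
        "bias": 1.0,
        "token": tokens[i],
        # Split features: outputs of the two first-stage recognizers.
        "subj_tag": subj_tags[i],
        "manif_tag": manif_tags[i],
    }
    if i > 0:
        feats["prev_token"] = tokens[i - 1]
        feats["prev_subj_tag"] = subj_tags[i - 1]
        feats["prev_manif_tag"] = manif_tags[i - 1]
    else:
        feats["BOS"] = True  # beginning of sentence
    if i < len(tokens) - 1:
        feats["next_token"] = tokens[i + 1]
        feats["next_subj_tag"] = subj_tags[i + 1]
        feats["next_manif_tag"] = manif_tags[i + 1]
    else:
        feats["EOS"] = True  # end of sentence
    return feats


def sentence_features(
    tokens: List[str], subj_lexicon: Set[str], manif_lexicon: Set[str]
) -> List[Dict[str, Feature]]:
    """Run both first-stage recognizers, then build per-token features."""
    subj_tags = stage_one_tags(tokens, subj_lexicon)
    manif_tags = stage_one_tags(tokens, manif_lexicon)
    return [
        token_features(tokens, subj_tags, manif_tags, i)
        for i in range(len(tokens))
    ]


if __name__ == "__main__":
    # Invented toy sentence; the paper works on Chinese instruction texts.
    toy_tokens = ["relieves", "stomach", "pain"]
    for feats in sentence_features(toy_tokens, {"stomach"}, {"pain"}):
        print(feats)
```

One list of feature dicts per sentence is the input shape that common CRF toolkits accept, so the output of `sentence_features` could be passed to a CRF trainer together with gold symptom-information labels. Whether the neighbouring-token split features help would need to be checked experimentally.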