Zenghua Liao, Zongqiang Yang, Peixin Huang, Ning Pang, Xiang Zhao
{"title":"基于多模型融合的中国疫情事件层次提取。","authors":"Zenghua Liao, Zongqiang Yang, Peixin Huang, Ning Pang, Xiang Zhao","doi":"10.1007/s41019-022-00203-6","DOIUrl":null,"url":null,"abstract":"<p><p>In recent years, Coronavirus disease 2019 (COVID-19) has become a global epidemic, and some efforts have been devoted to tracking and controlling its spread. Extracting structured knowledge from involved epidemic case reports can inform the surveillance system, which is important for controlling the spread of outbreaks. Therefore, in this paper, we focus on the task of Chinese epidemic event extraction (EE), which is defined as the detection of epidemic-related events and corresponding arguments in the texts of epidemic case reports. To facilitate the research of this task, we first define the epidemic-related event types and argument roles. Then we manually annotate a Chinese COVID-19 epidemic dataset, named COVID-19 Case Report (CCR). We also propose a novel hierarchical EE architecture, named <i>m</i>ulti-model <i>f</i>usion-based <i>h</i>ierarchical <i>e</i>vent <i>e</i>xtraction (MFHEE). In MFHEE, we introduce a multi-model fusion strategy to tackle the issue of recognition bias of previous EE models. The experimental results on CCR dataset show that our method can effectively extract epidemic events and outperforms other baselines on this dataset. The comparative experiments results on other generic datasets show that our method has good scalability and portability. The ablation studies also show that the proposed hierarchical structure and multi-model fusion strategy contribute to the precision of our model.</p><p><strong>Supplementary information: </strong>The online version contains supplementary material available at 10.1007/s41019-022-00203-6.</p>","PeriodicalId":52220,"journal":{"name":"Data Science and Engineering","volume":"8 1","pages":"73-83"},"PeriodicalIF":5.1000,"publicationDate":"2023-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9807097/pdf/","citationCount":"1","resultStr":"{\"title\":\"Multi-Model Fusion-Based Hierarchical Extraction for Chinese Epidemic Event.\",\"authors\":\"Zenghua Liao, Zongqiang Yang, Peixin Huang, Ning Pang, Xiang Zhao\",\"doi\":\"10.1007/s41019-022-00203-6\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>In recent years, Coronavirus disease 2019 (COVID-19) has become a global epidemic, and some efforts have been devoted to tracking and controlling its spread. Extracting structured knowledge from involved epidemic case reports can inform the surveillance system, which is important for controlling the spread of outbreaks. Therefore, in this paper, we focus on the task of Chinese epidemic event extraction (EE), which is defined as the detection of epidemic-related events and corresponding arguments in the texts of epidemic case reports. To facilitate the research of this task, we first define the epidemic-related event types and argument roles. Then we manually annotate a Chinese COVID-19 epidemic dataset, named COVID-19 Case Report (CCR). We also propose a novel hierarchical EE architecture, named <i>m</i>ulti-model <i>f</i>usion-based <i>h</i>ierarchical <i>e</i>vent <i>e</i>xtraction (MFHEE). In MFHEE, we introduce a multi-model fusion strategy to tackle the issue of recognition bias of previous EE models. The experimental results on CCR dataset show that our method can effectively extract epidemic events and outperforms other baselines on this dataset. The comparative experiments results on other generic datasets show that our method has good scalability and portability. The ablation studies also show that the proposed hierarchical structure and multi-model fusion strategy contribute to the precision of our model.</p><p><strong>Supplementary information: </strong>The online version contains supplementary material available at 10.1007/s41019-022-00203-6.</p>\",\"PeriodicalId\":52220,\"journal\":{\"name\":\"Data Science and Engineering\",\"volume\":\"8 1\",\"pages\":\"73-83\"},\"PeriodicalIF\":5.1000,\"publicationDate\":\"2023-01-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9807097/pdf/\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Data Science and Engineering\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1007/s41019-022-00203-6\",\"RegionNum\":2,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q1\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Data Science and Engineering","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1007/s41019-022-00203-6","RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}
Multi-Model Fusion-Based Hierarchical Extraction for Chinese Epidemic Event.
In recent years, Coronavirus disease 2019 (COVID-19) has become a global epidemic, and some efforts have been devoted to tracking and controlling its spread. Extracting structured knowledge from involved epidemic case reports can inform the surveillance system, which is important for controlling the spread of outbreaks. Therefore, in this paper, we focus on the task of Chinese epidemic event extraction (EE), which is defined as the detection of epidemic-related events and corresponding arguments in the texts of epidemic case reports. To facilitate the research of this task, we first define the epidemic-related event types and argument roles. Then we manually annotate a Chinese COVID-19 epidemic dataset, named COVID-19 Case Report (CCR). We also propose a novel hierarchical EE architecture, named multi-model fusion-based hierarchical event extraction (MFHEE). In MFHEE, we introduce a multi-model fusion strategy to tackle the issue of recognition bias of previous EE models. The experimental results on CCR dataset show that our method can effectively extract epidemic events and outperforms other baselines on this dataset. The comparative experiments results on other generic datasets show that our method has good scalability and portability. The ablation studies also show that the proposed hierarchical structure and multi-model fusion strategy contribute to the precision of our model.
Supplementary information: The online version contains supplementary material available at 10.1007/s41019-022-00203-6.
期刊介绍:
The journal of Data Science and Engineering (DSE) responds to the remarkable change in the focus of information technology development from CPU-intensive computation to data-intensive computation, where the effective application of data, especially big data, becomes vital. The emerging discipline data science and engineering, an interdisciplinary field integrating theories and methods from computer science, statistics, information science, and other fields, focuses on the foundations and engineering of efficient and effective techniques and systems for data collection and management, for data integration and correlation, for information and knowledge extraction from massive data sets, and for data use in different application domains. Focusing on the theoretical background and advanced engineering approaches, DSE aims to offer a prime forum for researchers, professionals, and industrial practitioners to share their knowledge in this rapidly growing area. It provides in-depth coverage of the latest advances in the closely related fields of data science and data engineering. More specifically, DSE covers four areas: (i) the data itself, i.e., the nature and quality of the data, especially big data; (ii) the principles of information extraction from data, especially big data; (iii) the theory behind data-intensive computing; and (iv) the techniques and systems used to analyze and manage big data. DSE welcomes papers that explore the above subjects. Specific topics include, but are not limited to: (a) the nature and quality of data, (b) the computational complexity of data-intensive computing,(c) new methods for the design and analysis of the algorithms for solving problems with big data input,(d) collection and integration of data collected from internet and sensing devises or sensor networks, (e) representation, modeling, and visualization of big data,(f) storage, transmission, and management of big data,(g) methods and algorithms of data intensive computing, such asmining big data,online analysis processing of big data,big data-based machine learning, big data based decision-making, statistical computation of big data, graph-theoretic computation of big data, linear algebraic computation of big data, and big data-based optimization. (h) hardware systems and software systems for data-intensive computing, (i) data security, privacy, and trust, and(j) novel applications of big data.