Multi-Model Fusion-Based Hierarchical Extraction for Chinese Epidemic Event.

IF 5.1 | Zone 2 (Computer Science) | Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS | Data Science and Engineering | Pub Date: 2023-01-01 | DOI: 10.1007/s41019-022-00203-6
Zenghua Liao, Zongqiang Yang, Peixin Huang, Ning Pang, Xiang Zhao
Data Science and Engineering, vol. 8, no. 1, pp. 73-83. Journal Article. Full-text PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9807097/pdf/
Citations: 1

Abstract

In recent years, coronavirus disease 2019 (COVID-19) has become a global epidemic, and efforts have been devoted to tracking and controlling its spread. Extracting structured knowledge from the relevant epidemic case reports can inform the surveillance system, which is important for controlling the spread of outbreaks. In this paper, we therefore focus on the task of Chinese epidemic event extraction (EE), defined as the detection of epidemic-related events and their corresponding arguments in the texts of epidemic case reports. To facilitate research on this task, we first define the epidemic-related event types and argument roles. We then manually annotate a Chinese COVID-19 epidemic dataset, named COVID-19 Case Report (CCR). We also propose a novel hierarchical EE architecture, named multi-model fusion-based hierarchical event extraction (MFHEE). In MFHEE, we introduce a multi-model fusion strategy to tackle the recognition bias of previous EE models. Experimental results on the CCR dataset show that our method extracts epidemic events effectively and outperforms the other baselines on this dataset. Comparative experimental results on other generic datasets show that our method has good scalability and portability. The ablation studies also show that the proposed hierarchical structure and multi-model fusion strategy contribute to the precision of our model.

Supplementary information: The online version contains supplementary material available at 10.1007/s41019-022-00203-6.
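
To make the task definition above concrete, the following is a minimal sketch of what an extracted event record and a simple multi-model fusion step might look like. The abstract does not describe how MFHEE actually fuses its models, so per-token majority voting is used here only as a stand-in. The event type, role names, labels, and the names `Event` and `fuse_by_majority` are all hypothetical and are not taken from the paper or the CCR dataset.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Event:
    """One extracted event: a type, its trigger text, and role -> argument text."""
    event_type: str
    trigger: str
    arguments: dict[str, str] = field(default_factory=dict)


def fuse_by_majority(predictions: list[list[str]]) -> list[str]:
    """Fuse per-token labels from several models by majority vote.

    Assumption: this is an illustrative fusion rule, not the paper's strategy.
    Ties are broken in favour of the earliest model in `predictions`.
    """
    if not predictions:
        return []
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all models must label the same number of tokens")
    fused = []
    for labels in zip(*predictions):
        counts = Counter(labels)
        best = max(counts.values())
        # Pick the first label (in model order) that reaches the top count.
        fused.append(next(label for label in labels if counts[label] == best))
    return fused


# Hypothetical labels from three models over the same three tokens.
model_a = ["B-Time", "O", "B-Place"]
model_b = ["B-Time", "O", "O"]
model_c = ["O", "O", "B-Place"]
fused = fuse_by_majority([model_a, model_b, model_c])
print(fused)  # ['B-Time', 'O', 'B-Place']

# Hypothetical structured record built from the fused labels.
event = Event(event_type="Diagnosis", trigger="confirmed",
              arguments={"Time": "January 20", "Place": "Wuhan"})
```

Voting over aligned token labels is only one way to combine models; it is shown here because it makes visible how a label missed by one model can be recovered when the others agree.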

Source journal
Data Science and Engineering (Engineering: Computational Mechanics)
CiteScore: 10.40
Self-citation rate: 2.40%
Articles published: 26
Review time: 12 weeks
About the journal: The journal Data Science and Engineering (DSE) responds to the remarkable shift in the focus of information technology development from CPU-intensive computation to data-intensive computation, where the effective application of data, especially big data, becomes vital. The emerging discipline of data science and engineering, an interdisciplinary field integrating theories and methods from computer science, statistics, information science, and other fields, focuses on the foundations and engineering of efficient and effective techniques and systems for data collection and management, for data integration and correlation, for information and knowledge extraction from massive data sets, and for data use in different application domains. Focusing on the theoretical background and advanced engineering approaches, DSE aims to offer a prime forum for researchers, professionals, and industrial practitioners to share their knowledge in this rapidly growing area. It provides in-depth coverage of the latest advances in the closely related fields of data science and data engineering. More specifically, DSE covers four areas: (i) the data itself, i.e., the nature and quality of the data, especially big data; (ii) the principles of information extraction from data, especially big data; (iii) the theory behind data-intensive computing; and (iv) the techniques and systems used to analyze and manage big data. DSE welcomes papers that explore the above subjects.
Specific topics include, but are not limited to: (a) the nature and quality of data; (b) the computational complexity of data-intensive computing; (c) new methods for the design and analysis of algorithms for solving problems with big data input; (d) collection and integration of data collected from the internet and from sensing devices or sensor networks; (e) representation, modeling, and visualization of big data; (f) storage, transmission, and management of big data; (g) methods and algorithms of data-intensive computing, such as mining big data, online analytical processing of big data, big data-based machine learning, big data-based decision-making, statistical computation of big data, graph-theoretic computation of big data, linear algebraic computation of big data, and big data-based optimization; (h) hardware systems and software systems for data-intensive computing; (i) data security, privacy, and trust; and (j) novel applications of big data.
Latest articles from the journal
Uncovering Flat and Hierarchical Topics by Community Discovery on Word Co-occurrence Network
AIoT-CitySense: AI and IoT-Driven City-Scale Sensing for Roadside Infrastructure Maintenance
Anomaly Detection with Sub-Extreme Values: Health Provider Billing
Graph Neural Network-Based Short-Term Load Forecasting with Temporal Convolution
Joint Representation Learning with Generative Adversarial Imputation Network for Improved Classification of Longitudinal Data