COKG-QA:新冠肺炎知识图谱上的多孔问答

IF 1.3 3区计算机科学 Q3 COMPUTER SCIENCE, INFORMATION SYSTEMS Data Intelligence Pub Date : 2022-07-01 DOI:10.1162/dint_a_00154

Huifang Du, Zhongwen Le, Haofen Wang, Yunwen Chen, Jing Yu

{"title":"COKG-QA:新冠肺炎知识图谱上的多孔问答","authors":"Huifang Du, Zhongwen Le, Haofen Wang, Yunwen Chen, Jing Yu","doi":"10.1162/dint_a_00154","DOIUrl":null,"url":null,"abstract":"Abstract COVID-19 evolves rapidly and an enormous number of people worldwide desire instant access to COVID-19 information such as the overview, clinic knowledge, vaccine, prevention measures, and COVID-19 mutation. Question answering (QA) has become the mainstream interaction way for users to consume the ever-growing information by posing natural language questions. Therefore, it is urgent and necessary to develop a QA system to offer consulting services all the time to relieve the stress of health services. In particular, people increasingly pay more attention to complex multi-hop questions rather than simple ones during the lasting pandemic, but the existing COVID-19 QA systems fail to meet their complex information needs. In this paper, we introduce a novel multi-hop QA system called COKG-QA, which reasons over multiple relations over large-scale COVID-19 Knowledge Graphs to return answers given a question. In the field of question answering over knowledge graph, current methods usually represent entities and schemas based on some knowledge embedding models and represent questions using pre-trained models. While it is convenient to represent different knowledge (i.e., entities and questions) based on specified embeddings, an issue raises that these separate representations come from heterogeneous vector spaces. We align question embeddings with knowledge embeddings in a common semantic space by a simple but effective embedding projection mechanism. Furthermore, we propose combining entity embeddings with their corresponding schema embeddings which served as important prior knowledge, to help search for the correct answer entity of specified types. In addition, we derive a large multi-hop Chinese COVID-19 dataset (called COKG-DATA for remembering) for COKG-QA based on the linked knowledge graph OpenKG-COVID19 launched by OpenKG①, including comprehensive and representative information about COVID-19. COKG-QA achieves quite competitive performance in the 1-hop and 2-hop data while obtaining the best result with significant improvements in the 3-hop. And it is more efficient to be used in the QA system for users. Moreover, the user study shows that the system not only provides accurate and interpretable answers but also is easy to use and comes with smart tips and suggestions.","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":"4 1","pages":"471-492"},"PeriodicalIF":1.3000,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"6","resultStr":"{\"title\":\"COKG-QA: Multi-hop Question Answering over COVID-19 Knowledge Graphs\",\"authors\":\"Huifang Du, Zhongwen Le, Haofen Wang, Yunwen Chen, Jing Yu\",\"doi\":\"10.1162/dint_a_00154\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Abstract COVID-19 evolves rapidly and an enormous number of people worldwide desire instant access to COVID-19 information such as the overview, clinic knowledge, vaccine, prevention measures, and COVID-19 mutation. Question answering (QA) has become the mainstream interaction way for users to consume the ever-growing information by posing natural language questions. Therefore, it is urgent and necessary to develop a QA system to offer consulting services all the time to relieve the stress of health services. In particular, people increasingly pay more attention to complex multi-hop questions rather than simple ones during the lasting pandemic, but the existing COVID-19 QA systems fail to meet their complex information needs. In this paper, we introduce a novel multi-hop QA system called COKG-QA, which reasons over multiple relations over large-scale COVID-19 Knowledge Graphs to return answers given a question. In the field of question answering over knowledge graph, current methods usually represent entities and schemas based on some knowledge embedding models and represent questions using pre-trained models. While it is convenient to represent different knowledge (i.e., entities and questions) based on specified embeddings, an issue raises that these separate representations come from heterogeneous vector spaces. We align question embeddings with knowledge embeddings in a common semantic space by a simple but effective embedding projection mechanism. Furthermore, we propose combining entity embeddings with their corresponding schema embeddings which served as important prior knowledge, to help search for the correct answer entity of specified types. In addition, we derive a large multi-hop Chinese COVID-19 dataset (called COKG-DATA for remembering) for COKG-QA based on the linked knowledge graph OpenKG-COVID19 launched by OpenKG①, including comprehensive and representative information about COVID-19. COKG-QA achieves quite competitive performance in the 1-hop and 2-hop data while obtaining the best result with significant improvements in the 3-hop. And it is more efficient to be used in the QA system for users. Moreover, the user study shows that the system not only provides accurate and interpretable answers but also is easy to use and comes with smart tips and suggestions.\",\"PeriodicalId\":34023,\"journal\":{\"name\":\"Data Intelligence\",\"volume\":\"4 1\",\"pages\":\"471-492\"},\"PeriodicalIF\":1.3000,\"publicationDate\":\"2022-07-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"6\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Data Intelligence\",\"FirstCategoryId\":\"94\",\"ListUrlMain\":\"https://doi.org/10.1162/dint_a_00154\",\"RegionNum\":3,\"RegionCategory\":\"计算机科学\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"Q3\",\"JCRName\":\"COMPUTER SCIENCE, INFORMATION SYSTEMS\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Data Intelligence","FirstCategoryId":"94","ListUrlMain":"https://doi.org/10.1162/dint_a_00154","RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"COMPUTER SCIENCE, INFORMATION SYSTEMS","Score":null,"Total":0}

引用次数: 6

摘要

摘要新冠肺炎迅速发展，世界各地的许多人都希望立即获得新冠肺炎信息，如概述、临床知识、疫苗、预防措施和新冠肺炎变异。问答（QA）已经成为用户通过提出自然语言问题来消费日益增长的信息的主流互动方式。因此，迫切需要建立一个随时提供咨询服务的质量保证体系，以缓解卫生服务的压力。特别是，在持续的大流行期间，人们越来越关注复杂的多跳问题，而不是简单的问题，但现有的新冠肺炎QA系统无法满足他们复杂的信息需求。在本文中，我们介绍了一种称为COKG-QA的新型多跳QA系统，该系统在大规模新冠肺炎知识图上推理多个关系以返回给定问题的答案。在知识图上的问答领域，目前的方法通常基于一些知识嵌入模型来表示实体和模式，并使用预先训练的模型来表示问题。虽然基于指定的嵌入来表示不同的知识（即实体和问题）是方便的，但存在一个问题，即这些单独的表示来自异构向量空间。我们通过一种简单但有效的嵌入投影机制，将问题嵌入与知识嵌入对准在一个公共语义空间中。此外，我们建议将实体嵌入与其作为重要先验知识的相应模式嵌入相结合，以帮助搜索指定类型的正确答案实体。此外，我们基于OpenKG①推出的链接知识图OpenKG-COVID19，为COKG-QA导出了一个大型多跳中文新冠肺炎数据集（称为COKG-DATA，用于记忆），其中包括关于新冠肺炎的全面和有代表性的信息。COKG-QA在1跳和2跳数据中实现了相当有竞争力的性能，同时在3跳中获得了显著改进的最佳结果。并且它在用户的QA系统中使用更有效。此外，用户研究表明，该系统不仅提供了准确和可解释的答案，而且易于使用，并配有智能提示和建议。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

COKG-QA: Multi-hop Question Answering over COVID-19 Knowledge Graphs

Abstract COVID-19 evolves rapidly and an enormous number of people worldwide desire instant access to COVID-19 information such as the overview, clinic knowledge, vaccine, prevention measures, and COVID-19 mutation. Question answering (QA) has become the mainstream interaction way for users to consume the ever-growing information by posing natural language questions. Therefore, it is urgent and necessary to develop a QA system to offer consulting services all the time to relieve the stress of health services. In particular, people increasingly pay more attention to complex multi-hop questions rather than simple ones during the lasting pandemic, but the existing COVID-19 QA systems fail to meet their complex information needs. In this paper, we introduce a novel multi-hop QA system called COKG-QA, which reasons over multiple relations over large-scale COVID-19 Knowledge Graphs to return answers given a question. In the field of question answering over knowledge graph, current methods usually represent entities and schemas based on some knowledge embedding models and represent questions using pre-trained models. While it is convenient to represent different knowledge (i.e., entities and questions) based on specified embeddings, an issue raises that these separate representations come from heterogeneous vector spaces. We align question embeddings with knowledge embeddings in a common semantic space by a simple but effective embedding projection mechanism. Furthermore, we propose combining entity embeddings with their corresponding schema embeddings which served as important prior knowledge, to help search for the correct answer entity of specified types. In addition, we derive a large multi-hop Chinese COVID-19 dataset (called COKG-DATA for remembering) for COKG-QA based on the linked knowledge graph OpenKG-COVID19 launched by OpenKG①, including comprehensive and representative information about COVID-19. COKG-QA achieves quite competitive performance in the 1-hop and 2-hop data while obtaining the best result with significant improvements in the 3-hop. And it is more efficient to be used in the QA system for users. Moreover, the user study shows that the system not only provides accurate and interpretable answers but also is easy to use and comes with smart tips and suggestions.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊