Abstract With the prevailing COVID-19 pandemic, the lack of digitally recorded and connected health data poses a challenge for analysing the situation. Virus outbreaks, such as the current pandemic, allow for the optimisation and reuse of data, which can be beneficial in managing future outbreaks. However, there is a general lack of knowledge about the actual flow of information in health facilities. In Uganda, where this case study was conducted, there is no comprehensive knowledge about what type of data is collected or how it is collected along the journey of a patient through a health facility. This study investigates information flows of clinical patient data in health facilities in Uganda. The study found that almost all health facilities in Uganda store patient information in paper files on shelves. Hospitals in Uganda are provided with paper tools, such as reporting forms, registers and manuals, in which district data is collected as aggregate data and submitted in the form of digital reports to the Ministry of Health Resource Center. These reporting forms are not digitised and, thus, not machine-actionable. Hence, it is not easy for health facilities, researchers, and others to find and access patient and research data. It is also not easy to reuse and connect this data with other digital health data worldwide, leading to the incorrect conclusion that there is less health data in Uganda. A FAIR architecture has the potential to solve such problems and facilitate the transition from paper to digital records in the Uganda health system.
{"title":"Information Streams in Health Facilities: The Case of Uganda","authors":"Mariam Basajja, Mutwalibi Nambobi","doi":"10.1162/dint_a_00177","DOIUrl":"https://doi.org/10.1162/dint_a_00177","url":null,"abstract":"Abstract With the prevailing COVID-19 pandemic, the lack of digitally-recorded and connected health data poses a challenge for analysing the situation. Virus outbreaks, such as the current pandemic, allow for the optimisation and reuse of data, which can be beneficial in managing future outbreaks. However, there is a general lack of knowledge about the actual flow of information in health facilities, which is also the case in Uganda. In Uganda, where this case study was conducted, there is no comprehensive knowledge about what type of data is collected or how it is collected along the journey of a patient through a health facility. This study investigates information flows of clinical patient data in health facilities in Uganda. The study found that almost all health facilities in Uganda store patient information in paper files on shelves. Hospitals in Uganda are provided with paper tools, such as reporting forms, registers and manuals, in which district data is collected as aggregate data and submitted in the form of digital reports to the Ministry of Health Resource Center. These reporting forms are not digitised and, thus, not machine-actionable. Hence, it is not easy for health facilities, researchers, and others to find and access patient and research data. It is also not easy to reuse and connect this data with other digital health data worldwide, leading to the incorrect conclusion that there is less health data in Uganda. 
A FAIR architecture has the potential to solve such problems and facilitate the transition from paper to digital records in the Uganda health system.","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":"4 1","pages":"882-898"},"PeriodicalIF":3.9,"publicationDate":"2022-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44587182","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Abstract The FAIR Guidelines—that data should be Findable, Accessible, Interoperable and Reusable (FAIR)—aim to improve the management of digital data assets for improved decision making. FAIR comprises 15 elements (called facets) that explain how data should be able to be reused by researchers and policymakers. For this research, eight policy documents were reviewed from Zimbabwe's Ministry of Health and Ministry of Information and Communication Technology (ICT) from 1999 to 2020. These were scrutinised to determine the mention of the FAIR Guidelines or FAIR Equivalent principles. The vision, mission statement and objectives of these documents were analysed relative to the 15 facets of FAIR. The research found that none of the policy documents in health/eHealth or ICT in Zimbabwe explicitly mention the FAIR Guidelines, but all contain some FAIR Equivalent principles. Hence, the regulatory framework for health/eHealth data management in Zimbabwe is aligned with the FAIR Guidelines and, therefore, a policy window is open for the adoption of FAIR Guidelines in relation to health/eHealth data management.
{"title":"Regulatory Framework for eHealth Data Policies in Zimbabwe: Measuring FAIR Equivalency","authors":"Kudakwashe Chindoza","doi":"10.1162/dint_a_00173","DOIUrl":"https://doi.org/10.1162/dint_a_00173","url":null,"abstract":"Abstract The FAIR Guidelines—that data should be Findable, Accessible, Interoperable and Reusable (FAIR)—aim to improve the management of digital data assets for improved decision making. FAIR comprises 15 elements (called facets) that explain how data should be able to be reused by researchers and policymakers. For this research, eight policy documents were reviewed from Zimbabwe's Ministry of Health and Ministry of Information and Communication Technology (ICT) from 1999 to 2020. These were scrutinised to determine the mention of the FAIR Guidelines or FAIR Equivalent principles. The vision, mission statement and objectives of these documents were analysed relative to the 15 facets of FAIR. The research found that none of the policy documents in health/eHealth or ICT in Zimbabwe explicitly mention the FAIR Guidelines, but all contain some FAIR Equivalent principles. Hence, the regulatory framework for health/eHealth data management in Zimbabwe is aligned with the FAIR Guidelines and, therefore, a policy window is open for the adoption of FAIR Guidelines in relation to health/eHealth data management.","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":"4 1","pages":"827-838"},"PeriodicalIF":3.9,"publicationDate":"2022-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"42286409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mariam Basajja, M. van Reisen, Francisca Onaolapo Oladipo
Abstract This study explores the possibility of opening a policy window for the adoption of the FAIR Guidelines, which state that data should be Findable, Accessible, Interoperable, and Reusable (FAIR), in Uganda's eHealth sector. Although the FAIR Guidelines were not mentioned in any of the policy documents relevant to Uganda's eHealth sector, the study found that 83% of the documents mentioned FAIR Equivalent efforts, such as the adoption of the National Identification Number (NIN) as a unique identifier in Uganda's national Electronic Health Management Information System (eHMIS) (findability), the planned/ongoing integration of various information systems (interoperability), and the alignment of various projects with international best practices/standards (reusability). A FAIR Equivalency Score (FE-Score), devised in this study as an aggregate score of the mention of the equivalent of FAIR facets in the policy documents, showed that the documents at the core of Uganda's digital health/eHealth policy have the highest score of all the documents analysed, indicating that there is a degree of alignment between Uganda's National eHealth Vision and the FAIR Guidelines. Therefore, it can be concluded that favourable conditions exist for the adoption and implementation of the FAIR Guidelines in Uganda's eHealth sector. Hence, it is recommended that the FAIR community adopt a capacity building strategy through organisations with a worldwide mandate, such as the World Health Organization, to promote the adoption of the FAIR Guidelines as part of international best practices.
{"title":"FAIR Equivalency with Regulatory Framework for Digital Health in Uganda","authors":"Mariam Basajja, M. van Reisen, Francisca Onaolapo Oladipo","doi":"10.1162/dint_a_00170","DOIUrl":"https://doi.org/10.1162/dint_a_00170","url":null,"abstract":"Abstract This study explores the possibility of opening a policy window for the adoption of the FAIR Guidelines— that data be Findable, Accessible, Interoperable, and Reusable (FAIR)—in Uganda's eHealth sector. Although the FAIR Guidelines were not mentioned in any of the policy documents relevant to Uganda's eHealth sector, the study found that 83% of the documents mentioned FAIR Equivalent efforts, such as the adoption of the National Identification Number (NIN) as a unique identifier in Uganda's national Electronic Health Management Information System (eHMIS) (findability), the planned/ongoing integration of various information systems (interoperability), and the alignment of various projects with international best practices/standards (reusability). A FAIR Equivalency Score (FE-Score), devised in this study as an aggregate score of the mention of the equivalent of FAIR facets in the policy documents, showed that the documents at the core of Uganda's digital health/eHealth policy have the highest score of all the documents analysed, indicating that there is a degree of alignment between Uganda's National eHealth Vision and the FAIR Guidelines. Therefore, it can be concluded that favourable conditions exist for the adoption and implementation of the FAIR Guidelines in Uganda's eHealth sector. 
Hence, it is recommended that the FAIR community adopt a capacity building strategy through organisations with a worldwide mandate, such as the World Health Organization, to promote the adoption of the FAIR Guidelines as part of international best practices.","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":"4 1","pages":"771-797"},"PeriodicalIF":3.9,"publicationDate":"2022-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45247962","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
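The FE-Score described in the abstract above is an aggregate of how many FAIR facets have an equivalent mentioned in a policy document. A minimal sketch of that idea follows, assuming one point per distinct facet; the abstract does not give the exact scoring rule, so the weighting, the `fe_score` function name, and the two example documents are all hypothetical and do not reproduce the study's data.

```python
# The 15 FAIR facets (F1-F4, A1-A2, I1-I3, R1 and their sub-principles).
FAIR_FACETS = (
    "F1", "F2", "F3", "F4",
    "A1", "A1.1", "A1.2", "A2",
    "I1", "I2", "I3",
    "R1", "R1.1", "R1.2", "R1.3",
)


def fe_score(facets_mentioned):
    """Count the distinct FAIR facets for which a document mentions an equivalent.

    Assumption: one point per facet; the published FE-Score may be defined differently.
    """
    mentioned = set(facets_mentioned)
    unknown = mentioned - set(FAIR_FACETS)
    if unknown:
        raise ValueError(f"unknown facets: {sorted(unknown)}")
    return len(mentioned)


# Hypothetical coding of two policy documents (not data from the study).
documents = {
    "policy_a": ["F1", "I1", "I2", "R1.3"],
    "policy_b": ["F1"],
}
scores = {name: fe_score(facets) for name, facets in documents.items()}
# Documents ordered from highest to lowest score.
ranked = sorted(scores, key=scores.get, reverse=True)
```

Under this assumed rule, a document scores between 0 and 15, which makes documents directly comparable regardless of their length.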
Mariam Basajja, Mutwalibi Nambobi, K. Wolstencroft
Abstract The digital health landscape in Uganda is plagued by problems with interoperability and sustainability, due to fragmentation and a lack of integrated digital health solutions. This can be partly attributed to the absence of policies on the interoperability of data, as well as the fact that there is no common goal to make digital data and data infrastructure interoperable across the data ecosystem. The promulgation of the FAIR Guidelines in 2016 brought together various data stewards and stakeholders to adopt a common vision on data management and enable greater interoperability. This article explores the potential of enhancing digital health interoperability through FAIR by analysing the digital solutions piloted in Uganda and their sustainability. It looks at the factors that are currently hindering interoperability by examining existing digital health solutions in Uganda, such as the Digital Health Atlas Uganda (DHA-U) and Uganda Digital Health Dashboard (UDHD). The level of FAIRness of the two dashboards was determined using the FAIR Evaluation Services tool. Analysis was also carried out to discover the level of FAIRness of the digital health solutions within the dashboards and the most frequently used software applications and data standards by the different digital health interventions in Uganda.
{"title":"Possibility of Enhancing Digital Health Interoperability in Uganda through FAIR Data","authors":"Mariam Basajja, Mutwalibi Nambobi, K. Wolstencroft","doi":"10.1162/dint_a_00178","DOIUrl":"https://doi.org/10.1162/dint_a_00178","url":null,"abstract":"Abstract The digital health landscape in Uganda is plagued by problems with interoperability and sustainability, due to fragmentation and a lack of integrated digital health solutions. This can be partly attributed to the absence of policies on the interoperability of data, as well as the fact that there is no common goal to make digital data and data infrastructure interoperable across the data ecosystem. The promulgation of the FAIR Guidelines in 2016 brought together various data stewards and stakeholders to adopt a common vision on data management and enable greater interoperability. This article explores the potential of enhancing digital health interoperability through FAIR by analysing the digital solutions piloted in Uganda and their sustainability. It looks at the factors that are currently hindering interoperability by examining existing digital health solutions in Uganda, such as the Digital Health Atlas Uganda (DHA-U) and Uganda Digital Health Dashboard (UDHD). The level of FAIRness of the two dashboards was determined using the FAIR Evaluation Services tool. 
Analysis was also carried out to discover the level of FAIRness of the digital health solutions within the dashboards and the most frequently used software applications and data standards by the different digital health interventions in Uganda.","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":"4 1","pages":"899-916"},"PeriodicalIF":3.9,"publicationDate":"2022-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47974530","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Getu Tadele Taye, S. Amare, T. G. Gebremeskel, A. Medhanyie, W. Ayele, Tigist Habtamu, M. Reisen
Abstract This paper investigates whether or not there is a policy window for making health data ‘Findable’, ‘Accessible’ (under well-defined conditions), ‘Interoperable’ and ‘Reusable’ (FAIR) in Ethiopia. The question is answered by studying the alignment of policies for health data in Ethiopia with the FAIR Guidelines or their ‘FAIR Equivalency’. Policy documents relating to the digitalisation of health systems in Ethiopia were examined to determine their FAIR Equivalency. Although the documents are fragmented and have no overarching governing framework, it was found that they aim to make the disparate health data systems in Ethiopia interoperable and boost the discoverability and (re)usability of data for research and better decision making. Hence, the FAIR Guidelines appear to be aligned with the regulatory frameworks for ICT and digital health in Ethiopia and, under the right conditions, a policy window could open for their adoption and implementation.
{"title":"FAIR Equivalency with Regulatory Framework for Digital Health in Ethiopia","authors":"Getu Tadele Taye, S. Amare, T. G. Gebremeskel, A. Medhanyie, W. Ayele, Tigist Habtamu, M. Reisen","doi":"10.1162/dint_a_00172","DOIUrl":"https://doi.org/10.1162/dint_a_00172","url":null,"abstract":"Abstract This paper investigates whether or not there is a policy window for making health data ‘Findable’, ‘Accessible’ (under well-defined conditions), ‘Interoperable’ and ‘Reusable’ (FAIR) in Ethiopia. The question is answered by studying the alignment of policies for health data in Ethiopia with the FAIR Guidelines or their ‘FAIR Equivalency’. Policy documents relating to the digitalisation of health systems in Ethiopia were examined to determine their FAIR Equivalency. Although the documents are fragmented and have no overarching governing framework, it was found that they aim to make the disparate health data systems in Ethiopia interoperable and boost the discoverability and (re)usability of data for research and better decision making. Hence, the FAIR Guidelines appear to be aligned with the regulatory frameworks for ICT and digital health in Ethiopia and, under the right conditions, a policy window could open for their adoption and implementation.","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":"4 1","pages":"813-826"},"PeriodicalIF":3.9,"publicationDate":"2022-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47337829","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yi Lin, Putu Hadi Purnama Jati, Aliya Aktau, M. Ghardallou, Sara Nodehi, M. Reisen
Abstract This study provides an analysis of the implementation of FAIR Guidelines in selected non-Western geographies. The analysis was based on a systematic literature review to determine if the findability, accessibility, interoperability, and reusability of data are seen as an issue, if the adoption of the FAIR Guidelines is seen as a solution, and if the climate is conducive to the implementation of the FAIR Guidelines. The results show that the FAIR Guidelines have been discussed in most of the countries studied, which have identified data sharing and the reusability of research data as an issue (e.g., Kazakhstan, Russia, countries in the Middle East), and partially introduced in others (e.g., Indonesia). In Indonesia, a FAIR equivalent system has been introduced, although certain functions need to be added for data to be entirely FAIR. In Japan, both FAIR equivalent systems and FAIR-based systems have been adopted and created, and the acceptance of FAIR-based systems is recommended by the Government of Japan. In a number of African countries, the FAIR Guidelines are in the process of being implemented and the implementation of FAIR is well supported. In conclusion, a window of opportunity for implementing the FAIR Guidelines is open in most of the countries studied; however, more awareness needs to be raised about the benefits of FAIR in Russia and Kazakhstan to place it firmly on the policy agenda.
{"title":"Implementation of FAIR Guidelines in Selected Non-Western Geographies","authors":"Yi Lin, Putu Hadi Purnama Jati, Aliya Aktau, M. Ghardallou, Sara Nodehi, M. Reisen","doi":"10.1162/dint_a_00169","DOIUrl":"https://doi.org/10.1162/dint_a_00169","url":null,"abstract":"Abstract This study provides an analysis of the implementation of FAIR Guidelines in selected non-Western geographies. The analysis was based on a systematic literature review to determine if the findability, accessibility, interoperability, and reusability of data is seen as an issue, if the adoption of the FAIR Guidelines is seen as a solution, and if the climate is conducive to the implementation of the FAIR Guidelines. The results show that the FAIR Guidelines have been discussed in most of the countries studied, which have identified data sharing and the reusability of research data as an issue (e.g., Kazakhstan, Russia, countries in the Middle East), and partially introduced in others (e.g., Indonesia). In Indonesia, a FAIR equivalent system has been introduced, although certain functions need to be added for data to be entirely FAIR. In Japan, both FAIR equivalent systems and FAIR-based systems have been adopted and created, and the acceptance of FAIR-based systems is recommended by the Government of Japan. In a number of African countries, the FAIR Guidelines are in the process of being implemented and the implementation of FAIR is well supported. 
In conclusion, a window of opportunity for implementing the FAIR Guidelines is open in most of the countries studied; however, more awareness needs to be raised about the benefits of FAIR in Russia and Kazakhstan to place it firmly on the policy agenda.","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":"4 1","pages":"747-770"},"PeriodicalIF":3.9,"publicationDate":"2022-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"48428666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Tommaso Rodani, E. Osmenaj, A. Cazzaniga, M. Panighel, C. Africh, S. Cozzini
ABSTRACT In this paper, we describe the data management practices and services developed for making a scientific archive of Scanning Tunneling Microscopy (STM) images FAIR compliant. As a first step, we extracted the instrument metadata of each image of the dataset to create a structured database. We then enriched these metadata with information on the structure and composition of the surface by means of a pipeline that leverages human annotation, machine learning techniques, and instrument metadata filtering. To visually explore both images and metadata, as well as to improve the accessibility and usability of the dataset, we developed “STM explorer” as a web service integrated within the Trieste Advanced Data services (TriDAS) website. On top of these data services and tools, we propose an implementation of the W3C PROV standard to describe provenance metadata of STM images.
{"title":"Towards the FAIRification of Scanning Tunneling Microscopy Images","authors":"Tommaso Rodani, E. Osmenaj, A. Cazzaniga, M. Panighel, C. Africh, S. Cozzini","doi":"10.1162/dint_a_00164","DOIUrl":"https://doi.org/10.1162/dint_a_00164","url":null,"abstract":"ABSTRACT In this paper, we describe the data management practices and services developed for making FAIR compliant a scientific archive of Scanning Tunneling Microscopy (STM) images. As a first step, we extracted the instrument metadata of each image of the dataset to create a structured database. We then enriched these metadata with information on the structure and composition of the surface by means of a pipeline that leverages human annotation, machine learning techniques, and instrument metadata filtering. To visually explore both images and metadata, as well as to improve the accessibility and usability of the dataset, we developed “STM explorer” as a web service integrated within the Trieste Advanced Data services (TriDAS) website. On top of these data services and tools, we propose an implementation of the W3C PROV standard to describe provenance metadata of STM images.","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":"5 1","pages":"27-42"},"PeriodicalIF":3.9,"publicationDate":"2022-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"45371982","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Luiz Olavo Bonino da Silva Santos, K. Burger, R. Kaliyaperumal, Mark D. Wilkinson
ABSTRACT Metadata, data about other digital objects, play an important role in FAIR with a direct relation to all FAIR principles. In this paper we present and discuss the FAIR Data Point (FDP), a software architecture aiming to define a common approach to publish semantically-rich and machine-actionable metadata according to the FAIR principles. We present the core components and features of the FDP, its approach to metadata provision, the criteria to evaluate whether an application adheres to the FDP specifications and the service to register, index and allow users to search for metadata content of available FDPs.
{"title":"FAIR Data Point: A FAIR-Oriented Approach for Metadata Publication","authors":"Luiz Olavo Bonino da Silva Santos, K. Burger, R. Kaliyaperumal, Mark D. Wilkinson","doi":"10.1162/dint_a_00160","DOIUrl":"https://doi.org/10.1162/dint_a_00160","url":null,"abstract":"ABSTRACT Metadata, data about other digital objects, play an important role in FAIR with a direct relation to all FAIR principles. In this paper we present and discuss the FAIR Data Point (FDP), a software architecture aiming to define a common approach to publish semantically-rich and machine-actionable metadata according to the FAIR principles. We present the core components and features of the FDP, its approach to metadata provision, the criteria to evaluate whether an application adheres to the FDP specifications and the service to register, index and allow users to search for metadata content of available FDPs.","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":"5 1","pages":"163-183"},"PeriodicalIF":3.9,"publicationDate":"2022-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"47715872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Huifang Du, Zhongwen Le, Haofen Wang, Yunwen Chen, Jing Yu
Abstract COVID-19 evolves rapidly and an enormous number of people worldwide desire instant access to COVID-19 information such as the overview, clinical knowledge, vaccines, prevention measures, and COVID-19 mutations. Question answering (QA) has become a mainstream way for users to consume the ever-growing body of information by posing natural language questions. Therefore, it is urgent and necessary to develop a QA system that offers consulting services around the clock to relieve the stress on health services. In particular, people increasingly pay attention to complex multi-hop questions rather than simple ones during the prolonged pandemic, but the existing COVID-19 QA systems fail to meet these complex information needs. In this paper, we introduce a novel multi-hop QA system called COKG-QA, which reasons across multiple relations in large-scale COVID-19 Knowledge Graphs to return answers to a given question. In the field of question answering over knowledge graphs, current methods usually represent entities and schemas based on knowledge embedding models and represent questions using pre-trained models. While it is convenient to represent different kinds of knowledge (i.e., entities and questions) based on specified embeddings, an issue arises: these separate representations come from heterogeneous vector spaces. We align question embeddings with knowledge embeddings in a common semantic space by a simple but effective embedding projection mechanism. Furthermore, we propose combining entity embeddings with their corresponding schema embeddings, which serve as important prior knowledge, to help search for the correct answer entity of specified types. In addition, we derive a large multi-hop Chinese COVID-19 dataset (called COKG-DATA for ease of reference) for COKG-QA based on the linked knowledge graph OpenKG-COVID19 launched by OpenKG, including comprehensive and representative information about COVID-19. COKG-QA achieves quite competitive performance on the 1-hop and 2-hop data while obtaining the best result, with significant improvements, on the 3-hop data. It is also more efficient when used in a QA system for users. Moreover, the user study shows that the system not only provides accurate and interpretable answers but also is easy to use and comes with smart tips and suggestions.
{"title":"COKG-QA: Multi-hop Question Answering over COVID-19 Knowledge Graphs","authors":"Huifang Du, Zhongwen Le, Haofen Wang, Yunwen Chen, Jing Yu","doi":"10.1162/dint_a_00154","DOIUrl":"https://doi.org/10.1162/dint_a_00154","url":null,"abstract":"Abstract COVID-19 evolves rapidly and an enormous number of people worldwide desire instant access to COVID-19 information such as the overview, clinic knowledge, vaccine, prevention measures, and COVID-19 mutation. Question answering (QA) has become the mainstream interaction way for users to consume the ever-growing information by posing natural language questions. Therefore, it is urgent and necessary to develop a QA system to offer consulting services all the time to relieve the stress of health services. In particular, people increasingly pay more attention to complex multi-hop questions rather than simple ones during the lasting pandemic, but the existing COVID-19 QA systems fail to meet their complex information needs. In this paper, we introduce a novel multi-hop QA system called COKG-QA, which reasons over multiple relations over large-scale COVID-19 Knowledge Graphs to return answers given a question. In the field of question answering over knowledge graph, current methods usually represent entities and schemas based on some knowledge embedding models and represent questions using pre-trained models. While it is convenient to represent different knowledge (i.e., entities and questions) based on specified embeddings, an issue raises that these separate representations come from heterogeneous vector spaces. We align question embeddings with knowledge embeddings in a common semantic space by a simple but effective embedding projection mechanism. Furthermore, we propose combining entity embeddings with their corresponding schema embeddings which served as important prior knowledge, to help search for the correct answer entity of specified types. 
In addition, we derive a large multi-hop Chinese COVID-19 dataset (called COKG-DATA for remembering) for COKG-QA based on the linked knowledge graph OpenKG-COVID19 launched by OpenKG①, including comprehensive and representative information about COVID-19. COKG-QA achieves quite competitive performance in the 1-hop and 2-hop data while obtaining the best result with significant improvements in the 3-hop. And it is more efficient to be used in the QA system for users. Moreover, the user study shows that the system not only provides accurate and interpretable answers but also is easy to use and comes with smart tips and suggestions.","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":"4 1","pages":"471-492"},"PeriodicalIF":3.9,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44925604","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
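The COKG-QA abstract above describes aligning question embeddings with knowledge embeddings through an embedding projection. A minimal sketch of that general idea follows, assuming a single linear projection and dot-product scoring. Both choices are assumptions made for illustration, since the abstract does not specify the form of the projection or the scoring function. The function names, vectors, and entity labels are hypothetical.

```python
def project(question_vec, weight, bias):
    """Map a question embedding into the knowledge-embedding space.

    Assumption: a plain linear map W·q + b; the paper's mechanism may differ.
    """
    return [
        sum(w * x for w, x in zip(row, question_vec)) + b
        for row, b in zip(weight, bias)
    ]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def rank_entities(question_vec, weight, bias, entity_vecs):
    """Return entity names ordered by similarity to the projected question."""
    projected = project(question_vec, weight, bias)
    scores = {name: dot(projected, vec) for name, vec in entity_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Toy example: a 3-dimensional question space and a 2-dimensional knowledge space.
weight = [[1.0, 0.0, 0.0],
          [0.0, 1.0, 0.0]]
bias = [0.0, 0.0]
question = [1.0, 0.0, 5.0]
entities = {"vaccine": [1.0, 0.0], "mask": [0.0, 1.0]}
ranking = rank_entities(question, weight, bias, entities)
```

In a trained system the projection weights would be learned so that a question lands near its answer entity; here they are fixed by hand only to show the data flow between the two vector spaces.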
Abstract In this study, we uncover the topics of Chinese public cultural activities in 2020 with a two-step short text clustering (self-taught neural networks and graph-based clustering) and topic modeling approach. The dataset we use for this research is collected from 108 websites of libraries and cultural centers, containing over 17,000 articles. With the novel framework we propose, we derive 3 clusters and 8 topics from 21 provincial-level regions in China. By plotting the topic distribution of each cluster, we are able to show the unique tendencies of local cultural institutes, namely free lessons and lectures on art and culture, entertainment and service for socially vulnerable groups, and the preservation of intangible cultural heritage, respectively. The findings of our study provide decision-making support for cultural institutes, thus promoting public cultural service from a data-driven perspective.
{"title":"Uncovering Topics of Public Cultural Activities: Evidence from China","authors":"Zixin Zeng, Bolin Hua","doi":"10.1162/dint_a_00121","DOIUrl":"https://doi.org/10.1162/dint_a_00121","url":null,"abstract":"Abstract In this study, we uncover the topics of Chinese public cultural activities in 2020 with a two-step short text clustering (self-taught neural networks and graph-based clustering) and topic modeling approach. The dataset we use for this research is collected from 108 websites of libraries and cultural centers, containing over 17,000 articles. With the novel framework we propose, we derive 3 clusters and 8 topics from 21 provincial-level regions in China. By plotting the topic distribution of each cluster, we are able to shows unique tendencies of local cultural institutes, that is, free lessons and lectures on art and culture, entertainment and service for socially vulnerable groups, and the preservation of intangible cultural heritage respectively. The findings of our study provide decision-making support for cultural institutes, thus promoting public cultural service from a data-driven perspective.","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":"4 1","pages":"509-528"},"PeriodicalIF":3.9,"publicationDate":"2022-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"44608347","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}