ABSTRACT While developing the C919 large aircraft customer service intelligence system, we found that heterogeneous and incomplete data lead to inefficient and inaccurate decision making. To solve this problem, this paper introduces ontology modeling and reasoning into the construction of competitive intelligence systems. We first present the principles and methods for building the civil aviation customer service ontology. We then define its classes and properties, contributing a real-world civil aviation customer service ontology that is published on the Web (http://www.openkg.cn/dataset/cacso). Finally, we design SWRL rules corresponding to different intelligence analysis targets to support reasoning in our competitive intelligence system.
{"title":"A Civil Aviation Customer Service Ontology and Its Applications","authors":"Meixiang Lv, Xudong Cao, Tianxing Wu, Yuehua Li","doi":"10.1162/dint_a_00237","DOIUrl":"https://doi.org/10.1162/dint_a_00237","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":"183 1","pages":"1063-1081"},"PeriodicalIF":3.9,"publicationDate":"2023-11-01","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"139292767","RegionNum":3,"RegionCategory":"Computer Science","Total":0}
Abstract With the emergence of the IoT era, wireless sensor networks will be used more and more widely. In addition to collecting, transmitting and processing simple data such as humidity, temperature and density of the dome, they can also provide multimedia information services such as video and images, which enables more comprehensive and accurate environmental monitoring. Therefore, MSDs are in great demand in military, daily-life, forestry, biomedical and other fields. The intensive city model has obvious advantages in meeting people's diverse needs and supporting a comfortable life. Most obviously, it speeds up the rhythm of life for residents, thereby increasing efficiency and saving time. From this perspective, this paper studies an evaluation index system for green public open space based on IoT and mental health. A generalized regression neural network (GRNN) model is constructed: the conditional mean is calculated, the density function is estimated, the network output is derived, and the schematic diagram of the GRNN is improved. Using the established system, 2018 is selected as the base year and, after transformation, the standardized values of each year are formed and substituted into the cells to form different matrices. The value of each cell is counted to obtain the subsystem coordination degree, from which the global coordination degree is calculated. The evaluation results of ecological civilization construction and development in 2018, 2019, 2020 and 2021 were compared. The experimental data show that, compared with 2018, economic development changes from 1 to 2.000, social harmony changes from 1 to 2.480, ecological health decreases to 0.850, environmental friendliness decreases to 0.750, and the comprehensive evaluation decreases to 0.513. This shows that, while the economy has been developing, the construction of ecological civilization has been carried out gradually and good results have been achieved, which reflects the effectiveness of the system. The study of the evaluation index system of green public open space based on the Internet of Things and mental health has thus been completed.
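The generalized regression neural network named in the abstract above is, in its textbook form, a Gaussian-kernel-weighted average of the training targets, which is how the conditional mean is estimated from a kernel density estimate. The following is a minimal sketch of that textbook form only; the function name `grnn_predict`, the smoothing parameter `sigma`, and the toy data are assumptions for illustration and are not taken from the paper.

```python
import math


def grnn_predict(x, train_x, train_y, sigma=0.5):
    """Estimate E[y | x] as a Gaussian-kernel-weighted mean of training targets.

    x        -- query point (list of floats)
    train_x  -- list of training points (each a list of floats)
    train_y  -- list of training targets (floats)
    sigma    -- kernel width (hypothetical default, must be tuned in practice)
    """
    weights = []
    for xi in train_x:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
        # Pattern layer: one Gaussian unit per training sample.
        weights.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    total = sum(weights)
    if total == 0.0:
        # All kernels underflowed; fall back to the unconditional mean.
        return sum(train_y) / len(train_y)
    # Summation and output layers: weighted sum divided by sum of weights.
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Toy data (made up): targets lie on the line y = x.
train_x = [[0.0], [1.0], [2.0]]
train_y = [0.0, 1.0, 2.0]
prediction = grnn_predict([1.0], train_x, train_y)
```

Because the query sits symmetrically between the two outer samples, the weighted mean here equals the middle target. A smaller `sigma` makes the estimate follow the nearest training sample more closely, while a larger one smooths toward the global mean.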
{"title":"Evaluation Index System of Green Public Open Space Based on Internet of Things and Mental Health","authors":"Jiexu Li, Faziawati binti Abdul Aziz, Ning Zhang","doi":"10.1162/dint_a_00219","DOIUrl":"https://doi.org/10.1162/dint_a_00219","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":"39 8","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-24","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"135273961","RegionNum":3,"RegionCategory":"Computer Science","Total":0}
ABSTRACT Lung infiltration is a non-communicable condition in which materials denser than air are present in the parenchyma tissue of the lungs. It can be hard to detect in an X-ray scan, even for a radiologist, especially at the early stages, which makes it a leading cause of death. In response, several deep learning approaches have been developed to address this problem. This paper proposes the Slide-Detect technique, a Deep Neural Network (DNN) model based on Convolutional Neural Networks (CNNs) that is trained to diagnose lung infiltration with an Area Under Curve (AUC) of up to 91.47% and an accuracy of 93.85%, while requiring relatively low computational resources.
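To make the AUC figure above concrete, the sketch below computes the area under the ROC curve in its pairwise form: the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as half. The function name `roc_auc` and the labels and scores are made up for illustration; they are not Slide-Detect outputs.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels -- list of 0/1 ground-truth labels (1 = condition present)
    scores -- list of model scores, higher meaning more likely positive
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Toy example (made-up values): 3 of the 4 positive/negative pairs are ordered correctly.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = roc_auc(labels, scores)  # 0.75
```

The pairwise form is quadratic in the number of samples, which is fine for a small check; a rank-based computation would be the usual choice for large test sets.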
{"title":"Slide-Detect: An Accurate Deep Learning Diagnosis of Lung Infiltration","authors":"Ahmed E. Mohamed, Magda B. Fayek, Mona Farouk","doi":"10.1162/dint_a_00233","DOIUrl":"https://doi.org/10.1162/dint_a_00233","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":"246 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-10-02","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"135902369","RegionNum":3,"RegionCategory":"Computer Science","Total":0}
Linhan Li, Huaping Zhang, Chunjin Li, Haowen You, Wenyao Cui
Abstract ChatGPT has attracted extensive attention from academia and industry. This paper evaluates ChatGPT's Chinese language understanding capability on 6 tasks using 11 datasets. Experiments indicate that ChatGPT achieves competitive results in sentiment analysis, summarization, and reading comprehension in Chinese, while it is prone to factual errors in closed-book QA. Further, on two more difficult Chinese understanding tasks, namely idiom fill-in-the-blank and cant understanding, we found that a simple chain-of-thought prompt can improve the accuracy of ChatGPT in complex reasoning. Based on the results, this paper further analyses the possible risks of using ChatGPT. Finally, we briefly describe the research and development progress of our ChatBIT.
{"title":"Evaluation on ChatGPT for Chinese Language Understanding","authors":"Linhan Li, Huaping Zhang, Chunjin Li, Haowen You, Wenyao Cui","doi":"10.1162/dint_a_00232","DOIUrl":"https://doi.org/10.1162/dint_a_00232","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-09-12","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"135879702","RegionNum":3,"RegionCategory":"Computer Science","Total":0}
Management information system (MIS), a human-computer system that deeply integrates next-generation information technology and management services, has become the nerve center of society and organizations. With the development of next-generation information technology, MIS has gradually entered the smart period. However, research on smart management information systems (SMIS) is still limited, lacking systematic summarization of its conceptual definition, evolution, research hotspots, and typical applications. Therefore, this paper defines the conceptual characteristics of SMIS, provides an overview of the evolution of SMIS, examines research focus areas using bibliometric methods, and elaborates on typical application practices of SMIS in fields such as health care, elderly care, manufacturing, and transportation. Furthermore, we discuss the future development directions of SMIS in four key areas: smart interaction, smart decision-making, efficient resource allocation, and flexible system architecture. These discussions provide guidance and a foundation for the theoretical development and practical application of SMIS.
{"title":"Smart management information systems (SMIS): Concept, evolution, research hotspots and applications","authors":"Changyong Liang, Xiaoxiao Wang, Dong-xiao Gu, Pengyu Li, Hui Chen, Zhengfei Xu","doi":"10.1162/dint_a_00231","DOIUrl":"https://doi.org/10.1162/dint_a_00231","PeriodicalId":34023,"journal":{"name":"Data Intelligence"},"PeriodicalIF":3.9,"publicationDate":"2023-08-31","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"44041436","RegionNum":3,"RegionCategory":"Computer Science","Total":0}
Pneumoconiosis is a disease characterized by pulmonary tissue deposition caused by dust exposure in the workplace. In China, due to the large number and wide distribution of pneumoconiosis patients, there is a high demand for lung biopsy case data during the diagnosis of pneumoconiosis. This paper studies the application of medical image detection technology based on deep learning (DL) in pneumoconiosis diagnosis. Medical image detection and convolutional neural networks (CNNs) based on DL were analyzed, and the application of DL medical image technology in pneumoconiosis diagnosis was investigated. The experimental results showed that, in the last round of testing, the accuracy of the ResNet model including a deconvolution structure reached 95.2%. The area under the receiver operating characteristic curve (AUC) was 0.987, the sensitivity was 99.66%, and the specificity was 88.61%. The non-staging diagnosis of pneumoconiosis improved the diagnostic sensitivity while ensuring high specificity. In addition, the DeLong test was used to compare the AUC of the three models; the results showed that model C was more effective than model A and model B, while there was no significant difference in diagnostic efficiency between model A and model B. In summary, the model's diagnosis has high sensitivity and a low probability of missed diagnosis, which can greatly reduce the workload of diagnosing doctors and effectively improve the efficiency of diagnosis.
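The sensitivity, specificity and accuracy quoted above all derive from the counts in a binary confusion matrix. The sketch below shows those standard definitions; the function name `diagnostic_metrics` and the toy labels are illustrative assumptions and do not reproduce the study's data.

```python
def diagnostic_metrics(y_true, y_pred):
    """Compute sensitivity, specificity and accuracy for binary labels (1 = diseased)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        # Share of diseased cases that are detected (low miss rate = high sensitivity).
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        # Share of healthy cases that are correctly cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "accuracy": (tp + tn) / len(y_true) if y_true else 0.0,
    }


# Toy labels (made up): every diseased case is caught, two healthy cases are flagged.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
metrics = diagnostic_metrics(y_true, y_pred)
```

In this toy case sensitivity is 1.0 while specificity is 4/6, the same qualitative pattern the abstract reports: few missed diagnoses at the cost of some false alarms.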
{"title":"Application of Medical Image Detection Technology Based on Deep Learning in Pneumoconiosis Diagnosis","authors":"Shengguang Peng","doi":"10.1162/dint_a_00228","DOIUrl":"https://doi.org/10.1162/dint_a_00228","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":"1 1"},"PeriodicalIF":3.9,"publicationDate":"2023-07-10","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"42662945","RegionNum":3,"RegionCategory":"Computer Science","Total":0}
ABSTRACT Nowadays, natural language processing (NLP) is one of the most popular areas of broadly understood artificial intelligence. New research contributions are posted every day, for instance to the arXiv repository, so it is rather difficult to capture the current “state of the field” and thus to enter it. This motivated the idea of using state-of-the-art NLP techniques to analyse the NLP-focused literature. As a result, (1) meta-level knowledge concerning the current state of NLP has been captured, and (2) a guide to the use of basic NLP tools is provided. All the tools and the dataset described in this contribution are publicly available. Furthermore, the originality of this review lies in its full automation, which allows easy reproduction, continuation and updating of this research as new work emerges in the field of NLP.
{"title":"The State of the Art of Natural Language Processing—A Systematic Automated Review of NLP Literature Using NLP Techniques","authors":"Jan Sawicki, M. Ganzha, M. Paprzycki","doi":"10.1162/dint_a_00213","DOIUrl":"https://doi.org/10.1162/dint_a_00213","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":"5 1","pages":"707-749"},"PeriodicalIF":3.9,"publicationDate":"2023-07-03","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"64531695","RegionNum":3,"RegionCategory":"Computer Science","Total":0}
The health care system encompasses the participation of individuals, groups, agencies, and resources that offer services to address the health requirements of the person, community, and population. In parallel with the growing debates on healthcare systems in relation to diseases, treatments, interventions, medication, and clinical practice guidelines, the world is currently discussing the healthcare industry, technology perspectives, and healthcare costs. To gain a comprehensive understanding of the healthcare systems research paradigm, we propose a novel contextual topic modeling approach that links the CombinedTM model with our healthcare BERT to discover contextual topics in the healthcare domain. This work discovered 60 contextual topics. Fifteen of them are the hottest, including smart medical monitoring systems, causes and effects of stress and anxiety, and healthcare cost estimation; twelve are the coldest; and thirty-three show insignificant trends. We further investigated various clusters and correlations among the topics by exploring inter-topic distance maps, which add depth to the understanding of the research structure of this scientific domain. The current study enhances prior topic modeling methodologies that examine the healthcare literature from a particular disciplinary perspective. It further extends existing topic modeling approaches that do not incorporate contextual information in the topic discovery process, adding such information by creating sentence embedding vectors through transformer-based models. We also utilized corpus tuning, the mean pooling technique, and the Hugging Face tool. Our method gives a higher coherence score than the state-of-the-art models (LSA, LDA, and BERTopic).
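The mean pooling technique mentioned above is commonly used to turn per-token transformer outputs into a single sentence embedding: token vectors are averaged, and padding positions are skipped using the attention mask. The sketch below shows that common formulation in plain Python; the function name `mean_pool` and the toy vectors are assumptions, and the paper's actual pipeline may differ.

```python
def mean_pool(token_embeddings, attention_mask):
    """Average token vectors into one sentence vector, ignoring masked (padding) tokens.

    token_embeddings -- list of equal-length float lists, one per token
    attention_mask   -- list of 0/1 flags, 1 for real tokens and 0 for padding
    """
    dim = len(token_embeddings[0])
    sums = [0.0] * dim
    count = 0
    for vec, keep in zip(token_embeddings, attention_mask):
        if keep:
            count += 1
            for i, value in enumerate(vec):
                sums[i] += value
    if count == 0:
        return sums  # no real tokens: return the zero vector
    return [s / count for s in sums]


# Toy input (made up): two real tokens followed by one padding token.
tokens = [[1.0, 2.0], [3.0, 4.0], [100.0, 100.0]]
mask = [1, 1, 0]
sentence_vector = mean_pool(tokens, mask)  # [2.0, 3.0]
```

Skipping the padded position matters: averaging all three toy vectors would let the padding value dominate the sentence embedding.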
{"title":"Revealing the trends in the academic landscape of the health care system using contextual topic modeling","authors":"Muhammad Inaam ul haq, Qianmu Li","doi":"10.1162/dint_a_00217","DOIUrl":"https://doi.org/10.1162/dint_a_00217","PeriodicalId":34023,"journal":{"name":"Data Intelligence"},"PeriodicalIF":3.9,"publicationDate":"2023-06-13","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"44972089","RegionNum":3,"RegionCategory":"Computer Science","Total":0}
Abstract Sustainable development denotes the enhancement of living standards in the present without compromising future generations’ resources. Sustainable Development Goals (SDGs) quantify the accomplishment of sustainable development and pave the way for a world worth living in for future generations. Scholars can contribute to the achievement of the SDGs by guiding the actions of practitioners based on the analysis of SDG data, as intended by this work. We propose a framework of algorithms based on dimensionality reduction methods with the use of Hilbert Space Filling Curves (HSFCs) in order to semantically cluster new uncategorised SDG data and novel indicators, and efficiently place them in the environment of a distributed knowledge graph store. First, we describe a framework of algorithms for inserting new indicators and projecting them onto the HSFC based on a transformer-based similarity assessment, for retrieving indicators, and for load balancing, along with an approach for classifying entrant indicators. Then, a thorough case study in a distributed knowledge graph environment experimentally evaluates our framework. The results are presented and discussed in light of theory, along with the actual impact they can have for practitioners analysing SDG data, including intergovernmental organizations, government agencies and social welfare organizations. Our approach empowers SDG knowledge graphs for causal analysis, inference, and manifold interpretations of the societal implications of SDG-related actions, as data are accessed in reduced retrieval times. It facilitates quicker measurement of the influence of users and communities on specific goals and supports faster distributed knowledge matching, as the semantic cohesion of data is preserved.
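A Hilbert space-filling curve maps points of a grid to positions on a single line so that cells close on the line are also close in the grid, which is what lets similar items be stored near each other. The sketch below is the classic two-dimensional grid-to-index conversion, assuming (hypothetically) that indicator embeddings have already been reduced to integer coordinates on a 2^order by 2^order grid; the function name `hilbert_index` is illustrative and this is not the paper's implementation.

```python
def hilbert_index(order, x, y):
    """Return the position of cell (x, y) along a Hilbert curve over a 2**order grid."""
    n = 1 << order          # side length of the grid
    d = 0                   # accumulated distance along the curve
    s = n >> 1              # size of the current quadrant
    while s > 0:
        rx = 1 if (x & s) else 0
        ry = 1 if (y & s) else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/reflect so the sub-quadrant is traversed in canonical orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s >>= 1
    return d


# Order-1 curve visits the four cells in a "U" shape.
first_level = [hilbert_index(1, x, y) for (x, y) in [(0, 0), (0, 1), (1, 1), (1, 0)]]

# On a 4x4 grid every cell gets a distinct index between 0 and 15.
cells = {hilbert_index(2, x, y): (x, y) for x in range(4) for y in range(4)}
```

Sorting items by this index and splitting the sorted range across storage nodes is one simple way such a curve could support locality-preserving placement; how the framework actually partitions data is described in the paper itself.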
{"title":"A knowledge graph-based deep learning framework for efficient content similarity search of Sustainable Development Goals data","authors":"Irene Kilanioti, George A. Papadopoulos","doi":"10.1162/dint_a_00206","DOIUrl":"https://doi.org/10.1162/dint_a_00206","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-13","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"136106992","RegionNum":3,"RegionCategory":"Computer Science","Total":0}
Domagoj Vrgoč, Carlos Rojas, Renzo Angles, Marcelo Arenas, Diego Arroyuelo, Carlos Buil-Aranda, Aidan Hogan, Gonzalo Navarro, Cristian Riveros, Juan Romero
Abstract In this systems paper, we present MillenniumDB: a novel graph database engine that is modular, persistent, and open source. MillenniumDB is based on a graph data model, which we call domain graphs, that provides a simple abstraction upon which a variety of popular graph models can be supported, thus providing a flexible data management engine for diverse types of knowledge graph. The engine itself is founded on a combination of tried and tested techniques from relational data management, state-of-the-art algorithms for worst-case-optimal joins, as well as graph-specific algorithms for evaluating path queries. In this paper, we present the main design principles underlying MillenniumDB, describing the abstract graph model and query semantics supported, the concrete data model and query syntax implemented, as well as the storage, indexing, query planning and query evaluation techniques used. We evaluate MillenniumDB over real-world data and queries from the Wikidata knowledge graph, where we find that it outperforms other popular persistent graph database engines (including both enterprise and open source alternatives) that support similar query features.
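As a rough illustration of what evaluating a path query involves, the sketch below answers a simple reachability query (all nodes reachable from a start node through zero or more edges with a given label) by breadth-first search over a list of labelled edges. The triple representation, the function name `reachable`, and the toy graph are assumptions for illustration; they do not reflect MillenniumDB's data model, query syntax, or algorithms.

```python
from collections import deque


def reachable(edges, start, label):
    """Return the set of nodes reachable from `start` via zero or more `label` edges.

    edges -- iterable of (source, edge_label, target) triples
    """
    adjacency = {}
    for src, lbl, dst in edges:
        if lbl == label:
            adjacency.setdefault(src, []).append(dst)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adjacency.get(node, []):
            if nxt not in seen:   # visit each node once, so cycles terminate
                seen.add(nxt)
                queue.append(nxt)
    return seen


# Toy graph (made up): a "knows" cycle a -> b -> c -> a, plus one "likes" edge.
edges = [
    ("a", "knows", "b"),
    ("b", "knows", "c"),
    ("c", "likes", "d"),
    ("c", "knows", "a"),
]
result = reachable(edges, "a", "knows")
```

The visited set is what keeps the search finite on cyclic graphs; node "d" is excluded because it is only connected through an edge with a different label.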
{"title":"MillenniumDB: An Open-Source Graph Database System","authors":"Domagoj Vrgoč, Carlos Rojas, Renzo Angles, Marcelo Arenas, Diego Arroyuelo, Carlos Buil-Aranda, Aidan Hogan, Gonzalo Navarro, Cristian Riveros, Juan Romero","doi":"10.1162/dint_a_00209","DOIUrl":"https://doi.org/10.1162/dint_a_00209","PeriodicalId":34023,"journal":{"name":"Data Intelligence","volume":"41 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-06-13","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","paperid":"136106994","RegionNum":3,"RegionCategory":"Computer Science","Total":0}